Present Architecture and Demo
Filter Plugin

* RegExp String Replacer
    * fix misspoken words
    * speak chat emoticons
    * remove extraneous markup
* XML Transformer
    * XHTML to SSML (rich speak)
    * requires xsltproc
    * OASIS to SSML (future)
* Talker Chooser
* Sentence Boundary Detector (SBD)
    * reduces time-to-speak
    * permits advance/rewind
    * progress feedback to apps
    * SSML-aware
    * only used for Text Jobs

Synth Plugin

* Festival
* Flite
* Hadifix
* Epos
* FreeTTS
* Command
* (Future)
    * Cepstral
    * IBM ViaVoice (?)
    * DECTalk (?)

.wav file

Stretcher (sox)

* faster/slower speech

Player Plugin

* aRts
* ALSA
    * device contention (dmix)
* GStreamer
    * version issues
    * slow to pause/stop
* aKode
    * no pause function
* (Future)
    * Jack
    * NAS
    * NMM
    * KDEMM

no GUI

Talkers

* a synth configured by:
    * synthesizer
    * language
    * voice
    * gender
    * volume
    * rate
* auto configuration
* identified by TalkerCodes
* apps specify desired talker attributes and KTTSD picks the closest matching Talker

The language attribute gets the highest matching priority. It would make no sense to send Spanish text to a German voice.

Example of a full Talker Code:

    <voice lang="en" name="kal" gender="male"/>
    <prosody volume="soft" rate="fast"/>
    <kttsd synthesizer="Festival" />

or without the SSML tags:

    lang="en" name="kal" gender="male" volume="soft" rate="fast" synthesizer="Festival"

For example, if KMouth sends

    dcop kttsd KSpeech sayMessage "Hello" "de"

KTTSD will find a talker that speaks German. If no German talkers are configured, it will use the first talker in the configured list.

Message types

Text spoken by KTTS is of four possible types:

* Screen Reader Output
* Warnings
* Messages
* Text Jobs

The type is determined by the application that sends the text to KTTS.

Screen Reader Output has the highest priority. It is reserved for use by Screen Reader applications. Screen Reader Output preempts all other messages, causing those jobs to pause. Once the Screen Reader Output has been spoken, the preempted messages will automatically resume.

Warnings are the next highest priority. They are reserved for high-priority messages, such as "CPU is over-heating." A Warning will preempt Messages and regular text, causing those jobs to pause. Once the Warning has been spoken, the preempted messages will automatically resume.

Messages are the next highest priority. A Message will preempt regular text jobs. KMouth is an example of an application that uses Messages. For example, while reading out long text from a web page, KMouth can be used to greet someone who walks into the room. KDE Notifications are also Messages.

The rest are ordinary Text Jobs. Any job you initiate from the Jobs tab is a Text Job. KSayit is an example of an application that uses Text Jobs. Text Jobs are intended for longer speech output that is not urgent.

(Future) DBUS

DCOP

Applications communicate with KTTSD via DCOP. The API is documented in the kspeech.h file in kdelibs/interfaces/kspeech.

Example command-line dcop call:

    dcop kttsd KSpeech sayText "Hello World" "gender='female'"

An example from KMouth (manual marshalling):

    bool kttsdSay (const QString &text, const QString &language) {
        DCOPClient *client = kapp->dcopClient();
        QByteArray data;
        QCString replyType;
        QByteArray replyData;
        QDataStream arg(data, IO_WriteOnly);
        arg << text << language;
        return client->call("kttsd", "KSpeech", "sayWarning(QString,QString)",
                            data, replyType, replyData, true);
    }

An example from KTTSMgr (using kspeech_stub to do the marshalling):

    void KttsJobMgrPart::slot_speak_clipboard()
    {
        // Get the clipboard object.
        QClipboard *cb = kapp->clipboard();

        // Copy text from the clipboard.
        QString text;
        QMimeSource* data = cb->data();
        if (data)
        {
            if (data->provides("text/html"))
            {
                if (supportsMarkup(NULL, KSpeech::mtHtml))
                {
                    QByteArray d = data->encodedData("text/html");
                    text = QString(d);
                }
            }
            if (data->provides("text/ssml"))
            {
                if (supportsMarkup(NULL, KSpeech::mtSsml))
                {
                    QByteArray d = data->encodedData("text/ssml");
                    text = QString(d);
                }
            }
        }
        if (text.isEmpty())
            text = cb->text();

        // Speak it.
        if ( !text.isEmpty() )
        {
            uint jobNum = setText(text, NULL);
            startText(jobNum);
        }
    }

KDE Apps

* KTTSMgr

  KTTSMgr is the GUI for controlling and configuring KTTSD. From KTTSD's viewpoint, KTTSMgr is just another application. The only difference is that KTTSMgr writes to the kttsdrc file and then tells KTTSD to re-read its configuration.

  KTTSMgr provides tabs for:

    * Starting and stopping KTTSD
    * Configuring Talkers (a Talker is a configured synth plugin)
    * Specifying how to speak KNotify events
    * Configuring filters
    * Picking an audio output plugin
    * Specifying messages or sounds to emit whenever a text job is interrupted by a higher priority message

  It also provides a Job Manager that can pause, advance, rewind, or discard text jobs, or change the Talker that will speak a job. The Job Manager also provides Speak Clipboard and Speak File buttons.

  KTTSMgr can live in the system tray, and there's an option to automatically start it in the tray whenever speaking.

  KTTSMgr is also a KCModule and is therefore available in the KDE Control Center under Accessibility.

* Konqi

  You can speak all or any selected part of a web page from Konqueror. If you have an XHTML-to-SSML filter configured in KTTS, it will "rich speak" the page in a variety of rates and volumes.

* Kate
* KSayIt
* KMouth
* KPDF

  You can select text in KPDF and send it to KTTSD. Unfortunately, this means you are limited to speaking at most a single page at a time. I believe the KPDF team plans to migrate to the Poppler engine in KPDF, which should permit speaking all the text of an entire PDF file.

* amaroK

  The TTS in amaroK plugin will speak the artist and title of each track as it begins, in effect giving amaroK a "disc jockey". Nice.

* KAlarm
* Other KDE Apps

KNotify

Using KTTSMgr, you can configure KTTS to speak events sent from applications via KNotify. For example, I have KTTS configured to speak messages from Konversation whenever someone mentions my nickname. Filters take care of removing markup and converting chat emoticons and abbreviations to speakable words. For example, the following message in Konversation

    <wheels> hi PhantomsDad :)

is spoken as

    wheels says Hi PhantomsDad smiles

You can configure individual events, all events from an application, and all events not otherwise covered. For each event, you can:

* Speak the event message.
* Speak the name of the event.
* Not speak the event.
* Speak a custom message.

For each event, you can specify the KTTS talker attributes you want. For example, you can specify that an event is spoken in a female voice.
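
To make the talker matching rule concrete, here is a minimal sketch in plain C++ of how requested talker attributes (such as a female voice) might be resolved to a configured talker. It is not KTTSD's actual code: the `Talker` struct and the `countMatches` and `pickTalker` functions are hypothetical names, and the scoring of the non-language attributes is an assumption. Only two rules come from the description above: language takes priority over every other attribute, and the first configured talker is used when no talker speaks the requested language.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified talker record (not KTTSD's actual data structure).
// An empty string in a request means "no preference".
struct Talker {
    std::string lang;
    std::string name;
    std::string gender;
    std::string synthesizer;
};

// Count how many of the non-language attributes requested in `wanted`
// are satisfied by `candidate`. Equal weighting is an assumption.
int countMatches(const Talker& wanted, const Talker& candidate) {
    int score = 0;
    if (!wanted.name.empty() && wanted.name == candidate.name) ++score;
    if (!wanted.gender.empty() && wanted.gender == candidate.gender) ++score;
    if (!wanted.synthesizer.empty() &&
        wanted.synthesizer == candidate.synthesizer) ++score;
    return score;
}

// Return the index of the configured talker that best fits the request.
// Language is checked first: if a language is requested, only talkers that
// speak it are considered. If none does, index 0 (the first configured
// talker) is returned. Ties go to the earliest talker in the list.
std::size_t pickTalker(const Talker& wanted,
                       const std::vector<Talker>& configured) {
    assert(!configured.empty());  // at least one talker must be configured
    std::size_t best = 0;
    int bestScore = -1;
    for (std::size_t i = 0; i < configured.size(); ++i) {
        const Talker& candidate = configured[i];
        if (!wanted.lang.empty() && candidate.lang != wanted.lang) {
            continue;  // wrong language: never eligible
        }
        const int score = countMatches(wanted, candidate);
        if (score > bestScore) {
            bestScore = score;
            best = i;
        }
    }
    return best;
}
```

With an English male Festival talker, a German female Hadifix talker, and an English female Flite talker configured in that order, a request for `lang="de"` resolves to the second talker, while a request for `lang="es"` falls back to the first.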

KTTSD is a non-GUI application that accepts requests for speech from applications via DCOP. It takes care of message queueing and prioritization. KTTSD asynchronously runs the filters, TTS synthesizers, stretcher (sox), and audio playback components.
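
The queueing and prioritization can be illustrated with a small sketch in plain C++. It assumes one FIFO queue per message type and preemption at sentence granularity. The `SpeechQueue` class and its methods are hypothetical names chosen for illustration; KTTSD's real job manager is more involved.

```cpp
#include <array>
#include <deque>
#include <string>

// The four message types, ordered from highest to lowest priority.
enum MessageType {
    ScreenReaderOutput = 0,
    Warning = 1,
    Message = 2,
    TextJob = 3
};

// Hypothetical sketch of a prioritized speech queue (not KTTSD's class).
class SpeechQueue {
public:
    // Queue one sentence of the given type.
    void enqueue(MessageType type, const std::string& sentence) {
        queues_[type].push_back(sentence);
    }

    // Fetch the next sentence to speak: the oldest sentence of the highest
    // priority type that has anything pending. Lower priority jobs are
    // thereby paused and resume once the higher priority queues are empty.
    // Returns false when nothing is pending.
    bool next(std::string& out) {
        for (auto& queue : queues_) {
            if (!queue.empty()) {
                out = queue.front();
                queue.pop_front();
                return true;
            }
        }
        return false;
    }

private:
    std::array<std::deque<std::string>, 4> queues_;
};
```

If a Warning is queued while a two-sentence Text Job is half spoken, the next call to `next()` yields the Warning, and the call after that resumes the Text Job at its second sentence.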

KTTSD endeavors to keep audio playback going continuously by keeping up to 3 sentences in its output queue ready for playback. While one sentence is being played back, the next sentences are being synthesized and time-stretched (sox).
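
The look-ahead idea can be sketched as a bounded "ready" buffer that is topped up whenever playback consumes a sentence. This is a simplified, synchronous illustration in plain C++. The `LookaheadBuffer` class is a hypothetical name, and the move from `pending_` to `ready_` stands in for the synthesis and stretching work that KTTSD performs asynchronously.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical sketch: keeps up to `depth` sentences prepared for playback.
class LookaheadBuffer {
public:
    explicit LookaheadBuffer(const std::vector<std::string>& sentences,
                             std::size_t depth = 3)
        : pending_(sentences.begin(), sentences.end()), depth_(depth) {}

    // Prepare sentences until `depth` are ready or no input remains.
    // Returns how many sentences were prepared by this call.
    std::size_t refill() {
        std::size_t prepared = 0;
        while (ready_.size() < depth_ && !pending_.empty()) {
            // Placeholder for "synthesize, then time-stretch".
            ready_.push_back(pending_.front());
            pending_.pop_front();
            ++prepared;
        }
        return prepared;
    }

    // Hand the next prepared sentence to the player.
    // Returns false if nothing is ready yet.
    bool play(std::string& out) {
        if (ready_.empty()) return false;
        out = ready_.front();
        ready_.pop_front();
        return true;
    }

    std::size_t readyCount() const { return ready_.size(); }
    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::deque<std::string> pending_;  // not yet synthesized
    std::deque<std::string> ready_;    // synthesized, waiting for playback
    std::size_t depth_;                // maximum number of ready sentences
};
```

Calling `refill()` after every `play()` keeps the ready queue at its target depth for as long as input remains, so the player rarely has to wait for synthesis.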

Asynchronous processing is achieved by:

KProcess (XML Transformer filter, synths, and Stretcher)

QThread (String Replacer and SBD filters)

QTimer (audio playback)

As each component completes its task, KTTSD is signaled and the next step is initiated.
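
The completion-driven hand-off between components can be sketched with plain C++ callbacks. In KTTSD the "done" notifications would come from KProcess, QThread, and QTimer; here a generic worker callback stands in for all three. The `Stage` enum, `nextStage`, and the `Pipeline` class are hypothetical names, and the fixed stage order is an assumption made for illustration.

```cpp
#include <functional>
#include <utility>

// Assumed processing order for one sentence.
enum class Stage { Filter, Synth, Stretch, Play, Done };

// Return the stage that follows `stage`.
Stage nextStage(Stage stage) {
    switch (stage) {
        case Stage::Filter:  return Stage::Synth;
        case Stage::Synth:   return Stage::Stretch;
        case Stage::Stretch: return Stage::Play;
        default:             return Stage::Done;
    }
}

// Hypothetical controller: starts a stage, then waits to be signaled.
class Pipeline {
public:
    // The worker starts the given stage and must invoke the supplied
    // callback once that stage has finished (possibly much later).
    using Worker = std::function<void(Stage, std::function<void()>)>;

    explicit Pipeline(Worker worker) : worker_(std::move(worker)) {}

    void start() { run(Stage::Filter); }
    Stage current() const { return current_; }

private:
    void run(Stage stage) {
        current_ = stage;
        if (stage == Stage::Done) return;
        // When the stage signals completion, initiate the next step.
        worker_(stage, [this, stage] { run(nextStage(stage)); });
    }

    Worker worker_;
    Stage current_ = Stage::Filter;
};
```

The controller never blocks: it starts a stage and returns, and nothing further happens until the worker reports completion, which is the same shape as the signal-driven flow described above.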