[PD] pd & text->speech
Bryan Jurish
moocow at ling.uni-potsdam.de
Wed Jun 12 20:50:49 CEST 2002
Greetings,
> ----- Original Message -----
> From: "Michal Seta" <mis at creazone.com>
> To: <pd-list at iem.kug.ac.at>
> Sent: Wednesday, June 12, 2002 7:45 AM
> Subject: [PD] pd & text->speech
>
>
> > Hello.
> >
> > First, I admit that I have not yet played with any of the text->speech
> > (on linux) software so I don't know which one is what. However, I'd be
> > interested in using something like that with(in) pd. Has anyone done
> > anything like that? Any ideas? If there's anything I miss from Max, it's
> > the ispeak object :)
other possibilities (linux) i've looked at:
mbrola (http://tcts.fpms.ac.be/synthesis)
  + free for non-commercial and non-military use
    (in a prior email message, mbrola author Thierry Dutoit
    indicated to me that use of mbrola in a musical
    performance would not violate the 'no commercial use'
    clause)
  + diphone synthesis; sounds quite good
  + supports english, german, and others
  - no library, so you have to use "piperead~" from
    ext13 for output
  - mbrola has to know phone length and frequency envelope
    before any sound is produced
  - i tested using netsend/netreceive and a perl
    script to wrap mbrola -- gets pretty unwieldy
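
to make the "phone length and frequency envelope" point concrete: mbrola reads a plain-text phone list in which each line gives a phone symbol, a duration in milliseconds, and optionally pairs of (position in percent of the duration, pitch in Hz). below is a minimal python sketch of generating such input. the helper names (pho_line, to_pho) and the phone sequence are made up for illustration, and the valid phone symbols depend on the voice database you use.

```python
def pho_line(phone, duration_ms, pitch_points=()):
    """Format one input line: phone, duration in ms, then (position %, pitch Hz) pairs."""
    fields = [phone, str(int(duration_ms))]
    for position_pct, pitch_hz in pitch_points:
        fields += [str(int(position_pct)), str(int(pitch_hz))]
    return " ".join(fields)


def to_pho(phones):
    """Join a sequence of (phone, duration[, pitch_points]) tuples into one text block."""
    return "\n".join(pho_line(*p) for p in phones) + "\n"


# Hypothetical phone sequence; "_" is silence, other symbols depend on the voice.
phones = [
    ("_", 100),
    ("h", 80),
    ("@", 60, [(50, 120)]),
    ("l", 70),
    ("@U", 200, [(0, 130), (100, 90)]),
    ("_", 100),
]

pho_text = to_pho(phones)
print(pho_text)
```

the whole list has to exist before synthesis starts, which is why durations and pitch targets must be decided up front rather than adjusted while the sound is playing.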
festival (http://www.cstr.ed.ac.uk/projects/festival.html)
  + GPL
  + based on siod, maybe eventual integration into pd
    would be possible via (a modified version of) Larry's
    "pd-scheme"
  + abstracts over various synthesis methods: mbrola
    output also possible
  - i'm still using piperead~, netsend/netreceive, and
    perl to wrap it
  - so far, i've only been able to figure out how to
    get festival to do "pure" text-to-speech: i can't figure
    out how to influence prosodic parameters like frequency
    or timing, although the mechanisms are certainly there
  - it eats cpu time
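
for anyone wondering what the netsend/netreceive wrapping involves: pd sends and expects plain-text messages made of whitespace-separated atoms, each message terminated by a semicolon. here is a rough python sketch of the glue side (my own wrappers are in perl; this is not that code). the function names, the port number 3000, and the "say"/"pitch" message selectors are all assumptions for illustration, and escaped semicolons are not handled.

```python
import socket


def parse_fudi(buffer):
    """Split buffered text into complete messages (lists of atoms) and the unfinished rest."""
    *complete, rest = buffer.split(";")
    messages = [m.split() for m in complete if m.strip()]
    return messages, rest


def format_fudi(*atoms):
    """Build one semicolon-terminated message from the given atoms."""
    return " ".join(str(a) for a in atoms) + ";\n"


def send_to_pd(atoms, host="localhost", port=3000):
    """Send one message to a listening [netreceive]; host and port are assumptions."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(format_fudi(*atoms).encode("ascii"))


# Example: text arriving from [netsend], with one message still incomplete.
messages, rest = parse_fudi("say hello world;\nrate 1")
print(messages, repr(rest))
```

a wrapper script would read from the socket, keep the unfinished remainder around until the next read, and hand each complete "say ..." message to the synthesizer, whose audio output then comes back into pd through piperead~.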
... if you're interested in automated generation of syntactically
correct nonsense (or pure text-to-speech), you might want
to check out my "SayWhat" package, which has a PD mode and
some basic example patches, at:
http://www.ling.uni-potsdam.de/~moocow/projects/saywhat
... the PD support is currently just netreceive + piperead~,
but hopefully that will change soon...
On 12 June 2002 at 07:44:34, sme wrote:
> hi
> i'm not sure if a single object can handle a complex process like speech
> synthesis.
> i rather have the idea of a modular system (which still could be realized in
> pd) with a kind of physical model of the anatomic speech-producing parts of
> the body and a rather complicated interface-speech to control it and a way
> to interpret/translate text to it.
> sÜme.
i agree this would be better. i started trying to adapt Nick
Ing-Simmons' "rsynth" Klatt-style synthesizer for eventual integration
into pd (via SayWhat) -- the current state of affairs is available at:
http://www.ling.uni-potsdam.de/~moocow/projects/spsyn
... but that project has a longish way to go before the kind of
fine-grained control that i would like is available. also, according
to Nick, there are still some unclear copyright issues with the
code from the original "rsynth-2.0":
On Wed, 22 May 2002 at 23:37:24, Nick Ing-Simmons wrote:
> > It, like most of the rsynth stuff, is stalled due to
> > ownership issues of the Klatt synthesis code. The man has died
> > so cannot give permission ...
it would probably also be possible to build a "pure" pd
Klatt-style speech synthesizer from scratch around Yves'
"formant~" object, but i'm still waiting for the library
here to dig up their copy of the Klatt article for me ;-)
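
as a rough idea of the building block involved: formant synthesizers of this kind are usually described in terms of second-order digital resonators, one per formant, each set by a center frequency and a bandwidth. the python sketch below uses the standard textbook form of such a resonator with unity gain at dc. it says nothing about how "formant~" or rsynth are actually implemented, and the function names, formant values, and pulse-train source are assumptions chosen only for illustration.

```python
import math


def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Coefficients for y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to 1
    return a, b, c


def resonate(samples, freq_hz, bandwidth_hz, sample_rate):
    """Run a list of samples through one resonator and return the filtered list."""
    a, b, c = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate)
    y1 = y2 = 0.0
    out = []
    for x in samples:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


# Hypothetical example: a 100 Hz pulse train through three cascaded formants.
sample_rate = 16000
pulses = [1.0 if n % 160 == 0 else 0.0 for n in range(sample_rate // 10)]
vowel = pulses
for freq_hz, bandwidth_hz in [(700.0, 60.0), (1100.0, 90.0), (2500.0, 150.0)]:
    vowel = resonate(vowel, freq_hz, bandwidth_hz, sample_rate)
```

the fine-grained control i'd like would come from changing the frequency and bandwidth of each resonator over time, which is the part a pd patch could expose directly.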
is anyone else working on tts / speech synthesis for pd? if so,
maybe we could combine our efforts?
marmosets,
Bryan