[PD] Speech recognition on Pure Data

Claude Heiland-Allen claude at mathr.co.uk
Wed Feb 28 18:16:03 CET 2018


On 28/02/18 16:53, Ariane Stolfi wrote:
> I'm planning to work on a project that will use a microphone to query the
> Freesound database, and we are planning to use speech recognition to
> send the queries.  We will use Pure Data installed on a Bela board and
> e-textile sensors to process the sound.
> I'm considering either sending audio to a server and doing the speech
> recognition there using the Google API, or using Sphinx to do the
> recognition in Pure Data.
> Does anyone here have any recent experience with this and can make a
> suggestion?

I used pocketsphinx from the command line in one project (that didn't
use Pd)[1].  The key line for speech transcription in my shell script is:

    utterance="$(pocketsphinx_continuous -infile utterance.wav 2>/dev/null)"

The input was synthetic sound, so conditions were ideal and essentially
noise-free.

I imagine that you could use Pd to detect the presence of sound and
save each segment to a wav file, then call out to Sphinx with a
scripting external.  The same process would be necessary for the Google
API, I suppose, unless you intend to stream everything there.
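
A minimal sketch of the Sphinx side of that idea, assuming the Pd patch
writes each detected segment as a .wav file into one directory.  The
function names and the "done" subdirectory are hypothetical, chosen only
for illustration; the pocketsphinx_continuous call is the same one as
above.  As far as I know the default models expect 16 kHz mono 16-bit
audio, so the Pd side may need to record or resample accordingly.

```shell
#!/bin/sh
# Sketch only: transcribe WAV segments that a Pd patch has saved to disk.
# transcribe, process_segments and the "done" directory are made-up names.

# Transcribe one file, discarding the recognizer's log output on stderr.
transcribe() {
    pocketsphinx_continuous -infile "$1" 2>/dev/null
}

# Transcribe every .wav currently in the given directory, print
# "filename<TAB>text", then move the file aside so it is not processed twice.
process_segments() {
    dir="$1"
    mkdir -p "$dir/done"
    for wav in "$dir"/*.wav; do
        [ -e "$wav" ] || continue    # the glob matched nothing
        utterance="$(transcribe "$wav")"
        printf '%s\t%s\n' "$(basename "$wav")" "$utterance"
        mv "$wav" "$dir/done/"
    done
}
```

Calling `process_segments /path/to/segments` from a scripting external
(or from a loop with a short sleep) would then hand back one line of text
per segment, which Pd or another script could use as the query.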


[1] https://mathr.co.uk/blog/2017-04-08_divergent_protocol.html
