[PD] detect if adc~ is music or speech

patrick puredata at 11h11.com
Mon Feb 7 18:43:31 CET 2011


would it be possible to detect if the incoming audio is music or speech? 
i guess it's very hard, but i was thinking about some methods:

using some kind of frequency detection
using bonk~ (if the tempo is stable = music)
env~ (most music is compressed nowadays)
training on a voice (using a neural network?!?)
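
As a rough illustration of the "stable tempo = music" idea, here is a minimal Python sketch. It assumes you have logged the times (in seconds) at which bonk~ reported attacks, and it measures how regular the gaps between them are. The function names and the 0.2 threshold are made up for this example and would need tuning on real recordings. Syncopated or rubato music will also produce irregular onsets, so this is at best one weak cue among several.

```python
import statistics


def ioi_variation(onset_times):
    """Coefficient of variation of the inter-onset intervals.

    0.0 means perfectly regular onsets; larger values mean less regular.
    onset_times: ascending attack times in seconds (e.g. logged from bonk~).
    """
    intervals = [b - a for a, b in zip(onset_times, onset_times[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three onsets")
    mean = statistics.mean(intervals)
    if mean <= 0:
        raise ValueError("onset times must be ascending")
    return statistics.pstdev(intervals) / mean


def looks_like_steady_tempo(onset_times, max_variation=0.2):
    """Hypothetical decision rule: the threshold is an untuned assumption."""
    return ioi_variation(onset_times) <= max_variation


if __name__ == "__main__":
    steady = [0.0, 0.5, 1.0, 1.5, 2.0]      # metronomic attacks
    irregular = [0.0, 0.1, 0.9, 1.0, 2.3]   # speech-like bursts
    print(looks_like_steady_tempo(steady))
    print(looks_like_steady_tempo(irregular))
```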


From the author of aubio:
Use a few low-level features, such as the energy of the low and high 
frequency bands and the spectral spread. In a second step, these 
approaches are often refined using machine learning techniques such as 
Bayesian networks or support vector machines.

See for instance these papers:
http://cobweb.ecn.purdue.edu/~malcolm/interval/1996-085/
http://www.aclweb.org/anthology/O/O08/O08-1015.pdf
http://www.hindawi.com/journals/asp/2009/628570.html
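
The low-level features mentioned in the quote above (energy in low and high bands, spectral spread) could be computed per frame roughly as follows. This is only a sketch in plain Python, not aubio's implementation: the 1000 Hz split point, the function names and the returned keys are assumptions for the example, and the naive DFT is only meant for short frames. The resulting numbers would then be fed to whatever classifier you train in the second step.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (slow, fine for short frames)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def low_level_features(frame, sample_rate, split_hz=1000.0):
    """Band energy ratios, spectral centroid and spectral spread of one frame.

    split_hz is an arbitrary boundary between the "low" and "high" band.
    """
    n = len(frame)
    power = [m * m for m in magnitude_spectrum(frame)]
    freqs = [k * sample_rate / n for k in range(len(power))]
    total = sum(power)
    if total == 0:
        raise ValueError("silent frame")
    low = sum(p for f, p in zip(freqs, power) if f < split_hz)
    centroid = sum(f * p for f, p in zip(freqs, power)) / total
    # Spread: power-weighted standard deviation around the centroid.
    variance = sum((f - centroid) ** 2 * p
                   for f, p in zip(freqs, power)) / total
    return {
        "low_energy": low / total,
        "high_energy": (total - low) / total,
        "centroid_hz": centroid,
        "spread_hz": math.sqrt(variance),
    }


def sine(freq_hz, sample_rate, n):
    """Test signal helper: n samples of a unit-amplitude sine."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


if __name__ == "__main__":
    sr, n = 8000, 256
    mix = [a + b for a, b in zip(sine(500, sr, n), sine(3000, sr, n))]
    print(low_level_features(mix, sr))
```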

i would like to achieve > 90% accuracy if possible. any suggestions 
are welcome!


