[PD] speech recognition and ethics

Ivica Ico Bukvic ico at vt.edu
Sat Feb 7 19:28:58 CET 2015


There is still the challenge of access to computational power, unless we 
build a SETI@home-like distributed speech recognition crawler, which in 
and of itself has similar ethical implications.

On 2/7/2015 12:55 PM, Spencer Russell wrote:
> I saw a really interesting talk last year by Johan Schalkwyk, the head 
> of the Google speech recognition group. One of the points he made was 
> that while Google's algorithms are important, they got a lot more 
> leverage from the sheer amount of data they have access to. It allows 
> them to get away with much simpler algorithms. I think that's one of 
> the biggest problems with trying to compete with Google and Apple on 
> speech recognition, because OSS developers just don't have access to a 
> huge corpus of data.
> Even though a lot of that data is unlabeled (they don't know what the 
> actual words are that correspond to the audio), they have a huge 
> amount of interaction data, so they can for instance look at whether 
> the user tried multiple times with a particular phrase or whether the 
> user accepted a given transcription.
> It seems like if we want an open-source speech recognition package we 
> should focus on finding ways to get an accessible shared corpus. 
> Unless there was some tricky licensing I think that corpus would also 
> benefit the big guys though, so their corpus would remain a proper 
> superset of what's available to OSS developers.
> On Sat, Feb 7, 2015, at 11:39 AM, Jonathan Wilkes via Pd-list wrote:
>> Hi list,
>> Here's a fun thought-experiment: suppose you're doing a port of Pd, 
>> and the graphics toolkit you're using will include functionality to 
>> hook in to Google's speech recognition API.  Such an API could make 
>> the software accessible to people who would otherwise find it very 
>> hard to write Pd patches.
>> However, the API works by shipping off your audio data to Google's 
>> servers, doing the computation on their machines, and sending you 
>> back the results.
>> Do you use the API in your port, or not?
>> I'm decidedly not going to use that API, for what I think are obvious 
>> security, privacy, and philosophical reasons.  But I'm curious just 
>> how obvious the security and privacy implications are to others 
>> here.  How many people would use a speech-patching mechanism that 
>> sends all your speech to Google?
>> I'm also increasingly worried by the apparent gap between the 
>> usability of Google and Apple's products, and the seemingly glacial 
>> pace at which _usable_ free software speech recognition is being 
>> developed.  My position won't change, but I'm afraid it's becoming 
>> more symbolic than practical as these insecure tools become a natural 
>> part of most people's lives.
>> -Jonathan
>> _________________________________________________
>> Pd-list at lists.iem.at mailing list
>> UNSUBSCRIBE and account-management -> 
>> http://lists.puredata.info/listinfo/pd-list

-- 
Ivica Ico Bukvic, D.M.A.
Associate Professor
Computer Music
ICAT Senior Fellow
DISIS, L2Ork
Virginia Tech
School of Performing Arts – 0141
Blacksburg, VA 24061
(540) 231-6139
ico at vt.edu
www.performingarts.vt.edu
disis.music.vt.edu
l2ork.music.vt.edu
