<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
There is still the challenge of access to computational power, unless
we build a SETI@home-like speech recognition crawler, which in and of
itself has similar ethical implications.<br>
<br>
<div class="moz-cite-prefix">On 2/7/2015 12:55 PM, Spencer Russell
wrote:<br>
</div>
<blockquote
cite="mid:1423331706.1597882.224374225.2BD9D12C@webmail.messagingengine.com"
type="cite">
<div>I saw a really interesting talk last year by Johan
Schalkwyk, the head of the Google speech
recognition group. One of the points he made was that while
Google's algorithms are important, they got a lot more leverage
from the sheer amount of data they have access to. It allows
them to get away with much simpler algorithms. I think that's
one of the biggest problems with trying to compete with Google
and Apple on speech recognition, because OSS developers just
don't have access to a huge corpus of data. <br>
</div>
<div> </div>
<div>Even though a lot of that data is unlabeled (they don't know
which words correspond to the audio), they
have a huge amount of interaction data, so they can, for instance,
look at whether the user tried multiple times with a particular
phrase or whether the user accepted a given transcription.<br>
</div>
<div> </div>
<div>It seems that if we want an open-source speech recognition
package, we should focus on finding ways to get an accessible
shared corpus. Unless there were some tricky licensing, though, I think
that corpus would also benefit the big guys, so their
corpus would remain a proper superset of what's available to OSS
developers.</div>
<div> </div>
<div> </div>
<div>On Sat, Feb 7, 2015, at 11:39 AM, Jonathan Wilkes via Pd-list
wrote:<br>
</div>
<blockquote type="cite">
<div style="color:#000; background-color:#fff;
font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial,
Lucida Grande, sans-serif;font-size:16px">
<div dir="ltr">Hi list,<br>
</div>
<div dir="ltr"> </div>
<div dir="ltr">Here's a fun thought-experiment: suppose you're
doing a port of Pd, and the graphics toolkit you're using
will include functionality to hook in to Google's speech
recognition API. Such an API could make the software
accessible to people who would otherwise find it very hard
to write Pd patches.<br>
</div>
<div dir="ltr"> </div>
<div dir="ltr">However, the API works by shipping off your
audio data to Google's servers, doing the computation on
their machines, and sending you back the results.<br>
</div>
<div dir="ltr"> </div>
<div dir="ltr">Do you use the API in your port, or not?<br>
</div>
<div dir="ltr"> </div>
<div dir="ltr">I'm decidedly not going to use that API, for
what I think are obvious security, privacy, and
philosophical reasons. But I'm curious just how obvious the
security and privacy implications are to others here. How
many people would use a speech-patching mechanism that sends
all your speech to Google?<br>
</div>
<div dir="ltr"> </div>
<div dir="ltr">I'm also increasingly worried by the apparent
gap between the usability of Google's and Apple's products
and the seemingly glacial pace at which _usable_ free
software speech recognition is being developed. My position
won't change, but I'm afraid it's becoming more symbolic
than practical as these insecure tools become a natural part
of most people's lives.<br>
</div>
<div> </div>
<div dir="ltr">-Jonathan<br>
</div>
</div>
<div><u>_______________________________________________</u><br>
</div>
<div><a moz-do-not-send="true"
href="mailto:Pd-list@lists.iem.at">Pd-list@lists.iem.at</a>
mailing list<br>
</div>
<div>UNSUBSCRIBE and account-management -> <a
moz-do-not-send="true"
href="http://lists.puredata.info/listinfo/pd-list">http://lists.puredata.info/listinfo/pd-list</a><br>
</div>
</blockquote>
<div> </div>
<br>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Ivica Ico Bukvic, D.M.A.
Associate Professor
Computer Music
ICAT Senior Fellow
DISIS, L2Ork
Virginia Tech
School of Performing Arts – 0141
Blacksburg, VA 24061
(540) 231-6139
<a class="moz-txt-link-abbreviated" href="mailto:ico@vt.edu">ico@vt.edu</a>
<a class="moz-txt-link-abbreviated" href="http://www.performingarts.vt.edu">www.performingarts.vt.edu</a>
disis.music.vt.edu
l2ork.music.vt.edu</pre>
</body>
</html>