[PD] OCR with Puredata?

Tedb0t lists at liminastudio.com
Wed Jun 1 18:51:53 CEST 2011


> Is it just me, or it sounds like it's going to take a lot of preprocessing before you can even think of feeding it to a neural network ?

Black/white thresholding and resolution reduction, that's it.

> Human vision is made of a lot more layers of neurons than we can hope to deal with in artificial networks.

It sure is.  Luckily you only need a few to do some basic OCR.  You could try a grid of something like 10x10 pixels just to start, which would require 100 input neurons.  The lower the resolution, the higher your error rate will be, but you can find good compromises.
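
As a rough illustration of the preprocessing described above, here is a minimal pure-Python sketch that thresholds a grayscale image to black/white and reduces it to a 10x10 grid, giving 100 values for the input neurons. The function names, the 0-255 pixel range, the threshold of 128, and the "at least half the cell is ink" rule are all assumptions made for this example; none of it comes from the ann external or from FANN.

```python
def binarize(image, threshold=128):
    """Map each grayscale pixel (assumed 0-255) to 1.0 for dark (ink) or 0.0 for light."""
    return [[1.0 if px < threshold else 0.0 for px in row] for row in image]


def downsample(binary, grid=10):
    """Average the binary image over a grid x grid array of cells."""
    height, width = len(binary), len(binary[0])
    cells = []
    for gy in range(grid):
        y0 = gy * height // grid
        y1 = max((gy + 1) * height // grid, y0 + 1)  # every cell covers at least one row
        row = []
        for gx in range(grid):
            x0 = gx * width // grid
            x1 = max((gx + 1) * width // grid, x0 + 1)  # and at least one column
            block = [binary[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        cells.append(row)
    return cells


def to_input_vector(image, grid=10, threshold=128):
    """Return grid*grid values (100 for a 10x10 grid), one per input neuron."""
    cells = downsample(binarize(image, threshold), grid)
    # A cell counts as "on" when at least half of its pixels are ink (assumed rule).
    return [1.0 if v >= 0.5 else 0.0 for row in cells for v in row]


# Hypothetical 20x20 test image: left half dark, right half light.
image = [[0] * 10 + [255] * 10 for _ in range(20)]
vector = to_input_vector(image)
```

Raising `grid` keeps more detail at the cost of more input neurons, which is the resolution/error-rate compromise mentioned above.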

> At what angles should characters be recognised ?

Well, that's the beautiful thing about neural nets—it just depends on how you train the net.  If you want the net to be able to recognize tilted letters, you can add tilted letters to the training sets.  It can affect 
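
One way to add tilted letters to a training set is to generate them from the upright samples. The sketch below is only an illustration under assumptions: it rotates a square grid of pixel values about its centre using nearest-neighbour sampling, and the helper names `rotate_grid` and `augment`, as well as the choice of -10, 0 and +10 degrees, are hypothetical.

```python
import math


def rotate_grid(grid, degrees):
    """Rotate a square grid about its centre using nearest-neighbour sampling."""
    size = len(grid)
    c = (size - 1) / 2.0
    a = math.radians(degrees)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            # Inverse mapping: find which source pixel lands on (x, y).
            dx, dy = x - c, y - c
            sx = int(round(c + cos_a * dx + sin_a * dy))
            sy = int(round(c - sin_a * dx + cos_a * dy))
            if 0 <= sx < size and 0 <= sy < size:
                out[y][x] = grid[sy][sx]
            # Pixels that map from outside the grid stay 0.0 (background).
    return out


def augment(samples, angles=(-10, 0, 10)):
    """Return one rotated copy of every (grid, label) pair for each angle."""
    return [(rotate_grid(g, a), label) for g, label in samples for a in angles]


# Hypothetical sample: a 10x10 grid with a single "ink" pixel, labelled "i".
sample = [[0.0] * 10 for _ in range(10)]
sample[0][3] = 1.0
training_set = augment([(sample, "i")])
```

The wider the range of angles, the more training samples there are for the net to learn from.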

> But making an OCR using ANN is a lot lot more work than using an OCR library. Making a Pd-to-OCR-library interface is less work than making an OCR abstraction library

Agreed, strongly.  OCR is a really excellent exercise in neural nets (look up Self-Organizing Maps or Kohonen networks), but it's a lot of work.  It would be faster by far to set up an interface as Mathieu suggests.  However, if you want to go through with it anyway, I'd love to help!

It looks like the de facto open OCR library is Tesseract: http://code.google.com/p/tesseract-ocr/  This would be great to have in Pd.

Incidentally, I can't see any reason why Pd would be "bad at it," since the ANN external uses the C FANN library, which is likely what any other library would use in the first place.

±tedb0t


On Jun 1, 2011, at 11:49 AM, Mathieu Bouchard wrote:

> On Wed, 1 Jun 2011, Jack wrote:
> 
>> You can do this with the use of artificial neural network (for character recognition). There are externals for Pd : http://pure-data.svn.sourceforge.net/viewvc/pure-data/trunk/externals/ann/
> 
> Is it just me, or it sounds like it's going to take a lot of preprocessing before you can even think of feeding it to a neural network ?
> 
> Human vision is made of a lot more layers of neurons than we can hope to deal with in artificial networks.
> 
> At what angles should characters be recognised ?
> 
> Which colour on which colour ?
> 
> You better settle those things first, so that you can figure out how you can reduce your data beforehand.
> 
> But making an OCR using ANN is a lot lot more work than using an OCR library. Making a Pd-to-OCR-library interface is less work than making an OCR abstraction library... and it isn't necessarily because Pd would be bad at it (I don't know about that). It's more because it takes a lot of knowledge to make an OCR library from nearly scratch.
> 
> _______________________________________________________________________
> | Mathieu Bouchard ---- tél: +1.514.383.3801 ---- Villeray, Montréal, QC
