Using Tesseract OCR in openFrameworks

See discussion here: http://www.openframeworks.cc/forum/viewtopic.php?f=10&t=3728&hilit=OCR

I am doing this in OS X 10.6.5.

Download the latest Tesseract source code from here: http://code.google.com/p/tesseract-ocr/

(tesseract-3.00.tar.gz at the time of writing)

     1. In Terminal: cd ~/Downloads/
     2. gunzip tesseract-3.00.tar.gz
     3. tar -xvf tesseract-3.00.tar
     4. mv tesseract-3.00 ~/code/ (if ~/code does not exist yet, create it first with mkdir -p ~/code)
     5. cd ~/code/tesseract-3.00/
     6. ./configure
     7. make
     8. sudo make install
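
Steps 2 to 4 above can also be collapsed into a single tar call. The sketch below is one way to do that and is not part of the original instructions. unpack_source is a hypothetical helper name, and it assumes your tar supports the -z and -C options (GNU tar and bsdtar both do).

```shell
#!/bin/sh
# Sketch: unpack a gzipped source tarball straight into a target directory.
# unpack_source is a hypothetical helper, not something Tesseract provides.
unpack_source() {
    archive="$1"   # e.g. ~/Downloads/tesseract-3.00.tar.gz
    dest="$2"      # e.g. ~/code

    # Create the target directory if it is missing, so nothing gets
    # extracted into an unexpected place.
    mkdir -p "$dest" || return 1

    # -x extract, -z decompress gzip on the fly, -f read from this file,
    # -C switch to the target directory before extracting.
    tar -xzf "$archive" -C "$dest"
}
```

For example, unpack_source ~/Downloads/tesseract-3.00.tar.gz ~/code leaves the source in ~/code/tesseract-3.00, and you can continue from step 5.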

Download the Tesseract example mentioned in the openFrameworks forum thread: http://julapy.com/source/tesseractExample.zip

     1. cd ~/Downloads
     2. unzip tesseractExample.zip
     3. mv tesseractExample ~/code/openFrameworks/apps/YOURPATHHERE
     4. Open the project in Xcode (I am using Xcode 3.2.4)
     5. You need to specify the path for tesseract data or the example won't work. To do this:
          a. Double click on "emptyExample" under "Executables" in the "Groups & Files" portion of your project window.
          b. Click "+" under "Variables to be set in the environment", and add a new variable:
               1. Name: TESSDATA_PREFIX
               2. Value: ../../../data/
          c. If you do not do this, your program will start and then crash. In the console you will see "Unable to load unicharset file /usr/local/share/tessdata/eng.unicharset", which means Tesseract was looking in the wrong directory.
     6. Build and Run!
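
If the example still crashes, it can help to check from a shell that the directory you gave for TESSDATA_PREFIX really contains language data. The following is a minimal sketch under one assumption, taken from the path in the error message above: Tesseract appends tessdata/ to the prefix and looks there for files named after the language (eng.*). has_tessdata is a hypothetical helper, not a Tesseract command.

```shell
#!/bin/sh
# Sketch: succeed if PREFIX/tessdata contains at least one LANG.* file.
# Assumption: Tesseract looks in "<TESSDATA_PREFIX>tessdata/", as the path
# in the error message suggests. has_tessdata is a hypothetical helper.
has_tessdata() {
    prefix="${1%/}"    # drop one trailing slash, if present
    lang="${2:-eng}"   # language code, defaults to eng

    for f in "$prefix"/tessdata/"$lang".*; do
        # If nothing matches, the pattern stays literal and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}
```

A relative value such as ../../../data/ is resolved against the working directory of the running app, so run the check from that directory: has_tessdata ../../../data/ && echo "tessdata found".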

You should see this:

Recognition in Live Video

 

Sorry, I couldn't upload my zip file. Grab the code from svn instead:

http://svn.roberttwomey.com/of/tesseractVideoExample/


Forward the future! More to come...