Software Projects

Receiving OCR Progress Updates when Using Tesseract on Android

Posted in Uncategorized by rmt on December 17, 2014

The time required to perform optical character recognition depends on the size of the image and the language of the text being recognized. When running OCR on more than a small block of text, or when using a language with many characters (like Chinese), the delay can be an annoyance to users.

To provide a better user experience, I’ve added some code written by Renard Wellnitz for Text Fairy to provide a progress callback method to the tess-two API. Objects implementing this method will receive updates during OCR with the percent complete and coordinates of the bounding box around the word that the OCR engine is currently working on.

The progress percentage can be used in a thermometer-style ProgressBar. The bounding boxes can be drawn on top of the display of the input image during recognition.
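If the input image is shown scaled down to fit the screen, the box coordinates (which refer to image pixels) have to be mapped to view coordinates before drawing. Here is a minimal sketch of that mapping in plain Java. It assumes uniform scaling with the image anchored at the top-left corner of the view; the Box and BoxScaler names are hypothetical helpers for illustration and are not part of tess-two.

```java
// Hypothetical helper, not part of tess-two: a plain rectangle in pixel coordinates.
final class Box {
    final int left, top, right, bottom;

    Box(int left, int top, int right, int bottom) {
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
    }
}

final class BoxScaler {
    // Assumes the image is scaled uniformly to fit inside the view and is
    // anchored at the view's top-left corner (no centering offset).
    static Box scaleToView(Box box, int imageWidth, int imageHeight,
                           int viewWidth, int viewHeight) {
        double scale = Math.min((double) viewWidth / imageWidth,
                                (double) viewHeight / imageHeight);
        return new Box((int) Math.round(box.left * scale),
                       (int) Math.round(box.top * scale),
                       (int) Math.round(box.right * scale),
                       (int) Math.round(box.bottom * scale));
    }
}
```

On Android the scaled values could then be passed to a Canvas drawing call. If the image is centered in the view, an offset would need to be added as well.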

Implementing this callback requires using an alternate constructor for the TessBaseAPI object and implementing the ProgressNotifier interface:

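ProgressBar progressBar = (ProgressBar) findViewById(R.id.progressBar1);

// Create the TessBaseAPI object, and register to receive OCR progress updates.
// "this" is the object that implements ProgressNotifier (for example, the Activity).
TessBaseAPI baseApi = new TessBaseAPI(this);

// Engine initialization (data path and language) is omitted here.
// Set the input image, then run recognition on page 0; the callback below
// is invoked while getHOCRText is running.
baseApi.setImage(myImage);
baseApi.getHOCRText(0);

// Elsewhere in the same class: the ProgressNotifier callback
@Override
public void onProgressValues(ProgressValues progressValues) {
    progressBar.setProgress(progressValues.getPercent());
}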
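
To try out the callback pattern without an Android device, here is a rough sketch in plain Java that only mimics the shape of the API described above. ProgressListener, FakeOcrEngine, and ProgressDemo are hypothetical stand-ins, not tess-two classes, and the percent calculation is an assumption made for the demo.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a progress callback interface; it mirrors the
// pattern only and is not the tess-two ProgressNotifier type.
interface ProgressListener {
    // wordBox holds {left, top, right, bottom} in image pixels
    void onProgress(int percent, int[] wordBox);
}

// Hypothetical fake engine that "recognizes" one word box at a time and
// reports progress to the listener passed to its constructor.
final class FakeOcrEngine {
    private final ProgressListener listener;

    FakeOcrEngine(ProgressListener listener) {
        this.listener = listener;
    }

    void recognize(List<int[]> wordBoxes) {
        int total = wordBoxes.size();
        for (int i = 0; i < total; i++) {
            // Real recognition work would happen here.
            int percent = (i + 1) * 100 / total;
            listener.onProgress(percent, wordBoxes.get(i));
        }
    }
}

final class ProgressDemo {
    public static void main(String[] args) {
        List<int[]> boxes = new ArrayList<>();
        boxes.add(new int[] {10, 10, 80, 30});
        boxes.add(new int[] {90, 10, 150, 30});
        boxes.add(new int[] {10, 40, 120, 60});
        boxes.add(new int[] {130, 40, 200, 60});

        // The listener here just prints; a UI would update a progress bar
        // and draw the box over the image instead.
        FakeOcrEngine engine = new FakeOcrEngine((percent, box) ->
                System.out.println(percent + "% word at (" + box[0] + "," + box[1] + ")"));
        engine.recognize(boxes);
    }
}
```

As with the real API, the listener is handed to the engine at construction time, so the recognition call itself needs no extra arguments.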