Software Projects

What Language is This? Language Identifier for Android

Posted in Uncategorized by rmtheis on October 3, 2019

This app helps you identify the language of a given piece of writing. It uses an n-gram probabilistic model to make an educated guess about the natural language in which your text sample is written: the Language Identifier app can answer the question “What language is this?”.
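The app's internals aren't published here, but a character n-gram identifier in the spirit of the classic "out-of-place" rank measure can be sketched in a few lines of Python. The profile texts and helper names below are illustrative stand-ins, not the app's actual code:

```python
from collections import Counter

def ngrams(text, n=3):
    # Character n-grams, padded so word boundaries count too
    text = ' ' + text.lower() + ' '
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def profile(text, n=3, top=300):
    # A language "fingerprint": each frequent n-gram mapped to its frequency rank
    counts = Counter(ngrams(text, n))
    return {g: rank for rank, (g, _) in enumerate(counts.most_common(top))}

def identify(sample, profiles):
    # Pick the language whose profile ranks the sample's n-grams best;
    # n-grams missing from a profile pay a maximal "out of place" penalty
    def distance(p):
        return sum(p.get(g, len(p)) for g in ngrams(sample))
    return min(profiles, key=lambda lang: distance(profiles[lang]))

profiles = {
    'english': profile("the quick brown fox jumps over the lazy dog "
                       "and then the dog ran off with the other dogs"),
    'spanish': profile("el rapido zorro marron salta sobre el perro perezoso "
                       "y luego el perro se marcho con los otros perros"),
}
print(identify("the dog jumps over the fox", profiles))
```

A real identifier would train each profile on a large corpus per language rather than a sentence, but the scoring logic is the same.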

Red Tide Tracker for Android

Posted in Uncategorized by rmtheis on October 3, 2019

This app tracks harmful algal blooms in coastal areas of Florida: Red Tide Tracker.

Wildfire Info Tracker for Android

Posted in Uncategorized by rmtheis on October 3, 2019

This app lets you track current wildfire perimeters as well as smoke areas in the United States: Fireguard for Android

The Fireguard app provides a geographic view of wildfire activity worldwide. For areas in the United States, the app also overlays curated wildfire perimeters and satellite-detected regions of smoke onto a single map view, and includes VIIRS/MODIS hotspot detections from NASA/NOAA/USGS.

Using both VirtualBox and Android Emulators on AMD-based PCs

Posted in Uncategorized by rmtheis on July 28, 2018

With an AMD-based PC running Windows 10, changing the hypervisor launch type is required in order to switch between using VirtualBox and using x86 Android emulators.

VirtualBox requires the launch type to be “off”, and x86 Android emulators require the launch type to be “auto”. Enter the following commands in a Command Prompt or PowerShell window started with “Run As Administrator”, then reboot for the change to take effect.


To enable VirtualBox:

bcdedit /set hypervisorlaunchtype off


To enable x86 Android Emulators:

bcdedit /set hypervisorlaunchtype auto


Google Goggles isn’t dead

Posted in Uncategorized by rmtheis on February 4, 2017

Google Goggles is an app that first appeared in 2009 and was ahead of its time, yet it was too unfocused to gain widespread user acceptance despite getting over 10 million downloads. Marketed as a generalized way to digitally identify and make sense of anything you could point your camera at, Google Goggles left many users confused as to what exactly to do with it. Now, several years later, the app has been removed from the iOS app store, and the Android version hasn’t been updated in 2-1/2 years. But despite the fading relevance of the app itself, the technologies that made up Google Goggles are very much alive, having been chopped up and redistributed among Google’s other offerings.

General object identification in images: Cloud Vision API

The Google Cloud Vision API recognizes things like landmarks, artwork, and products. The Goggles app collected data to train the cloud vision API models in the same way that the 1-800-GOOG-411 telephone directory assistance service collected voice data to train Google’s speech recognition models. For example, data is gathered in part by Goggles’ “Search from camera” mode that vacuums up all the photos you take with your phone camera. Now the resulting object recognition capability is available for a fee from the Cloud Vision API, where Google continues to gather data and improve its models. Image-based search is also available through the Google search engine and a shortcut in the Chrome web browser.
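As a concrete illustration, the Cloud Vision API's v1 REST endpoint accepts a JSON body listing base64-encoded images and the annotation features to run. This is a minimal sketch (the endpoint and field names are from the public v1 API; authentication and the HTTP call itself are omitted):

```python
import base64

def build_annotate_request(image_bytes, feature_type='LABEL_DETECTION', max_results=5):
    # JSON body for POST https://vision.googleapis.com/v1/images:annotate
    return {
        'requests': [{
            'image': {'content': base64.b64encode(image_bytes).decode('ascii')},
            'features': [{'type': feature_type, 'maxResults': max_results}],
        }]
    }

# Landmark recognition, one of the Goggles-style capabilities the API exposes
body = build_annotate_request(b'<raw image bytes>', 'LANDMARK_DETECTION')
```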

On-device frame-based processing, text detection and OCR, and barcode scanning: Mobile Vision API

Google’s Mobile Vision API is a separate system from the Cloud Vision API that works offline – that is, it can run on a phone when there’s no network connection available because it functions as a part of Google Play Services. It provides optical character recognition that’s based on Tesseract and currently works for languages with Latin-based alphabets. The Mobile Vision API also provides the same optical flow-based object tracking that the Goggles app used. The Mobile Vision API is extensible and developer-friendly too. If a developer wants to implement a custom image processing system such as, say, overlaying graphics onto faces like the Snapchat dog face filter, they can do that with the Mobile Vision API.

Translation of text visible from the camera: Google Translate

The text translation features originally available in Goggles have been superseded by those now available in the Google Translate app. Google Translate provides a better interface than Goggles because the language is already selected by the user, eliminating the need for identifying the language of printed text in a given image, thereby removing a potential source of error. Further, Google Translate’s on-device image processing allows for fast OCR that enables a quick translation at the per-word (but not per-sentence) level.

Exploring your world with a camera: Google Cardboard VR headset

The idea of parsing and augmenting what you see using image processing is incorporated into Google Cardboard. The design is such that it allows the wearer to get input from the camera as they wear the headset. We may start to see Street View and dashcam-gathered data integrated into this type of an augmented reality system.

New hardware-software integration: Pixel handset

Google Goggles uses an image blur detection algorithm to determine when the device camera is out of focus, triggering the camera autofocus cycle in response, and thereby setting up the camera input for optimal scanning. A similar integration of software and hardware is used in the accelerometer-based camera stabilization incorporated into the Pixel’s high-end camera, which provides a smooth and fast camera input even when the user has shaky hands.
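Goggles' exact blur detector isn't documented, but a common way to score focus is the variance of the image's Laplacian: sharp frames have strong second derivatives at edges, while blurry frames don't. A sketch with NumPy (the threshold is an illustrative guess and would need tuning per camera):

```python
import numpy as np

def laplacian_variance(gray):
    # Focus score: variance of a 4-neighbor Laplacian of a grayscale image
    g = gray.astype(float)
    lap = (-4 * g
           + np.roll(g, 1, axis=0) + np.roll(g, -1, axis=0)
           + np.roll(g, 1, axis=1) + np.roll(g, -1, axis=1))
    return lap.var()

def is_blurry(gray, threshold=100.0):
    # Below the threshold, trigger an autofocus cycle before scanning
    return laplacian_variance(gray) < threshold

sharp = (np.indices((8, 8)).sum(axis=0) % 2) * 255   # checkerboard: many edges
smooth = np.full((8, 8), 128)                        # uniform: no edges at all
```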

Future capabilities

When the Goggles app was split apart, its on-device capabilities ended up in the Mobile Vision API, and its cloud-based capabilities ended up in the Cloud Vision API. Given the current trends, the Google Mobile Vision API is best positioned to reveal important capabilities over the coming years. As device CPU speeds increase and multicore handsets proliferate, more and more powerful image processing will be able to run on the device itself, without incurring the slowdown required to transmit images to a cloud-based API. Video input, as an alternative to one-frame-at-a-time image processing, will become more achievable. Developers will have the flexibility to create a variety of apps around Google’s models through their APIs. Users will be able to make sense of image-based data with more speed and clarity. We’ll see more apps doing something like serving as a generalized scanner for all camera input, recognizing objects of all types. The incremental improvements spearheaded by Google will continue to power new apps as the company turns its old experiments into new APIs and products.

Parallel Machine Translations as an Aid to Human Translators

Posted in Uncategorized by rmtheis on October 18, 2015

Anyone who’s used machine translation tools like Google Translate knows that machine translation is an inexact science. Mistakes and mistranslations are common, and the accuracy of machine translations seems to range from “acceptable” to “completely wrong”. Clearly, there is an opportunity for new approaches to help people get better results from machine translation.

Translation inaccuracies arise due to imperfect language models and unclear input text. Machine translation systems tend to have problems when translating text between languages that are very linguistically different from one another. The format of the text makes a huge difference too: Input text that is short or idiomatic similarly leads to inaccuracies.

Even with these inaccuracies, the resulting imperfect translations can leave the user with the gist of the original meaning, or at least a hint or starting point for a better translation. Human translators and those with some proficiency in the target language will often use more than one system (both Google and Microsoft, for example) to help them compare results and choose between the translations.

Below is a link to an Android app I’ve developed that performs machine translations in parallel using different systems to make this comparison easier. The app shows results on the screen at the same time from different machine translation systems: Google Translate, Microsoft Translator, and Yandex Translate. I’d be interested in getting feedback on whether this approach is helpful for mobile users.
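The fan-out itself is simple. With stub functions standing in for the real Google/Microsoft/Yandex API calls (which need API keys and network access), the parallel query looks roughly like this:

```python
from concurrent.futures import ThreadPoolExecutor

def translate_in_parallel(text, translators):
    # Query every translation system at once and collect the results
    # side by side; `translators` maps a system name to a callable.
    with ThreadPoolExecutor(max_workers=len(translators)) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in translators.items()}
        return {name: f.result() for name, f in futures.items()}

# Hypothetical stubs in place of the real translation API clients:
stubs = {
    'google': lambda text: 'hola mundo',
    'microsoft': lambda text: 'hola, mundo',
}
results = translate_in_parallel('hello world', stubs)
```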

Android app on Google Play

Receiving OCR Progress Updates when Using Tesseract on Android

Posted in Uncategorized by rmtheis on December 17, 2014

The running time required to perform optical character recognition is influenced by the size of the image and the language of the text being recognized. When running OCR on more than a small block of text, or when using a language with many characters (like Chinese), the time delay required to perform OCR can be an annoyance to users.

To provide a better user experience, I’ve added some code written by Renard Wellnitz for Text Fairy to provide a progress callback method to the tess-two API. Objects implementing this method will receive updates during OCR with the percent complete and coordinates of the bounding box around the word that the OCR engine is currently working on.

The progress percentage can be used in a thermometer-style ProgressBar. The bounding boxes can be drawn on top of the display of the input image during recognition.

Implementing this callback requires using an alternate constructor for the TessBaseAPI object and implementing the ProgressNotifier interface:

ProgressBar progressBar = (ProgressBar) findViewById(R.id.progress_bar); // use your layout's ProgressBar id

// Create the TessBaseAPI object, and register to receive OCR progress updates
// (here, the enclosing Activity implements TessBaseAPI.ProgressNotifier)
TessBaseAPI baseApi = new TessBaseAPI(this);

@Override
public void onProgressValues(TessBaseAPI.ProgressValues progressValues) {
    // Percent complete; the current word's bounding box is also available
    // from progressValues for drawing over the input image
    progressBar.setProgress(progressValues.getPercent());
}

Building an Apertium Standalone Language Pair Translation Jar Package for Android on Ubuntu

Posted in Uncategorized by rmtheis on June 16, 2013

This is a procedure for creating standalone packages that can be bundled with Android apps for supporting in-app language translation while offline–that is, without a cellular or wifi data connection.

Requires: Android SDK

Install required packages:
sudo apt-get install subversion libxml2-dev xsltproc flex libpcre3-dev gawk libxml2-utils

Get Apertium repository code:
svn co apertium

Compile and install lttoolbox:
cd apertium/trunk/lttoolbox
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./
sudo make install
sudo ldconfig

Compile and install apertium:
cd ../apertium
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./
sudo make install
sudo ldconfig

Compile and install lttoolbox-java:
cd ../lttoolbox-java
sudo make install

Compile a language pair (for example, English-Spanish):
(Android-related note: if you see ‘you don’t have cg-proc installed’, then this pair requires the constraint grammar package and is not Android compatible.)
cd ../apertium-en-es
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ./
sudo make install
echo 'test' | apertium en-es

Create a symbolic link in the former location of Android’s ‘dx’:
cd /home/$USER/android-sdk-linux/platform-tools
ln -s ../build-tools/17.0.0/dx dx

Compile the standalone package for a language pair (for example, English-Spanish):
cd apertium/trunk/lttoolbox-java
export LTTOOLBOX_JAVA_PATH='/usr/local/share/apertium/lttoolbox.jar'
export ANDROID_SDK_PATH="/home/$USER/android-sdk-linux"
./apertium-pack-j /usr/local/share/apertium/modes/en-es.mode /usr/local/share/apertium/modes/es-en.mode

At this point apertium-en-es.jar has been created in apertium/trunk/lttoolbox-java.



Looking For Words With an Edit Distance of 1 or 2 From Other Words

Posted in Uncategorized by rmtheis on November 3, 2012

This code is a modification of Peter Norvig’s spelling corrector that adds the closest_nearby_word() method, which identifies the most-frequently-seen correctly-spelled word that has an edit distance of 1 or 2 from the given correctly-spelled word.

# -*- coding: utf-8 -*-

# An altered version of Peter Norvig's spelling corrector (Python 2)

from collections import Counter
import re

# Get the whitespace-delimited words from a text, minus any punctuation
def words(text): return re.findall('[a-z]+', text.lower())

# Count the frequency with which each word occurs
def train(features):
    model = Counter()
    for f in features:
        model[f] += 1
    return model

# Run training using a book with words we'll consider to be spelled correctly
NWORDS = train(words(file('big.txt').read()))

# Get strings with an edit distance of 1 from the given word
def edits1(word):
    alphabet = 'abcdefghijklmnopqrstuvwxyz'
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits for c in alphabet if b]
    inserts = [a + c + b for a, b in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)

# Get known words with an edit distance of 2 from the given word
def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

# Get any known words matching the given mutated words
def known(words):
    return set(w for w in words if w in NWORDS)

# Suggest a correction by mutating a word and choosing the most likely replacement
# based on how often the mutated words appear in the trained model NWORDS
def correct(word):
    candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
    return max(candidates, key=NWORDS.get)

# Get the likeliest correctly-spelled word near the given word, preferring
# candidates at edit distance 1 before falling back to edit distance 2
def closest_nearby_word(word):
    nearby = (known(edits1(word)) - set([word])) or (known_edits2(word) - set([word]))
    if not nearby:
        return set()
    return max(nearby, key=NWORDS.get)

#print correct('speling')

# Run some test cases for finding "nearby" words
for w in frozenset(['rational', 'woman', 'rogue', 'effect', 'started', 'rein',
                    'scalded', 'mislead', 'reality', 'whit', 'marshal', 'voila',
                    'aide', 'tiered', 'county', 'fires', 'stated', 'soldier',
                    'beset', 'affect', 'vice', 'wreck', 'spayed', 'complimentary',
                    'their', 'principal', 'moral', 'especially', 'steal',
                    'personal', 'why', 'heroine', 'descendant', 'baited',
                    'interested', 'sole', 'think', 'physics', 'corps', 'discrete']):
    print w, "-", closest_nearby_word(w)


With a better training corpus, this method could possibly be used to identify misspelled words that are overlooked by most spell checkers. But better starting points are available.

think - thing
corps - crops
stated - states
baited - waited
aide - side
beset - best
fires - fire
scalded - scolded
moral - morel
whit - what
principal - principals
wreck - wrack
personal - set([])
heroine - heroin
reality - set([])
their - theirs
interested - set([])
voila - set([])
woman - women
rational - national
started - stated
sole - some
effect - effects
rogue - vogue
affect - effect
why - who
descendant - descendants
county - count
spayed - stayed
especially - specially
vice - voice
physics - set([])
discrete - discreet
tiered - tired
mislead - misled
soldier - soldiers
rein - vein
complimentary - complementary
steal - steel
marshal - marshall

A Continuous Floating-Point Adaptation of Conway’s Game of Life

Posted in Uncategorized by rmtheis on October 13, 2012

This video shows a continuous, floating-point adaptation of Conway’s Game of Life.

I rendered this video by running Ready using the SmoothLifeL parameter set. World size is 2048×2048. Stephan Rafler’s paper explains the math behind the video. Rendering took eight hours on my video card.
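For a feel of the idea (this is a toy sketch, not Rafler's actual SmoothLife formulation, whose integration kernels and parameters come from the paper), here is a continuous-state Life step: cell values live in [0, 1], the 8-neighbor sum is real-valued, and Life's hard B3/S23 rule is replaced by smooth bumps:

```python
import numpy as np

def continuous_life_step(state):
    # Real-valued 8-neighbor sum on a toroidal grid (wrap via np.roll)
    n = sum(np.roll(np.roll(state, dy, axis=0), dx, axis=1)
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if (dy, dx) != (0, 0))
    # Smooth bumps replacing the hard birth (n == 3) and survival (2 <= n <= 3) rules
    bump = lambda x, center, width: np.exp(-((x - center) / width) ** 2)
    birth = bump(n, 3.0, 0.5)
    survive = bump(n, 2.5, 1.0)
    # Blend: empty-ish cells follow the birth rule, full-ish cells the survival rule
    return np.clip((1 - state) * birth + state * survive, 0.0, 1.0)

state = np.random.rand(32, 32)
state = continuous_life_step(state)
```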

Frames were converted to video using:

ffmpeg -s hd1080 -r 30 -b 9600 -i frame_%06d.png video.mp4