Shape Recognition

The shape recognition algorithm behind Comp’s drawing gestures combined machine-learning-tuned models with hand-coded rules to reliably recognize hand-drawn shapes that ranged widely in quality.

Data Collection and Tagging

To recognize drawn shapes correctly, it was important to collect as much sample data as possible. Shapes drawn by our internal testers and by users in our beta program were added to a database, which we used as training and test data to improve and evaluate the algorithm.

Initially, each drawing was individually tagged with the correct shape, using an internal tool I developed, but this quickly became untenable as we collected tens of thousands of gestures.

So, building on the fact that our algorithm was already becoming reliable, I added the ability to “batch tag” gestures based on how the algorithm had categorized them. This way, you could look at any recently collected data, filter down to 500 shapes categorized as circles, and quickly approve the 490 of them that were recognized correctly, leaving only the remaining 10 to be processed individually.

Individually tagging drawings, before “batch tagging” was implemented.
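
The batch-tagging workflow described above can be sketched roughly as follows. This is only an illustration, not the internal tool itself: the `Gesture` record, the `batch_tag` function, and the `rejected_ids` parameter are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Gesture:
    gesture_id: int
    predicted: str                # label assigned by the recognizer
    tagged: Optional[str] = None  # human-confirmed label, None until reviewed


def batch_tag(gestures, predicted_label, rejected_ids):
    """Approve every untagged gesture predicted as `predicted_label`,
    except those a reviewer flagged in `rejected_ids`.
    Returns the gestures that still need individual tagging."""
    needs_review = []
    for g in gestures:
        if g.predicted != predicted_label or g.tagged is not None:
            continue  # outside this batch, or already tagged
        if g.gesture_id in rejected_ids:
            needs_review.append(g)
        else:
            g.tagged = predicted_label
    return needs_review


# 500 gestures categorized as circles, plus one unrelated rectangle.
gestures = [Gesture(i, "circle") for i in range(500)]
gestures.append(Gesture(500, "rectangle"))

rejected = set(range(10))  # the reviewer spotted 10 misrecognized circles
leftover = batch_tag(gestures, "circle", rejected)
print(len(leftover))  # 10
```

The reviewer only has to mark the exceptions; everything else in the filtered batch inherits the recognizer's label in one step.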

Machine Learning + Hand-Coded Rules

For optimum performance, the recognition engine used machine-learning-tuned models to distinguish difficult-to-recognize shapes. However, these could also be combined with hand-coded rules to write higher-level recognizers or to prototype new additions.

For example, when we decided to add the “eyedropper” gesture, I did not need to collect samples covering enough relevant permutations of the gesture. Instead, I wrote a simple rule on top of the existing ability to recognize circles and dots [1], and was able to test the new recognizer with only a few lines of code.

The app could already recognize circles and dots, so adding an “eyedropper” gesture was easy, without retraining any recognition models.
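
A rule layered on top of existing recognizers might look something like the sketch below. The text does not describe how the eyedropper gesture is actually drawn, so the rule here (one circle with one dot inside it) is purely an assumption for illustration, as are the `RecognizedShape` type and the `is_eyedropper` function.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class RecognizedShape:
    kind: str      # label from the existing recognizers, e.g. "circle" or "dot"
    x: float       # center x
    y: float       # center y
    radius: float  # approximate radius (0 for a dot)


def is_eyedropper(shapes):
    """Hypothetical rule: exactly one circle and one dot, with the dot
    inside the circle. The real gesture definition is not given in the text."""
    if len(shapes) != 2:
        return False
    circles = [s for s in shapes if s.kind == "circle"]
    dots = [s for s in shapes if s.kind == "dot"]
    if len(circles) != 1 or len(dots) != 1:
        return False
    circle, dot = circles[0], dots[0]
    # The dot counts as "inside" if its center is closer than the radius.
    return hypot(dot.x - circle.x, dot.y - circle.y) < circle.radius


circle = RecognizedShape("circle", 0.0, 0.0, 10.0)
inside_dot = RecognizedShape("dot", 2.0, 3.0, 0.0)
outside_dot = RecognizedShape("dot", 20.0, 0.0, 0.0)
```

Because the rule only consumes labels that the trained models already produce, no recognition model needs to be retrained to try it out.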

Extensive Automated Testing

All of the data we collected could be re-run against our recognition algorithms whenever changes were made, so that we could easily identify regressions. When a new hand-coded rule was added, we could diff the results of these tests to quickly see whether the rule interfered with the recognition of any existing drawings.

Alignments and placements could also be tested in this way, meaning changes to those algorithms could also be evaluated more easily.
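
The regression check described in this section can be sketched as a diff between two recognizer runs over the same tagged samples. The function names, the toy recognizers, and the sample data below are all made up for illustration; the real test harness is not shown in the text.

```python
def run_recognizer(recognize, samples):
    """Map each sample id to the label the recognizer returns."""
    return {sample_id: recognize(points) for sample_id, points in samples.items()}


def diff_results(before, after):
    """Return {sample_id: (old_label, new_label)} for every changed sample."""
    return {
        sample_id: (before[sample_id], after.get(sample_id))
        for sample_id in before
        if before[sample_id] != after.get(sample_id)
    }


def find_regressions(changes, tagged):
    """Changed samples that used to match the tagged label and no longer do."""
    return sorted(
        sample_id
        for sample_id, (old, new) in changes.items()
        if old == tagged[sample_id] and new != tagged[sample_id]
    )


# Toy data: sample id -> drawn points, plus the human-tagged labels.
samples = {
    "a": [(0, 0)],
    "b": [(0, 0), (1, 1)],
    "c": [(0, 0), (1, 0), (1, 1)],
}
tagged = {"a": "dot", "b": "line", "c": "line"}


def old_recognize(points):
    return "dot" if len(points) == 1 else "line"


def new_recognize(points):
    # A newly added hand-coded rule that turns out to be too greedy.
    if len(points) == 3:
        return "triangle"
    return old_recognize(points)


before = run_recognizer(old_recognize, samples)
after = run_recognizer(new_recognize, samples)
changes = diff_results(before, after)
regressions = find_regressions(changes, tagged)
print(changes)      # {'c': ('line', 'triangle')}
print(regressions)  # ['c']
```

Diffing against the previous run, rather than only checking overall accuracy, points directly at the individual drawings a new rule has affected.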

C++ Port

The library was originally created in Objective-C, but I later worked with Adobe’s Core Technologies team to port it to C++ for use in the Android version of Comp, as well as in other products. As we ported the library, I took advantage of the move to rewrite several key features of the recognition algorithm to be as much as 3x faster than the original version, and also restructured the code to make prototyping and future additions easier.

Footnotes

  1. Before dots were used in the eyedropper gesture, they were used in the gesture for text, and also as a way to commit any pending shapes.