Google Cloud Vision - Numbers and Numerals OCR

At the moment it is not possible to add constraints or to specify an expected number format in Vision API requests, as mentioned here (by the Project Manager of the Cloud Vision API).

You can also check all the possible request parameters (in the API reference); none of them lets you specify a number format. Currently the only options are:

  • latLongRect: specifies the location of the image
  • languageHints: indicates the expected language for text_detection (list of supported languages here)
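
To make the two options above concrete, here is a minimal sketch of a request body for the REST images:annotate method, built as a plain dictionary so it can be inspected without calling the API. The helper name build_annotate_request and the "en" hint are my own illustration, not something from the API reference; the field names (imageContext, languageHints, latLongRect) are the ones the request format uses.

```python
import base64
import json


def build_annotate_request(image_bytes, language_hints=None, lat_long_rect=None):
    """Build the JSON body for a single-image images:annotate call.

    Hypothetical helper for illustration only; it does not send anything.
    """
    image_context = {}
    if language_hints:
        # Expected language(s) of the text, e.g. ["en"].
        image_context["languageHints"] = list(language_hints)
    if lat_long_rect:
        # Dict with "minLatLng" / "maxLatLng" entries describing the image location.
        image_context["latLongRect"] = lat_long_rect

    request = {
        # Image bytes are sent base64-encoded in the "content" field.
        "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
        "features": [{"type": "TEXT_DETECTION"}],
    }
    if image_context:
        request["imageContext"] = image_context
    return {"requests": [request]}


# Placeholder bytes stand in for a real image file here.
body = build_annotate_request(b"fake image bytes", language_hints=["en"])
print(json.dumps(body, indent=2))
```

Note that nothing in imageContext describes a number format; the language hint is the closest available knob.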

I assume you have already checked the multiple responses (each covering a different image region) to see whether you could reconstruct the text from the locations of the individual digits?
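
A rough sketch of that reconstruction idea, assuming the number sits on a single line and that the annotations are dictionaries shaped like the textAnnotations entries of the REST response (a description plus boundingPoly vertices). The function name and the sample data are made up for illustration. As far as I know, the first textAnnotations entry holds the whole detected text, so it is skipped here.

```python
def digits_left_to_right(annotations):
    """Join the digits found in each annotation, ordered by horizontal position.

    Assumes a single line of text; multi-line layouts would also need the y values.
    """
    fragments = []
    for ann in annotations:
        # Keep only ASCII digits and drop letters or punctuation the OCR mixed in.
        digits = "".join(ch for ch in ann["description"] if ch in "0123456789")
        if not digits:
            continue
        # A vertex may omit "x" when the value is 0, so default to 0.
        xs = [v.get("x", 0) for v in ann["boundingPoly"]["vertices"]]
        fragments.append((min(xs), digits))
    # Order fragments by their left edge.
    fragments.sort(key=lambda fragment: fragment[0])
    return "".join(digits for _, digits in fragments)


def box(left, right):
    """Made-up helper that builds a boundingPoly for the sample data."""
    return {"vertices": [{"x": left, "y": 0}, {"x": right, "y": 0},
                         {"x": right, "y": 20}, {"x": left, "y": 20}]}


sample = [
    {"description": "17 42 No. 9", "boundingPoly": box(10, 260)},  # full text
    {"description": "17", "boundingPoly": box(120, 160)},
    {"description": "42", "boundingPoly": box(10, 50)},
    {"description": "No.", "boundingPoly": box(60, 110)},
    {"description": "9", "boundingPoly": box(220, 260)},
]

result = digits_left_to_right(sample[1:])  # skip the full-text entry
print(result)
```

With the sample above, the fragments are reordered by their left edge and the non-numeric "No." fragment is dropped.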

Note that the Vision API and text_detection are not optimized for your data specifically. If you have a lot of annotated data, another option is to build your own model using TensorFlow. This blog post explains a system set up to detect number plates (with a specific number format). All the code is available on GitHub, and the problem seems closely related to yours.


I am unable to tell you why this works; perhaps it has to do with how the language is read (o vs 0, l vs 1, etc.). But I have read that when you use OCR specifically to look for numbers, you should set the detection language to "Korean", and that is what I do. It works exceptionally well for me and has greatly improved the accuracy.