Getting the Google Vision text detection response line by line

You can specify the detection feature in your JSON request. For an image of a receipt like this one, DOCUMENT_TEXT_DETECTION gives good results:

{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "https://i.stack.imgur.com/TRTXo.png"
        }
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ]
    }
  ]
}

You can copy the JSON above and paste it into the Request Body field of the Try This API pane on the documentation page. Result:

4x LOwenbräu Original a 3,00 12,00 1
8x Weissbier dunkel a 3, 3026, 40 1
3x Hefe-Weissbier a 3,30990 1
1x Saft 0,25 2, 50 1
1x Grosses Wasser 2, 40 1
1x Vegetarische Varia 9,90 1
1x Gyros 8,90 1
1x Baby Kalamari Gefu 12,90 !
2x Gyros Folie a 9,9019, 80 1
1x Schaf skäse Ofen 6,90 1
1x Bifteki Metaxa 11,90 1
1x Schweinefilet Meta 13,90 1
1x Stifado 14, 90 1
1x Tee 2, 10 1
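
If you would rather send the same request from code than from the Try This API pane, a minimal Python sketch could look like the one below. It assumes an API key with the Vision API enabled is available; the helper names build_request and detect_text are made up for illustration, and only the standard library is used.

```python
import json
import urllib.request

# Public REST endpoint for image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_request(image_uri, feature_type="DOCUMENT_TEXT_DETECTION"):
    """Build the same JSON body as in the answer, for any image URI."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": feature_type}],
            }
        ]
    }


def detect_text(image_uri, api_key):
    """POST the request and return the full detected text (hypothetical helper).

    Assumes api_key is a valid key for a project with the Vision API enabled.
    """
    body = json.dumps(build_request(image_uri)).encode("utf-8")
    req = urllib.request.Request(
        f"{VISION_ENDPOINT}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    # fullTextAnnotation is missing when no text was found in the image.
    annotation = result["responses"][0].get("fullTextAnnotation", {})
    return annotation.get("text", "")
```

Calling detect_text("https://i.stack.imgur.com/TRTXo.png", "YOUR_API_KEY") should return text similar to the result above, although the exact output can change as the service is updated.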

Google Vision is much less configurable than Tesseract at the moment. Since Google is behind both projects, guess which one is going to get higher priority in the future?


This may be a late answer, but I am adding it for future reference. When pieces of text are very far apart, DOCUMENT_TEXT_DETECTION does not produce proper line segmentation either.

The following code does simple line segmentation based on the character polygon coordinates.

https://github.com/sshniro/line-segmentation-algorithm-to-gcp-vision
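
To give a rough idea of how coordinate-based line segmentation can work, here is a much simplified Python sketch. It is not the algorithm from the repository above: it works on word-level boxes rather than character polygons, it assumes the text is roughly horizontal, and the function name group_into_lines and the tolerance parameter are made up for illustration. The input is a list of word entries shaped like the items in textAnnotations (skipping the first item, which holds the whole text), each with a description and a boundingPoly.

```python
def _box(word):
    """Return (left edge, vertical center, height) of a word's bounding box."""
    vertices = word["boundingPoly"]["vertices"]
    # The API may omit a coordinate when its value is 0, hence the default.
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    return min(xs), (min(ys) + max(ys)) / 2, max(ys) - min(ys)


def group_into_lines(words, tolerance=0.5):
    """Group words into text lines by vertical position (illustrative sketch).

    A word joins the current line when its vertical center lies within
    tolerance * height of that line's running center; otherwise it starts
    a new line. Words inside a line are then ordered left to right.
    """
    items = sorted(
        ((_box(w), w["description"]) for w in words),
        key=lambda item: item[0][1],  # sort by vertical center
    )
    lines = []
    for (left, center_y, height), text in items:
        current = lines[-1] if lines else None
        if current and abs(center_y - current["cy"]) <= tolerance * max(
            height, current["height"]
        ):
            current["words"].append((left, text))
            # Update the running mean of the line's vertical center.
            current["cy"] += (center_y - current["cy"]) / len(current["words"])
            current["height"] = max(current["height"], height)
        else:
            lines.append({"cy": center_y, "height": height, "words": [(left, text)]})
    return [" ".join(text for _, text in sorted(line["words"])) for line in lines]
```

Because grouping depends only on vertical position, a price on the far right of a receipt ends up on the same line as the item name on the left, however wide the gap between them. Rotated or strongly skewed images would need the polygons to be straightened first.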