How to detect Text Area from image?

Well, I'm not very experienced in image processing, but I hope I can help with a theoretical approach.

In most cases, text forms parallel, horizontal rows, and the space between rows contains mostly background pixels. This can be used to solve the problem. If you combine all the pixel columns of the image into one (for example, by averaging each row), you get a 1-pixel-wide image as output. When the input image contains text, the output will very likely show a periodic pattern, where dark areas are followed by brighter areas repeatedly. The "groups" of darker pixels indicate the position of the text rows, while the brighter "groups" indicate the gaps between the individual rows. You'll probably find that the brighter areas are much smaller than the darker ones. Text is much more regular than most other picture elements, so it should be easy to separate.
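To make the column-merging idea concrete, here is a minimal sketch using NumPy. It assumes a grayscale image stored as a 2-D array (0 = black, 255 = white) and uses a plain row average as the way of combining the columns; the synthetic `page` and the name `row_profile` are just illustrations, not part of any library.

```python
import numpy as np

def row_profile(gray):
    """Collapse a 2-D grayscale image (rows x cols) into a
    1-pixel-wide column by averaging every row."""
    gray = np.asarray(gray, dtype=float)
    return gray.mean(axis=1)

# Synthetic page (an assumption for the demo): white background
# with three dark bands standing in for lines of text.
page = np.full((60, 200), 255.0)
for top in (10, 30, 50):
    page[top:top + 6, 20:180] = 0.0

profile = row_profile(page)
# Rows that cross a "text line" come out darker than the gaps.
print(profile[12], profile[20])
```

With real scans you would probably binarize or smooth the image first, since noise and illustrations blur the difference between text rows and gaps.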

You have to implement a procedure to detect these periodic recurrences. Once the script can determine that the input picture has these characteristics, there's a high chance that it contains text. (However, this approach can't distinguish between actual text and simple horizontal stripes...)
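One possible way to detect the recurrence, sketched under my own assumptions: threshold the profile at the midpoint between its darkest and brightest value, collect the runs of dark rows, and accept the image if there are enough runs at a roughly constant spacing. The names `dark_runs` and `looks_like_text` and the parameters `min_rows` and `tolerance` are hypothetical and would need tuning on real data.

```python
import numpy as np

def dark_runs(profile):
    """Return (start, end) row ranges, end exclusive, where the
    profile is darker than the midpoint of its value range."""
    profile = np.asarray(profile, dtype=float)
    threshold = (profile.min() + profile.max()) / 2.0
    dark = profile < threshold
    # Pad with False so every run has both a start and an end edge.
    padded = np.concatenate(([False], dark, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[0::2].tolist(), edges[1::2].tolist()))

def looks_like_text(profile, min_rows=3, tolerance=0.25):
    """Heuristic: at least `min_rows` dark runs whose start positions
    are spaced almost evenly (std of spacing <= tolerance * mean)."""
    runs = dark_runs(profile)
    if len(runs) < min_rows:
        return False
    starts = np.array([start for start, _ in runs], dtype=float)
    pitch = np.diff(starts)
    return bool(pitch.std() <= tolerance * pitch.mean())

# Demo profile: three evenly spaced dark bands on a bright background.
demo = np.full(60, 255.0)
for top in (10, 30, 50):
    demo[top:top + 6] = 51.0
print(dark_runs(demo), looks_like_text(demo))
```

As noted above, evenly spaced horizontal stripes pass this check too, so it is only a first filter.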

For the next step, you must find a way to determine the boundaries of the paragraphs, using the method described above. I'm thinking of a fairly naive algorithm, which would divide the input image into narrow vertical stripes (50-100 px wide) and check each stripe separately. Then it would compare these results to build a map of the areas likely to contain text. This method wouldn't be very accurate, but that probably won't bother the OCR system.
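The stripe idea could look roughly like this. It is a simplified sketch: it only finds the horizontal extent of text (column ranges), it counts dark runs per stripe instead of checking their spacing, and every name and default (`text_stripes`, `merge_stripes`, `stripe_width=50`, `min_rows=3`) is my own assumption.

```python
import numpy as np

def count_dark_runs(profile):
    """Count the runs of rows darker than the midpoint of the profile."""
    profile = np.asarray(profile, dtype=float)
    lo, hi = profile.min(), profile.max()
    if hi - lo < 1e-9:          # flat profile: nothing to find
        return 0
    dark = profile < (lo + hi) / 2.0
    padded = np.concatenate(([False], dark))
    # A run starts wherever a dark row follows a non-dark row.
    return int(np.count_nonzero(padded[1:] & ~padded[:-1]))

def text_stripes(gray, stripe_width=50, min_rows=3):
    """Flag each vertical stripe that shows at least `min_rows` dark runs."""
    gray = np.asarray(gray, dtype=float)
    flags = []
    for left in range(0, gray.shape[1], stripe_width):
        stripe = gray[:, left:left + stripe_width]
        flags.append(count_dark_runs(stripe.mean(axis=1)) >= min_rows)
    return flags

def merge_stripes(flags, stripe_width, image_width):
    """Merge adjacent flagged stripes into (left, right) column ranges."""
    regions, start = [], None
    for i, flag in enumerate(list(flags) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            regions.append((start * stripe_width,
                            min(i * stripe_width, image_width)))
            start = None
    return regions

# Demo: a 300 px wide page with a "paragraph" in columns 100..249.
page = np.full((60, 300), 255.0)
for top in (10, 30, 50):
    page[top:top + 6, 100:250] = 0.0

flags = text_stripes(page)
regions = merge_stripes(flags, 50, page.shape[1])
print(flags, regions)
```

The same trick can be repeated on horizontal bands to bound the paragraphs vertically.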

And finally, you need to use the text-map to run the OCR on the desired locations only.
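A sketch of that last step, assuming the text map is a list of `(top, left, bottom, right)` boxes and that `ocr` is whatever callable wraps your OCR engine. Both are assumptions; `fake_ocr` below is a stand-in so the example runs without an OCR library.

```python
import numpy as np

def ocr_regions(gray, boxes, ocr):
    """Crop each (top, left, bottom, right) box out of the image
    and hand only that crop to the supplied `ocr` callable."""
    gray = np.asarray(gray)
    results = []
    for top, left, bottom, right in boxes:
        crop = gray[top:bottom, left:right]
        results.append(((top, left, bottom, right), ocr(crop)))
    return results

def fake_ocr(crop):
    # Stand-in for a real OCR call: just reports the crop size.
    return "crop of %dx%d px" % (crop.shape[1], crop.shape[0])

page = np.full((60, 300), 255, dtype=np.uint8)
results = ocr_regions(page, [(0, 100, 60, 250)], fake_ocr)
print(results)
```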

On the other hand, this method would fail if the input text is rotated by more than ~3-5 degrees. There's another drawback: if you have only a few rows, the pattern search will be very unreliable. More rows, more accuracy...

Regards, G.


Take a look at this bounding box technique demonstrated with OpenCV code:

Input: [image]

Eroded: [image]

Result: [image]