Image preprocessing for text recognition

Try morphological image processing. Have a look at this. However, it works only on binary images - so you will have to binarize the image( threshold?). Although, it is simple, it is dependent on font size, so one structure element will not work for all font sizes. If you want a generic solution, there are a number of papers for text detection in images - A search of this term in google scholar should provide you with some useful publications.


There's nothing like the best set. Keep in mind that digital images can be acquired by different capture devices and each device can embed its own preprocessing system (filters) and other characteristics that can drastically change the image and even add noises to them. So every case would have to be treated (preprocessed) differently.

However, there are commmon operations that can be used to improve the detection, for instance, a very basic one would be to convert the image to grayscale and apply a threshold to binarize the image. Another technique I've used before is the bounding box, which allows you to detect the text region. To remove noises from images you might be interested in erode/dilate operations. I demonstrate some of these operations on this post.

Also, there are other interesting posts about OCR and OpenCV that you should take a look:

  • Simple Digit Recognition OCR in OpenCV-Python
  • Basic OCR in OpenCV

Now, just to show you a simple approach that can be used with your sample image, this is the result of inverting the color and applying a threshold:

cv::Mat new_img = cv::imread(argv[1]);
cv::bitwise_not(new_img, new_img);

double thres = 100;
double color = 255;
cv::threshold(new_img, new_img, thres, color, CV_THRESH_BINARY);

cv::imwrite("inv_thres.png", new_img);