Convert image to text

Several OCR readers for Linux can convert an image to text. Consider the following options:

  • GOCR: Wikipedia page
  • Ocrad: Wikipedia page
  • ocropus: Wikipedia page
  • tesseract-ocr: Wikipedia page

All of the above except ocropus are available in the Ubuntu repositories, each in a package of the same name.

Different readers support different image formats, so the file format of your document may limit which readers you can use. If you want to use a particular OCR reader, you can first change the format with the convert tool from ImageMagick.
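As a rough sketch of that convert-then-read workflow, the Python below re-encodes an image with convert and then passes it to tesseract. It assumes both tools are installed and on your PATH, that PNG is a format your tesseract build accepts, and that ImageMagick is invoked as convert (newer releases may prefer magick). The function names and the file name in the usage note are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def convert_command(src, dst):
    """Build the ImageMagick command that re-encodes src in dst's format."""
    # convert infers the output format from the destination file extension.
    return ["convert", str(src), str(dst)]


def ocr_command(image):
    """Build a tesseract command that writes the recognized text to stdout."""
    # Passing "stdout" as the output base makes tesseract print the text
    # instead of writing it to a .txt file.
    return ["tesseract", str(image), "stdout"]


def image_to_text(src, fmt="png"):
    """Convert src to fmt if needed, run OCR on it, and return the text."""
    for tool in ("convert", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    src = Path(src)
    dst = src.with_suffix("." + fmt)
    if dst != src:
        # Only re-encode when the source is not already in the target format.
        subprocess.run(convert_command(src, dst), check=True)
    result = subprocess.run(
        ocr_command(dst), check=True, capture_output=True, text=True
    )
    return result.stdout
```

For example, print(image_to_text("scan.jpg")) would write scan.png next to the original and print whatever text tesseract recognizes in it. The other readers can be substituted by changing ocr_command, after checking their own command-line options.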

Adapted from my answer here.