Convert image-based subtitles to text-based subtitles inside an MKV file

Converting image-based subtitles to text is a nontrivial process: you need some kind of OCR system to interpret the bitmaps and work out the corresponding text. ffmpeg alone will not do that for you.

I am not aware of any app for Linux/UNIX that will do the whole process in one go. However, this process should work:

  • Extract the subtitles with mkvextract or ffmpeg
  • Convert the PGS subtitles to VobSub (DVD .sub/.idx) format with BDSup2Sub
  • OCR the subtitles into SRT format with VobSub2SRT
  • Mux the SRT subtitles back into an MKV file with mkvmerge or ffmpeg
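
The four steps above could be chained together roughly as in the sketch below. This is only an illustration, not a tested tool. The file names, the track ID, the helper names build_commands and run_pipeline, and the location of BDSup2Sub.jar are all made up for the example. The invocations are my best understanding of each tool's command line (in particular the BDSup2Sub "-o <output> <input>" form and the "mkvextract tracks <file>" argument order), so check them against the --help output of the versions you have installed. You can find the subtitle track ID with "mkvmerge -i movie.mkv".

```python
import shlex
import subprocess
from pathlib import Path


def build_commands(src, track_id, out, lang="en", bdsup2sub_jar="BDSup2Sub.jar"):
    """Return the four pipeline commands as argument lists.

    All paths and the jar location are hypothetical; adjust them to your setup.
    """
    src = Path(src)
    base = src.with_suffix("")  # movie.mkv -> movie (only the last suffix is dropped)
    sup = f"{base}.sup"         # extracted PGS stream
    sub = f"{base}.sub"         # VobSub output (BDSup2Sub also writes the .idx)
    srt = f"{base}.srt"         # vobsub2srt writes this next to the .idx/.sub
    return [
        # 1. Extract the PGS subtitle track from the MKV.
        ["mkvextract", "tracks", str(src), f"{track_id}:{sup}"],
        # 2. Convert PGS to VobSub (assumed CLI form: -o <output> <input>).
        ["java", "-jar", bdsup2sub_jar, "-o", sub, sup],
        # 3. OCR the VobSub pair; vobsub2srt takes the name without extension.
        ["vobsub2srt", "--lang", lang, str(base)],
        # 4. Mux the original file plus the new SRT into a new MKV.
        ["mkvmerge", "-o", str(out), str(src), srt],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


if __name__ == "__main__":
    # Dry run: only prints the commands, nothing is executed.
    run_pipeline(build_commands("movie.mkv", 2, "movie-with-srt.mkv"))
```

With dry_run left at True, the script only prints the commands so you can inspect or copy them. OCR output usually needs proofreading, so it is worth opening the resulting .srt in an editor before running the final mux step.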