Can we use YOLO to detect and recognize text in an image?

I have a similar question, and I am building a digit detection model with the SVHN dataset. It is not finished yet, but it seems to work well. You can see the code at Yolo-digit-detector.


If you use the pretrained model, you would need to save its outputs and feed those image regions into a separate recognizer: a character recognition network, if you are using a neural net, or some other approach.
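That two-stage idea (detect, then recognize) could be sketched roughly as follows. This is only an illustration, not code from any particular YOLO implementation: the box layout `(x_min, y_min, x_max, y_max, score)`, the helper names `crop_detections` and `read_text`, the `min_score` threshold, and the `recognize` callable are all assumptions, so you would have to adapt them to whatever your detector actually outputs.

```python
import numpy as np


def crop_detections(image, boxes, min_score=0.5):
    """Crop detected regions out of an image, ordered left to right.

    Assumed (hypothetical) box layout: (x_min, y_min, x_max, y_max, score)
    in pixel coordinates. Real detectors may use a different layout.
    """
    height, width = image.shape[:2]
    # Drop low-confidence detections, then sort by the left edge so the
    # crops come out in reading order for a single line of text.
    kept = sorted((b for b in boxes if b[4] >= min_score), key=lambda b: b[0])
    crops = []
    for x_min, y_min, x_max, y_max, _score in kept:
        # Clamp to the image bounds before slicing.
        x0, y0 = max(0, int(x_min)), max(0, int(y_min))
        x1, y1 = min(width, int(x_max)), min(height, int(y_max))
        if x1 > x0 and y1 > y0:
            crops.append(image[y0:y1, x0:x1])
    return crops


def read_text(image, boxes, recognize, min_score=0.5):
    """Run a character recognizer on every crop and join the results.

    `recognize` is a placeholder for your own model: any callable that
    takes an image crop and returns a string.
    """
    crops = crop_detections(image, boxes, min_score)
    return "".join(recognize(crop) for crop in crops)
```

Here `recognize` could wrap a small CNN classifier or any other OCR step. Sorting by the left edge only makes sense for a single horizontal line of text; multi-line text would need some grouping by row first.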

What you are doing is "scene text recognition". You can check out the paper "Reading Text in the Wild with Convolutional Neural Networks"; here's a demo and the homepage. GitHub user chongyangtao has a whole list of resources on the topic.