Improve real-life results of neural network trained with mnist dataset

Update:

You have three options to achieve better performance on this particular task:

  1. Use a convolutional network, as it performs better on tasks with spatial data, like images, and tends to be the stronger classifier for problems like this one.
  2. Create or generate more pictures of your digit shapes and train your network with them, so the network can learn those shapes too.
  3. Preprocess your images so they are better aligned with the original MNIST images, against which you trained your network before.
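
For option 1, a minimal convolutional network could look like the sketch below. This is an assumed architecture for illustration, not the exact network used for the results in this answer:

```python
# A minimal CNN sketch in Keras: two conv/pool blocks followed by a
# dense softmax classifier over the 10 digit classes.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),            # MNIST: 28x28 grayscale images
    layers.Conv2D(32, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(64, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),    # one probability per digit
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Trained the same way (e.g. `model.fit(x_train, y_train, epochs=12, batch_size=200)`), such a network typically reaches noticeably higher accuracy on MNIST than a dense network.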

I've just run an experiment. I inspected the MNIST images, one for each represented digit, then took your images and applied some of the preprocessing I proposed to you earlier:

1. Applied a threshold, but only downwards, eliminating the background noise, because the original MNIST data has a minimal threshold only for the blank background:

image[image < 0.1] = 0.

2. Surprisingly, the size of the digit inside the 28 x 28 image proved to be crucial, so I scaled the digit down inside the 28 x 28 image, i.e. left more padding around the digit.

3. I inverted the images, as the MNIST data from Keras is inverted as well (white digit on a black background).

image = ImageOps.invert(image)

4. Finally, I scaled the data, as we did during training as well:

image = image / 255.
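
Put together, the four steps can be sketched like this in plain NumPy. The `raw` array is an illustrative stand-in for one of your photos (dark digit on a light background, values 0-255), and `np.pad` is a crude stand-in for the rescaling in step 2; the order differs slightly from the list above, but the effect is the same:

```python
import numpy as np

def preprocess(raw, pad=4):
    """Invert, pad, scale and threshold an image to MNIST conventions."""
    img = 255.0 - raw.astype(np.float32)   # step 3: invert (MNIST is white-on-black)
    img = np.pad(img, pad)                 # step 2: more padding around the digit
    img = img / 255.0                      # step 4: scale to [0, 1]
    img[img < 0.1] = 0.0                   # step 1: clear the background noise
    return img

# Toy example: a 20x20 "photo" with a light background (value 250)
raw = np.full((20, 20), 250, dtype=np.uint8)
raw[5:15, 9:11] = 20                       # a dark vertical stroke as the "digit"
out = preprocess(raw)                      # 28x28, black background, bright stroke
```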

After the preprocessing I trained the model on the MNIST dataset with the parameters epochs=12, batch_size=200, and got these results:


Result: 1 with probabilities: 0.6844741106033325

result: 1 . probabilities: [2.0584749904628552e-07, 0.9875971674919128, 5.821426839247579e-06, 4.979299319529673e-07, 0.012240586802363396, 1.1566483948399764e-07, 2.382085284580171e-08, 0.00013023221981711686, 9.620113416985987e-08, 2.5273093342548236e-05]


Result: 6 with probabilities: 0.9221984148025513

result:  6 . probabilities:  [9.130864782491699e-05, 1.8290626258021803e-07, 0.00020504613348748535, 2.1564576968557958e-07, 0.0002401985548203811, 0.04510130733251572, 0.9221984148025513, 1.9014490248991933e-07, 0.03216308355331421, 3.323434683011328e-08]


Result: 7 with probabilities: 0.7105212807655334

result:  7 . probabilities:  [1.0372193770535887e-08, 7.988557626958936e-06, 0.00031014863634482026, 0.0056108818389475346, 2.434678014751057e-09, 3.2280522077599016e-07, 1.4190952857262573e-09, 0.9940618872642517, 1.612859932720312e-06, 7.102244126144797e-06]

Your number 9 was a bit tricky:


As I figured out, the model trained on the MNIST dataset picked up two main "features" of the digit 9: the upper and the lower part. An upper part with a nice round shape, as in your image, is not a 9 but mostly a 3 for a model trained against the MNIST dataset. The lower part of a 9 is mostly a straightened curve in the MNIST samples. So basically your perfectly shaped 9 will always be a 3 for your model because of the MNIST samples, unless you retrain the model with a sufficient number of samples of 9s shaped like yours. To check my thoughts, I ran a sub-experiment with 9s:

My 9 with a skewed upper part (mostly fine for a 9 per MNIST) but a slightly curly bottom (not fine for a 9 per MNIST):


Result: 9 with probabilities: 0.5365301370620728

My 9 with a skewed upper part (mostly fine for a 9 per MNIST) and a straight bottom (fine for a 9 per MNIST):


Result: 9 with probabilities: 0.923724353313446

Your 9 with the misinterpreted shape properties:


Result: 3 with probabilities: 0.8158268928527832

result:  3 . probabilities:  [9.367801249027252e-05, 3.9978775021154433e-05, 0.0001467708352720365, 0.8158268928527832, 0.0005801069783046842, 0.04391581565141678, 6.44062723154093e-08, 7.099170943547506e-06, 0.09051419794559479, 0.048875387758016586]


Finally, here is proof of the importance of the image scaling (padding), which I called crucial above:


Result: 3 with probabilities: 0.9845736622810364


Result: 9 with probabilities: 0.923724353313446

So we can see that the model picked up certain features which it always interprets and classifies as a 3 when the shape inside the image is oversized, i.e. when the padding is small.
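
For illustration, increasing the padding can be sketched as shrinking the digit content and re-embedding it centered in the 28 x 28 frame. The nearest-neighbour downscaling via slicing here is a crude stand-in for proper resizing:

```python
import numpy as np

def add_padding(image, factor=2):
    """Shrink the content by `factor` and center it in a frame of the same size."""
    small = image[::factor, ::factor]      # crude nearest-neighbour downscale
    out = np.zeros_like(image)
    r = (image.shape[0] - small.shape[0]) // 2
    c = (image.shape[1] - small.shape[1]) // 2
    out[r:r + small.shape[0], c:c + small.shape[1]] = small
    return out

img = np.zeros((28, 28), dtype=np.float32)
img[2:26, 2:26] = 1.0                      # oversized blob, low padding
padded = add_padding(img)                  # same blob, half size, more padding
```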

I think we can get better performance with a CNN, but the way of sampling and preprocessing is always crucial for getting the best performance in an ML task.

I hope it helps.

Update 2:

I found another issue, which I checked as well and which proved to be true: the placement of the digit inside the image is crucial too, which makes sense for this type of NN. A good example are the digits 7 and 9, which are placed off-center in the MNIST dataset, near the bottom of the image; they are classified harder, or falsely, if we place the new digit to classify in the center of the image. I checked the theory by shifting the 7s and 9s toward the bottom, leaving more space at the top of the image, and the result was almost 100% accuracy. As this is a spatial type of problem, I guess that with a CNN we could eliminate it more effectively. However, it would be better if MNIST were aligned to center, or we can do it programmatically to avoid the issue.
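
The shift described above can be sketched with a plain NumPy roll; the 3-row offset and the blob standing in for a digit are illustrative values:

```python
import numpy as np

def shift_down(image, rows=3):
    """Shift the image content down by `rows`, filling the top with zeros."""
    shifted = np.roll(image, rows, axis=0)
    shifted[:rows, :] = 0.0                # do not wrap the bottom rows to the top
    return shifted

img = np.zeros((28, 28), dtype=np.float32)
img[10:18, 12:16] = 1.0                    # a centered blob standing in for a 7 or 9
moved = shift_down(img, rows=3)            # same blob, placed nearer the bottom
```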