Clustering images using unsupervised Machine Learning

The simplest way to get good results is to break the problem down into two parts:

  1. Getting the features from the images: Using the raw pixels as features will give you poor results. Pass the images through a pre-trained CNN (several are available online), then use the output of the last convolutional layer (just before the fully connected layers) as the image features.
  2. Clustering the features: Once you have rich features for each image, you can cluster them (for example, with K-means).

I would recommend using existing implementations rather than writing your own: Keras for step 1 and scikit-learn for step 2.
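The two steps could be sketched roughly as follows. This is only an illustration under some assumptions: VGG16 with ImageNet weights stands in for "a pre-trained CNN", global average pooling is used to turn the last convolutional feature map into a flat vector, and the function names `extract_features` and `cluster_features` are made up for this example. The exact Keras import path may also differ between versions.

```python
import numpy as np
from sklearn.cluster import KMeans


def extract_features(image_paths, target_size=(224, 224)):
    """Return one feature vector per image from a pre-trained CNN (assumed: VGG16)."""
    # Imported here so the clustering step can be used without TensorFlow installed.
    from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
    from tensorflow.keras.utils import img_to_array, load_img

    # include_top=False drops the fully connected layers; pooling="avg"
    # averages the last convolutional feature map into a 512-d vector.
    model = VGG16(weights="imagenet", include_top=False, pooling="avg")

    batch = np.stack([
        img_to_array(load_img(path, target_size=target_size))
        for path in image_paths
    ])
    return model.predict(preprocess_input(batch))


def cluster_features(features, n_clusters, seed=0):
    """Assign each feature vector to one of n_clusters K-means clusters."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return kmeans.fit_predict(features)
```

With a list of image files, usage would look like `labels = cluster_features(extract_features(paths), n_clusters=5)`, where the number of clusters is something you have to choose yourself.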


Label a few examples, and use classification.

Clustering is just as likely to give you clusters such as "images with a blueish tint", "grayscale scans", and "warm color temperature". That is quite a reasonable way to cluster such images.

Furthermore, k-means is very sensitive to outliers, and you probably have some in your data.
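A small sketch of the outlier problem, using made-up one-dimensional "features": two tight groups plus a single extreme value. With k = 2, you would hope to recover the two groups, but the outlier takes a cluster for itself and the two real groups get merged.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two tight groups of 1-D values plus one extreme outlier (all values made up).
group_a = np.linspace(0.0, 1.0, 20)
group_b = np.linspace(5.0, 6.0, 20)
outlier = np.array([1000.0])
X = np.concatenate([group_a, group_b, outlier]).reshape(-1, 1)

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Collect the distinct cluster labels seen in each part of the data.
labels_a = set(labels[:20].tolist())
labels_b = set(labels[20:40].tolist())
label_outlier = int(labels[40])

print("group A labels:", labels_a)
print("group B labels:", labels_b)
print("outlier label: ", label_outlier)
```

Both groups print the same single label, and the outlier prints the other one.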

Since you want your clusters to correspond to certain human concepts, classification is what you need.
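As a rough sketch of what "label a few examples" could look like in scikit-learn: the feature vectors below are synthetic stand-ins for real image features, the two "concepts" are invented, and logistic regression is just one possible choice of classifier. Only ten examples per concept are treated as hand-labeled.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in data: 200 "images" as 16-d feature vectors from two made-up concepts.
features = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(100, 16)),
    rng.normal(loc=3.0, scale=1.0, size=(100, 16)),
])
true_concept = np.array([0] * 100 + [1] * 100)  # unknown in a real setting

# Pretend only ten images per concept were labeled by hand.
labeled_idx = np.concatenate([np.arange(10), np.arange(100, 110)])

clf = LogisticRegression(max_iter=1000)
clf.fit(features[labeled_idx], true_concept[labeled_idx])

# Predict a concept for every image, labeled or not.
predicted = clf.predict(features)
accuracy = (predicted == true_concept).mean()
print(f"accuracy on all images: {accuracy:.2f}")
```

Real image data will be far less cleanly separated than this toy example, so the number of labels you need depends on your images and on how good the features are.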