Tensorflow return similar images

Tensorflow now has a nice tutorial on how to get the activations before the final layer and retrain a new classification layer with different categories: https://www.tensorflow.org/versions/master/how_tos/image_retraining/

The example code: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/image_retraining/retrain.py

In your case, yes, you can get the activations from pool_3 the layer below the softmax layer (or the so-called bottlenecks) and send them to other operations as input:

retrain imagenet

Finally, about finding similar images, I don't think imagenet's bottleneck activations are very pertinent representation for image search. You could consider to use an autoencoder network with direct image inputs.

autoencoder
(source: deeplearning4j.org)


I think I found an answer to my question:

In the file classify_image.py that classifies the image using the pre trained model (NN + classifier), I made the below mentioned changes (statements with #ADDED written next to them):

def run_inference_on_image(image):
  """Runs inference on an image.
  Args:
    image: Image file name.
  Returns:
    Nothing
  """
  if not gfile.Exists(image):
    tf.logging.fatal('File does not exist %s', image)
  image_data = gfile.FastGFile(image, 'rb').read()

  # Creates graph from saved GraphDef.
  create_graph()

with tf.Session() as sess:
 # Some useful tensors:
 # 'softmax:0': A tensor containing the normalized prediction across
 #   1000 labels.
 # 'pool_3:0': A tensor containing the next-to-last layer containing 2048
 #   float description of the image.
 # 'DecodeJpeg/contents:0': A tensor containing a string providing JPEG
 #   encoding of the image.
 # Runs the softmax tensor by feeding the image_data as input to the graph.
 softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')
 feature_tensor = sess.graph.get_tensor_by_name('pool_3:0') #ADDED
 predictions = sess.run(softmax_tensor,
                        {'DecodeJpeg/contents:0': image_data})
 predictions = np.squeeze(predictions)
 feature_set = sess.run(feature_tensor,
                        {'DecodeJpeg/contents:0': image_data}) #ADDED
 feature_set = np.squeeze(feature_set) #ADDED
 print(feature_set) #ADDED
 # Creates node ID --> English string lookup.
 node_lookup = NodeLookup()

 top_k = predictions.argsort()[-FLAGS.num_top_predictions:][::-1]
 for node_id in top_k:
   human_string = node_lookup.id_to_string(node_id)
   score = predictions[node_id]
   print('%s (score = %.5f)' % (human_string, score))

I ran the pool_3:0 tensor by feeding in the image_data to it. Please let me know if I am doing a mistake. If this is correct, I believe we can use this tensor for further calculations.


Your problem sounds similar to this visual search project