Measure similarity between two documents using Doc2Vec

In case anyone is interested: to do this, you just need the cosine distance between the two document vectors.

I found that most people use the 'spatial' module from scipy for this purpose.

Here is a small code snippet that should work pretty well if you already have a trained doc2vec model:

from gensim.models import doc2vec
from scipy import spatial

# model_file is the path to your previously trained and saved Doc2Vec model
d2v_model = doc2vec.Doc2Vec.load(model_file)

first_text = '..'
second_text = '..'

# infer_vector expects a list of tokens, so split each text on whitespace
vec1 = d2v_model.infer_vector(first_text.split())
vec2 = d2v_model.infer_vector(second_text.split())

cos_distance = spatial.distance.cosine(vec1, vec2)
# cos_distance indicates how much the two texts differ from each other:
# higher values mean more distant (i.e. different) texts
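
If it helps to see what that distance actually is, here is a rough pure-Python sketch of the same quantity, 1 - (u . v) / (|u| * |v|). The function name cosine_distance and the toy vectors are made up for illustration; they stand in for the vectors you would get from infer_vector, and this is not how scipy implements it internally.

```python
import math

def cosine_distance(u, v):
    """Cosine distance: 1 - (u . v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Toy vectors (hypothetical values, standing in for inferred document vectors)
a = [1.0, 0.0]
b = [0.0, 1.0]
c = [2.0, 0.0]

dist_ab = cosine_distance(a, b)  # orthogonal vectors
dist_ac = cosine_distance(a, c)  # same direction, different length

# If you want a similarity instead of a distance, subtract from 1
similarity_ac = 1.0 - dist_ac
```

The distance ranges from 0 (vectors pointing the same way) to 2 (vectors pointing in opposite directions), and it ignores vector length, so only the direction of the two document vectors matters.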