python tsne.transform does not exist?

Judging by the documentation of sklearn, TSNE simply does not have any transform method.
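
A quick way to confirm this on your own installation is to inspect the classes directly. This is only a sketch; the variable names are illustrative, and PCA is included purely as a contrast.

```python
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Check which estimators expose a standalone transform method.
tsne_has_transform = hasattr(TSNE, "transform")
tsne_has_fit_transform = hasattr(TSNE, "fit_transform")
pca_has_transform = hasattr(PCA, "transform")

print("TSNE.transform:", tsne_has_transform)
print("TSNE.fit_transform:", tsne_has_fit_transform)
print("PCA.transform:", pca_has_transform)
```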

Also, TSNE is an unsupervised method for dimensionality reduction/visualization, so it does not really work with a train/test split. You simply take all of your data, call fit_transform to get the embedding, and plot it.
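
A minimal sketch of that usage, assuming random placeholder data in place of your own feature matrix (the array `X` and the parameter values here are illustrative, not recommendations):

```python
import numpy as np
from sklearn.manifold import TSNE

# Placeholder data: 60 samples with 10 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))

# perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, random_state=0)

# Fit and embed in one step; there is no separate transform call.
embedding = tsne.fit_transform(X)
print(embedding.shape)
```

The two columns of `embedding` can then be passed to any scatter-plot routine.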

EDIT - It is actually not possible to learn a transformation and reuse it on different data (i.e. train and test), because t-SNE does not learn a mapping function to a lower-dimensional space. Instead, it runs an iterative optimization over the embedded points themselves, looking for an equilibrium that minimizes a loss (the KL divergence between pairwise similarities in the original and embedded spaces) on the specific data it was given.

Therefore, if you want to preprocess and reduce the dimensionality of both a train and a test dataset, the way to go is PCA/SVD or autoencoders. t-SNE will only help you with unsupervised tasks :)
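
For comparison, here is a rough sketch of the PCA route, where the projection is learned on the training split and then reused on the test split. The data is random placeholder data and the split sizes are arbitrary assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

# Placeholder data: 100 samples with 10 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
X_train, X_test = train_test_split(X, test_size=0.25, random_state=0)

pca = PCA(n_components=2)

# Learn the projection from the training data only.
X_train_2d = pca.fit_transform(X_train)

# Reuse the same learned projection on unseen data.
X_test_2d = pca.transform(X_test)

print(X_train_2d.shape, X_test_2d.shape)
```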


As the accepted answer says, there is no separate transform method, and it probably wouldn't work in a train/test setting.

However, you can still use TSNE without information leakage.

Training time: Calculate the TSNE embedding per record on the training set and use it as a feature in the classification algorithm.

Testing time: Append your testing data to your training data and fit_transform the TSNE on the combined set. Then continue processing your test set, using the TSNE embedding as a feature on those records.

Does this cause information leakage? No.

Inference time: New records arrive, e.g. as images or table rows. Add the new row(s) to the training table and calculate the TSNE embedding (i.e. where the new sample sits in the space relative to your training samples). Perform any other processing and run your prediction against the row.
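
The append-and-refit step might look roughly like this. `embed_with_training` is a hypothetical helper name, the data is random placeholder data, and the parameter values are assumptions rather than tuned settings.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_with_training(X_train, X_new, perplexity=10, random_state=0):
    """Hypothetical helper: embed new rows together with the training rows."""
    # Stack the new rows underneath the training rows.
    X_all = np.vstack([X_train, X_new])

    # Refit on the combined table. This recomputes the whole embedding,
    # so training-point coordinates may differ from an earlier fit.
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=random_state)
    emb_all = tsne.fit_transform(X_all)

    # Split the result back into training and new rows.
    n_train = len(X_train)
    return emb_all[:n_train], emb_all[n_train:]


rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 8))  # placeholder training table
X_new = rng.normal(size=(3, 8))     # placeholder incoming rows

emb_train, emb_new = embed_with_training(X_train, X_new)
print(emb_train.shape, emb_new.shape)
```

The rows of `emb_new` are the TSNE features for the incoming records, positioned relative to the training samples in the same fit.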

It works fine. Sometimes we worry too much about the train/test split because of Kaggle etc. The main question is whether your method can be replicated at inference time with the same expected accuracy in live use. In this case, yes it can!

The only drawback is that you need your training database available at inference time, and depending on its size, the preprocessing might be costly.