What does the KNN algorithm do in the training phase?

During the training phase, a KNN implementation can arrange the data into an index (for example, a KD-tree or ball tree) so that the closest neighbors can be found efficiently during inference. Without such an index, it has to compare each new case with every point in the training set, which becomes inefficient for large datasets.

You can read more about it at: https://scikit-learn.org/stable/modules/neighbors.html#nearest-neighbor-algorithms
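As a rough illustration of the difference, here is a small sketch using scikit-learn's `KNeighborsClassifier`, whose `algorithm` parameter selects between a tree index and brute-force search. The synthetic data, the variable names, and the choice of `n_neighbors=3` are assumptions made up for this example.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Synthetic toy data (an assumption for illustration only).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
X_new = rng.normal(size=(5, 2))

# algorithm="kd_tree": fit() builds a KD-tree index over the training points.
tree_knn = KNeighborsClassifier(n_neighbors=3, algorithm="kd_tree")
tree_knn.fit(X_train, y_train)

# algorithm="brute": fit() essentially stores the data; distances to every
# training point are computed at query time.
brute_knn = KNeighborsClassifier(n_neighbors=3, algorithm="brute")
brute_knn.fit(X_train, y_train)

# Both strategies should find the same neighbors; only the search cost differs.
tree_dist, tree_idx = tree_knn.kneighbors(X_new)
brute_dist, brute_idx = brute_knn.kneighbors(X_new)
same_neighbors = np.allclose(tree_dist, brute_dist)
same_predictions = np.array_equal(tree_knn.predict(X_new), brute_knn.predict(X_new))
print(same_neighbors, same_predictions)
```

The index changes how quickly the neighbors are found, not which neighbors are found, so the predictions should agree either way.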


KNN belongs to the group of lazy learners. As opposed to eager learners such as logistic regression, SVMs, and neural nets, lazy learners just store the training data in memory. Then, during inference, KNN finds the K nearest neighbours in the training data in order to classify the new instance.


KNN is an instance-based method that relies entirely on the training examples; in other words, it memorizes all of them. For classification, whenever a new example arrives, it computes the distance (typically Euclidean) between the input example and all the training examples, and returns the most common label among the K closest training examples (for K = 1, simply the label of the closest one).
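The brute-force procedure described here can be sketched in a few lines of plain Python. This is only a minimal illustration: the function name `knn_predict` and the toy points and labels are hypothetical, and ties in the vote are resolved by whatever `Counter.most_common` returns first.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # "Training" is just having the data available; all work happens here.
    # Compute the Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance and keep the labels of the k closest points.
    distances.sort(key=lambda pair: pair[0])
    k_labels = [label for _, label in distances[:k]]
    # Return the most common label among those k neighbors.
    return Counter(k_labels).most_common(1)[0][0]


# Toy data (hypothetical): two well-separated clusters.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.0), (6.5, 8.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(train_points, train_labels, (2.0, 2.0), k=3)
print(prediction)
```

Every query scans the whole training set, which is why implementations often build an index when the dataset is large.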