Even though one is used in a supervised setting, while the other in an unsupervised manner, to me they sometimes feel like part of the same family of algorithms as they both use similar distance metrics. Mathematically speaking, what are the underlying differences between KMeans Clustering and KNearest Neighbours?

To appreciate the difference, let’s appreciate the similarities:

- Both of them will look for k neighbours of a data point
- The value of k is deterministic

K means clustering will group a data point to a certain class based on the class of it’s k nearest neighbours. It’s like being asked to join a group of people if you have k=4 common interest with them.

K Nearest Neighbours will predict that a data point belongs to a certain class given that it has k neighbours of a certain class. It’s like someone guessing that you are already friends with a bunch of people given that you have k=4 common interests with them.

In KNN, the value of k is learned to optimally classify the data. In K Means Clustering, the value of k is adjusted for optimal clustering.