Core Algorithms7 min read

K-Nearest Neighbours (KNN)

Ask your closest neighbours, go with the majority

training:O(1) — literally stores the data, nothing elseprediction:O(n * d) — compares against every stored pointmemory:O(n * d) — keeps entire dataset in memory

You just moved to a new city and you're starving. You don't know any restaurants, so you knock on the doors of your 5 nearest neighbours and ask each one: "What's the best restaurant around here?"

Three of them say "Mario's Pizza." One says "Thai Palace." One says "Burger Barn." You go with the majority — Mario's Pizza it is.

That's KNN. When it needs to classify a new data point, it looks at the K closest data points in the training set and lets them vote. The majority class wins.

No training? Really?

Here's what's wild about KNN: there is no training step. The model just memorizes the entire dataset. When a new point arrives, it calculates the distance to every single stored point, finds the K closest ones, and takes a vote.

This makes KNN a "lazy learner" — it doesn't learn an equation or build a tree ahead of time. It does all the work at prediction time.

How do we measure "closeness"?

The most common way is Euclidean distance — the straight-line distance between two points. Imagine two houses plotted on a map. The distance between them is just the length of a line drawn from one to the other.

For two features (x1, y1) and (x2, y2):

distance = sqrt((x2 - x1)^2 + (y2 - y1)^2)

It's the Pythagorean theorem, dressed up for data science.

Choosing K: the magic number

The choice of K changes everything:

K = 1: Just copy the nearest neighbour. Very sensitive to noise — one weird data point can throw everything off.
K = 3 or 5: The sweet spot for many problems. Smooths out noise while staying responsive.
K = 100: Very smooth predictions, but might miss local patterns. Like asking 100 neighbours — you get the city's average opinion, not your block's opinion.

Pro tip: always use an odd K for binary classification. If K = 4 and you get a 2-2 tie... awkward.

KNN from Scratch

import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, new_point, k=3):
    # Calculate distance to every training point
    distances = []
    for i, x in enumerate(train_X):
        d = euclidean(x, new_point)
        distances.append((d, train_y[i]))
    
    # Sort by distance and pick K nearest
    distances.sort()
    k_nearest = [label for _, label in distances[:k]]
    
    # Majority vote
    vote = Counter(k_nearest).most_common(1)[0][0]
    return vote

# Fruit data: [weight_g, sweetness]
train_X = [[150, 7], [170, 8], [130, 3], [140, 2], [180, 9], [120, 4]]
train_y = ['apple', 'apple', 'lemon', 'lemon', 'apple', 'lemon']

mystery = [155, 6]
print(f"Mystery fruit: {knn_predict(train_X, train_y, mystery, k=3)}")

Output

Mystery fruit: apple

Note: KNN is heavily affected by feature scale. If one feature ranges from 0-1000 (like salary) and another from 0-5 (like rating), the salary will dominate distance calculations. Always normalize your features before using KNN!