KNN Classification: A Complete Solved Numerical Example


Scenario: E-Commerce Purchase Prediction

The Objective: Predict whether a new website visitor will make a purchase by mathematically comparing their session behavior to your database of past customers.

Core Mechanics


The Lazy Learner: KNN does zero computation at training time; it simply memorizes the data. Because it has no model to "learn," a plain (brute-force) KNN must scan the entire dataset for every single prediction you ask it to make.
Feature Sensitivity: KNN relies entirely on distance. If your features have different scales (e.g., age vs. income), the large-scale feature will completely dominate the calculation. Always normalize your data first!
The Curse of Dimensionality: As you add more features, the concept of "nearest" starts to break down because points become sparse. In high-dimensional spaces, the distances from a point to its nearest and farthest neighbors become nearly equal, so distance carries less information and KNN accuracy tends to drop.
The $K$ Bias/Variance Tradeoff: Small $K$ values overfit to local noise (high variance). Large $K$ values smooth over real boundaries (high bias). For a two-class problem like this one, pick an odd $K$ so the vote cannot tie; with three or more classes, an odd $K$ does not rule out ties.
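The Feature Sensitivity point can be illustrated with a minimal min-max scaling sketch. The function name `min_max_scale` and the age/income rows are hypothetical and are not taken from this page's dataset; min-max scaling is only one of several common normalization choices.

```python
def min_max_scale(rows):
    """Rescale every column of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column has no spread, so map it to 0.0.
            (value - lo) / (hi - lo) if hi != lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical rows: [age in years, income in dollars].
raw = [[25, 40000], [35, 60000], [45, 100000]]
scaled = min_max_scale(raw)
# After scaling, both features span 0..1, so income no longer
# dominates the distance just because its numbers are larger.
```

The worked example that follows uses the raw, unscaled values from the table.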

Step 1: The Historical Data & Target Point

To predict a categorical class, KNN Classification looks at the most similar historical data points. We are predicting the Purchased class for a new target point with features [Pages_Visited = 5, Time_on_Site_mins = 12], using K = 3.

Data Point	Pages_Visited	Time_on_Site_mins	Purchased
P1	2	5	No
P2	3	8	No
P3	5	15	Yes
P4	7	20	Yes
P5	6	18	Yes
P6	1	3	No
P7	4	10	No
P8	8	25	Yes
Target	5	12	?


Step 2: Calculate Euclidean Distances

First, we measure exactly how "far" our target point is from every single historical row using the Euclidean distance formula.

Formula:

d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + \dots}
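The formula above can be written as a small helper that works for any number of features. This is a sketch; the name `euclidean_distance` is an assumption chosen for illustration.

```python
import math


def euclidean_distance(a, b):
    """Return the straight-line distance between two equal-length points."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of features")
    # Sum the squared differences feature by feature, then take the root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Target [5, 12] versus row P3 [5, 15] from the table in Step 1.
d = euclidean_distance([5, 12], [5, 15])  # 3.0
```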

Distance to Row P1 (Target Class = No)

d = √((5 - 2)² + (12 - 5)²)

d = √(9 + 49)

d = √58

d = 7.616

Distance to Row P2 (Target Class = No)

d = √((5 - 3)² + (12 - 8)²)

d = √(4 + 16)

d = √20

d = 4.472

Distance to Row P3 (Target Class = Yes)

d = √((5 - 5)² + (12 - 15)²)

d = √(0 + 9)

d = √9

d = 3

Distance to Row P4 (Target Class = Yes)

d = √((5 - 7)² + (12 - 20)²)

d = √(4 + 64)

d = √68

d = 8.246

Distance to Row P5 (Target Class = Yes)

d = √((5 - 6)² + (12 - 18)²)

d = √(1 + 36)

d = √37

d = 6.083

Distance to Row P6 (Target Class = No)

d = √((5 - 1)² + (12 - 3)²)

d = √(16 + 81)

d = √97

d = 9.849

Distance to Row P7 (Target Class = No)

d = √((5 - 4)² + (12 - 10)²)

d = √(1 + 4)

d = √5

d = 2.236

Distance to Row P8 (Target Class = Yes)

d = √((5 - 8)² + (12 - 25)²)

d = √(9 + 169)

d = √178

d = 13.342
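The eight hand calculations above can be reproduced with a short script. This is a sketch rather than the solver's actual code: the `history` dictionary layout and variable names are assumptions, while the values come from the table in Step 1. It relies on `math.dist`, available in Python 3.8 and later.

```python
import math

# Historical rows from Step 1: name -> ((Pages_Visited, Time_on_Site_mins), Purchased)
history = {
    "P1": ((2, 5), "No"),
    "P2": ((3, 8), "No"),
    "P3": ((5, 15), "Yes"),
    "P4": ((7, 20), "Yes"),
    "P5": ((6, 18), "Yes"),
    "P6": ((1, 3), "No"),
    "P7": ((4, 10), "No"),
    "P8": ((8, 25), "Yes"),
}
target = (5, 12)

# Euclidean distance from the target to every row, rounded to 3 decimals
# to match the hand-worked values.
distances = {
    name: round(math.dist(target, features), 3)
    for name, (features, _label) in history.items()
}

for name, dist in distances.items():
    print(f"{name}: {dist}")
```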

Step 3: Select the Top K Neighbors

We rearrange the calculated distances in ascending order (smallest to largest) and select the top K = 3 closest neighbors.

Rank	Point	Distance	Target Class
#1	P7	2.236	No
#2	P3	3.000	Yes
#3	P2	4.472	No
#4	P5	6.083	Yes
#5	P1	7.616	No
#6	P4	8.246	Yes
#7	P6	9.849	No
#8	P8	13.342	Yes
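The ranking step can be sketched as a sort followed by a slice. The tuple layout and the names `rows`, `ranked`, and `nearest` are assumptions; the distances are the rounded values computed in Step 2.

```python
# (point, distance to target, class) using the distances from Step 2.
rows = [
    ("P1", 7.616, "No"),
    ("P2", 4.472, "No"),
    ("P3", 3.000, "Yes"),
    ("P4", 8.246, "Yes"),
    ("P5", 6.083, "Yes"),
    ("P6", 9.849, "No"),
    ("P7", 2.236, "No"),
    ("P8", 13.342, "Yes"),
]
k = 3

# Sort ascending by distance, then keep the first k rows.
ranked = sorted(rows, key=lambda row: row[1])
nearest = ranked[:k]

nearest_names = [name for name, _dist, _label in nearest]  # ['P7', 'P3', 'P2']
```

Sorting the full list mirrors the table above. When only the top K rows are needed, `heapq.nsmallest(k, rows, key=lambda row: row[1])` from the standard library returns the same three rows without ordering the rest.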

Step 4: Final Classification Prediction

In KNN Classification, the final prediction is determined by a majority vote among the selected K neighbors.

Formula:

\text{Prediction} = \text{Mode}(c_1, c_2, \dots, c_k)

Count the Votes

Tallying the classes of the 3 nearest neighbors:

Class 'No': 2 votes

Class 'Yes': 1 vote

Prediction (Majority) = No
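The vote tally can be sketched with `collections.Counter` from the standard library. The variable names are assumptions; the labels are those of the three nearest neighbors (P7, P3, P2) selected in Step 3.

```python
from collections import Counter

# Classes of the K = 3 nearest neighbors, in rank order: P7, P3, P2.
neighbor_labels = ["No", "Yes", "No"]

votes = Counter(neighbor_labels)
# most_common(1) returns a one-element list holding the (label, count)
# pair with the highest count, which is the mode of the neighbor classes.
prediction, count = votes.most_common(1)[0]

print(dict(votes))   # {'No': 2, 'Yes': 1}
print(prediction)    # No
```

With two classes and an odd K there is always a strict majority, so no tie-breaking rule is needed in this sketch.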

Final Takeaway

Notice how KNN must calculate the distance to every single historical row (Step 2) and then sort those distances (Step 3) before making a decision, which is why prediction is considered computationally heavy on large datasets. In the final tally, even though the new visitor is highly similar to a 'Yes' customer (P3, the second-closest neighbor), the two 'No' neighbors (P7 and P2) outnumber it, which shows how a strict majority vote dictates the final prediction.