KNN Classification: A Complete Solved Numerical Example

Scenario: E-Commerce Purchase Prediction

The Objective: Predict whether a new website visitor will make a purchase based on their session behavior.

Step 1: The Historical Data & Target Point

To predict a categorical class, KNN Classification looks at the most similar historical data points. Here we predict the Purchased label for a new target point with features [Pages_Visited = 5, Time_on_Site_mins = 12], using K = 3.

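| Data Point | Pages_Visited | Time_on_Site_mins | Purchased |
|------------|---------------|-------------------|-----------|
| P1         | 2             | 5                 | No        |
| P2         | 3             | 8                 | No        |
| P3         | 5             | 15                | Yes       |
| P4         | 7             | 20                | Yes       |
| P5         | 6             | 18                | Yes       |
| P6         | 1             | 3                 | No        |
| P7         | 4             | 10                | No        |
| P8         | 8             | 25                | Yes       |
| Target     | 5             | 12                | ?         |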

Step 2: Calculate Euclidean Distances

First, we measure exactly how "far" our target point is from every single historical row using the Euclidean distance formula.

Formula: $d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + \dots}$

Distance to Row P1 (Target Class = No)

d = √((5 - 2)² + (12 - 5)²)

d = √(9 + 49)

d = √58

d = 7.6158

Distance to Row P2 (Target Class = No)

d = √((5 - 3)² + (12 - 8)²)

d = √(4 + 16)

d = √20

d = 4.4721

Distance to Row P3 (Target Class = Yes)

d = √((5 - 5)² + (12 - 15)²)

d = √(0 + 9)

d = √9

d = 3

Distance to Row P4 (Target Class = Yes)

d = √((5 - 7)² + (12 - 20)²)

d = √(4 + 64)

d = √68

d = 8.2462

Distance to Row P5 (Target Class = Yes)

d = √((5 - 6)² + (12 - 18)²)

d = √(1 + 36)

d = √37

d = 6.0828

Distance to Row P6 (Target Class = No)

d = √((5 - 1)² + (12 - 3)²)

d = √(16 + 81)

d = √97

d = 9.8489

Distance to Row P7 (Target Class = No)

d = √((5 - 4)² + (12 - 10)²)

d = √(1 + 4)

d = √5

d = 2.2361

Distance to Row P8 (Target Class = Yes)

d = √((5 - 8)² + (12 - 25)²)

d = √(9 + 169)

d = √178

d = 13.3417
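
The eight hand calculations above can be cross-checked with a few lines of Python. This is a minimal sketch, not a reference implementation: the names `history`, `target`, and `distances` are illustrative choices, and `math.dist` assumes Python 3.8 or newer.

```python
import math

# Historical rows from Step 1: (Pages_Visited, Time_on_Site_mins, Purchased)
history = {
    "P1": (2, 5, "No"),
    "P2": (3, 8, "No"),
    "P3": (5, 15, "Yes"),
    "P4": (7, 20, "Yes"),
    "P5": (6, 18, "Yes"),
    "P6": (1, 3, "No"),
    "P7": (4, 10, "No"),
    "P8": (8, 25, "Yes"),
}
target = (5, 12)

# math.dist returns the Euclidean distance between two points
distances = {
    name: math.dist(target, (pages, minutes))
    for name, (pages, minutes, _label) in history.items()
}

for name, d in distances.items():
    print(f"{name}: {d:.4f}")
```

Rounded to four decimal places, the printed values should agree with the distances worked out by hand above.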

Step 3: Select the Top K Neighbors

We rearrange the calculated distances in ascending order (smallest to largest) and select the top K = 3 closest neighbors.

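| Rank | Point | Distance | Target Class |
|------|-------|----------|--------------|
| #1   | P7    | 2.2361   | No           |
| #2   | P3    | 3.0000   | Yes          |
| #3   | P2    | 4.4721   | No           |
| #4   | P5    | 6.0828   | Yes          |
| #5   | P1    | 7.6158   | No           |
| #6   | P4    | 8.2462   | Yes          |
| #7   | P6    | 9.8489   | No           |
| #8   | P8    | 13.3417  | Yes          |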
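
The ranking step can be sketched in Python as a sort followed by a slice. The variable names (`rows`, `ranked`, `nearest`) are illustrative, and the distances are the rounded values from Step 2 rather than recomputed ones.

```python
# (point, distance, class) triples copied from Step 2, in original row order
rows = [
    ("P1", 7.6158, "No"),
    ("P2", 4.4721, "No"),
    ("P3", 3.0, "Yes"),
    ("P4", 8.2462, "Yes"),
    ("P5", 6.0828, "Yes"),
    ("P6", 9.8489, "No"),
    ("P7", 2.2361, "No"),
    ("P8", 13.3417, "Yes"),
]
k = 3

# Sort ascending by distance, then keep the first k rows
ranked = sorted(rows, key=lambda row: row[1])
nearest = ranked[:k]

for rank, (point, distance, label) in enumerate(nearest, start=1):
    print(f"#{rank} {point} {distance:.4f} {label}")
```

Only the first three rows of the sorted list take part in the vote; the remaining five are ignored.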

Step 4: Final Classification Prediction

In KNN Classification, the final prediction is determined by a majority vote among the selected K neighbors.

Formula: $\text{Prediction} = \text{Mode}(c_1, c_2, \dots, c_k)$

Count the Votes

Tallying the classes of the 3 nearest neighbors:

Class 'No': 2 votes

Class 'Yes': 1 vote

Prediction (Majority) = No
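
The vote tally can be expressed with `collections.Counter` from the Python standard library. This is a small sketch under the assumption that the three neighbor classes are already known from Step 3; `neighbor_classes` and `votes` are illustrative names.

```python
from collections import Counter

# Classes of the 3 nearest neighbors (P7, P3, P2) from Step 3
neighbor_classes = ["No", "Yes", "No"]

# Count how many neighbors fall into each class
votes = Counter(neighbor_classes)

# most_common(1) returns the single (class, count) pair with the highest count
prediction, count = votes.most_common(1)[0]

print(votes)
print(f"Prediction (Majority) = {prediction}")
```

With two classes and an odd K, a tie cannot occur, which is one reason odd values of K are a common choice for binary problems.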

Final Takeaway

Because 2 of the 3 most similar visitors did not make a purchase, the model predicts that this new visitor will not make a purchase either.