KNN Classification: A Complete Solved Numerical Example
Scenario: E-Commerce Purchase Prediction
The Objective: Predict whether a new website visitor will make a purchase based on their session behavior.
Step 1: The Historical Data & Target Point
To predict a categorical class, KNN Classification looks at the most similar historical data points. We are predicting the Purchased for a new target point with features: [5, 12] using K = 3.
| Data Point | Pages_Visited | Time_on_Site_mins | Purchased |
|---|---|---|---|
| P1 | 2 | 5 | No |
| P2 | 3 | 8 | No |
| P3 | 5 | 15 | Yes |
| P4 | 7 | 20 | Yes |
| P5 | 6 | 18 | Yes |
| P6 | 1 | 3 | No |
| P7 | 4 | 10 | No |
| P8 | 8 | 25 | Yes |
| Target | 5 | 12 | ? |
Step 2: Calculate Euclidean Distances
First, we measure exactly how "far" our target point is from every single historical row using the Euclidean distance formula.
d = √((5 - 2)² + (12 - 5)²)
d = √(9 + 49)
d = √58
d = 7.6158
d = √((5 - 3)² + (12 - 8)²)
d = √(4 + 16)
d = √20
d = 4.4721
d = √((5 - 5)² + (12 - 15)²)
d = √(0 + 9)
d = √9
d = 3
d = √((5 - 7)² + (12 - 20)²)
d = √(4 + 64)
d = √68
d = 8.2462
d = √((5 - 6)² + (12 - 18)²)
d = √(1 + 36)
d = √37
d = 6.0828
d = √((5 - 1)² + (12 - 3)²)
d = √(16 + 81)
d = √97
d = 9.8489
d = √((5 - 4)² + (12 - 10)²)
d = √(1 + 4)
d = √5
d = 2.2361
d = √((5 - 8)² + (12 - 25)²)
d = √(9 + 169)
d = √178
d = 13.3417
Step 3: Select the Top K Neighbors
We rearrange the calculated distances in ascending order (smallest to largest) and select the top K = 3 closest neighbors.
| Rank | Point | Distance | Target Class |
|---|---|---|---|
| #1 | P7 | 2.2361 | No |
| #2 | P3 | 3 | Yes |
| #3 | P2 | 4.4721 | No |
| #4 | P5 | 6.0828 | Yes |
| #5 | P1 | 7.6158 | No |
| #6 | P4 | 8.2462 | Yes |
| #7 | P6 | 9.8489 | No |
| #8 | P8 | 13.3417 | Yes |
Step 4: Final Classification Prediction
In KNN Classification, the final prediction is determined by a majority vote among the selected K neighbors.
Tallying the classes of the 3 nearest neighbors:
Class 'No': 2 votes
Class 'Yes': 1 vote
Prediction (Majority) = No
Final Takeaway
Because the majority of the 3 most similar visitors resulted in No, the model predicts that this new user will also follow that pattern.