K-Means Clustering: A Complete Solved Numerical Example

Scenario: Mobile App User Segmentation

The Objective: Segment mobile app users into distinct behavioral groups based on daily session count and average session duration.

Core Mechanics
  • Batch Assignment First: Assign every point to its nearest centroid before you do anything else. Do not move or update any centroid until every single point has been assigned.
  • Recenter to the Mean: Only after the full assignment is complete, shift each centroid to the exact center (the average) of its newly assigned points. This is the only time the centroids move.
  • Strict Iterative Sequence: Always follow the cycle: Assign → Update → Check. Mixing up this sequence is the fastest way to lose points on a trace table.
  • Convergence (The Exit Rule): You are finished when the assignments in the current iteration perfectly match the assignments from the previous one. If even one point changes its cluster, run it again!
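The four rules above can be sketched as one short loop. This is a minimal illustration rather than a reference implementation: the function name `kmeans`, the `max_iters` safety cap, and the choice to leave an empty cluster's centroid where it is are all assumptions made for this sketch. The sample call uses the eight points and two starting centroids from Step 1 of this example.

```python
import math

def kmeans(points, centroids, max_iters=100):
    """Assign -> Update -> Check, repeated until assignments stop changing.

    points and centroids are lists of (x, y) tuples.
    max_iters is a safety cap assumed for this sketch.
    """
    previous = None
    for iteration in range(1, max_iters + 1):
        # Assign: label every point before any centroid moves.
        # Ties go to the lower-numbered centroid (behaviour of min()).
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update: move each centroid to the mean of its assigned points.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # assumption: an empty cluster keeps its centroid
        centroids = updated
        # Check: finished when this pass matches the previous pass exactly.
        if assignments == previous:
            return assignments, centroids, iteration
        previous = assignments
    return assignments, centroids, max_iters

# Points P1..P8 and starting centroids (P1 and P4) from Step 1.
points = [(1, 5), (2, 6), (1, 4), (8, 20), (9, 22), (8, 19), (5, 12), (9, 25)]
labels, final_centroids, iterations = kmeans(points, [(1, 5), (8, 20)])
print(labels, final_centroids, iterations)
```

Cluster labels here are 0 and 1, standing for C1 and C2. `math.dist` requires Python 3.8 or later.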

Step 1: The Dataset & Initial Centroids

K-Means is an unsupervised learning algorithm, meaning there is no "Target" class to predict. Instead, it groups data points into K distinct clusters.

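| Data Point | Daily_Sessions | Avg_Session_Duration_mins |
|---|---|---|
| P1 | 1 | 5 |
| P2 | 2 | 6 |
| P3 | 1 | 4 |
| P4 | 8 | 20 |
| P5 | 9 | 22 |
| P6 | 8 | 19 |
| P7 | 5 | 12 |
| P8 | 9 | 25 |

Starting Centroids (Iteration 0)
C1 (P1): [1, 5]
C2 (P4): [8, 20]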

Step 2: The Iterative Process

The algorithm alternates between two steps until the clusters stop changing: assigning each point to its nearest centroid, and recalculating each centroid as the mean (the geometric center) of the points now assigned to it.

Iteration 1

Starting: C1 [1, 5], C2 [8, 20]
A. Calculate Euclidean Distances & Assign Clusters
Formula: $d = \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}$
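As a worked instance of the formula, take P7 = [5, 12], the point that sits closest to the boundary between the two starting centroids C1 = [1, 5] and C2 = [8, 20]:

```latex
\begin{aligned}
d(P_7, C_1) &= \sqrt{(5 - 1)^2 + (12 - 5)^2} = \sqrt{16 + 49} = \sqrt{65} \approx 8.062 \\
d(P_7, C_2) &= \sqrt{(5 - 8)^2 + (12 - 20)^2} = \sqrt{9 + 64} = \sqrt{73} \approx 8.544
\end{aligned}
```

Since 8.062 < 8.544, P7 is assigned to C1, though only by a narrow margin.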
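| Point | Coordinates | Dist to C1 | Dist to C2 | Assigned To |
|---|---|---|---|---|
| P1 | [1, 5] | 0 | 16.553 | C1 |
| P2 | [2, 6] | 1.414 | 15.232 | C1 |
| P3 | [1, 4] | 1 | 17.464 | C1 |
| P4 | [8, 20] | 16.553 | 0 | C2 |
| P5 | [9, 22] | 18.788 | 2.236 | C2 |
| P6 | [8, 19] | 15.652 | 1 | C2 |
| P7 | [5, 12] | 8.062 | 8.544 | C1 |
| P8 | [9, 25] | 21.541 | 5.099 | C2 |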
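The Iteration 1 distance table can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic by hand; the helper name `assign` and the dictionary layout are illustrative choices, not part of the original example.

```python
import math

points = {
    "P1": (1, 5), "P2": (2, 6), "P3": (1, 4), "P4": (8, 20),
    "P5": (9, 22), "P6": (8, 19), "P7": (5, 12), "P8": (9, 25),
}
centroids = {"C1": (1, 5), "C2": (8, 20)}  # Iteration 1 starting centroids

def assign(points, centroids):
    """Return {point_label: (distance_per_centroid, nearest_centroid_name)}."""
    rows = {}
    for label, p in points.items():
        dists = {name: math.dist(p, c) for name, c in centroids.items()}
        rows[label] = (dists, min(dists, key=dists.get))
    return rows

table = assign(points, centroids)
for label, (dists, nearest) in table.items():
    # One line per table row: point, distance to C1, distance to C2, assignment.
    print(label, round(dists["C1"], 3), round(dists["C2"], 3), nearest)
```

Swapping in the Iteration 2 centroids, [2.25, 6.75] and [8.5, 21.5], gives the second distance table in the same way.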
B. Calculate New Centroids (Means)
Formula: $C_{new} = \left( \dfrac{x_1 + x_2 + \dots + x_n}{N}, \dfrac{y_1 + y_2 + \dots + y_n}{N} \right)$
Cluster C1 (4 points)
Points: P1, P2, P3, P7
Dim 1: (1 + 2 + 1 + 5) / 4 = 2.25
Dim 2: (5 + 6 + 4 + 12) / 4 = 6.75

Cluster C2 (4 points)
Points: P4, P5, P6, P8
Dim 1: (8 + 9 + 8 + 9) / 4 = 8.5
Dim 2: (20 + 22 + 19 + 25) / 4 = 21.5
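The centroid update is a per-dimension average, which can be checked with a small sketch like the one below. The helper name `mean_point` is hypothetical; the cluster memberships are the ones assigned in Iteration 1.

```python
clusters = {
    "C1": [(1, 5), (2, 6), (1, 4), (5, 12)],     # P1, P2, P3, P7
    "C2": [(8, 20), (9, 22), (8, 19), (9, 25)],  # P4, P5, P6, P8
}

def mean_point(members):
    """Average each coordinate separately across the cluster's points."""
    n = len(members)
    return tuple(sum(axis) / n for axis in zip(*members))

new_centroids = {name: mean_point(members) for name, members in clusters.items()}
print(new_centroids)
```

`zip(*members)` regroups the points by dimension, so the same helper would work unchanged if a third feature were added.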

Iteration 2

Starting: C1 [2.25, 6.75], C2 [8.5, 21.5]
A. Calculate Euclidean Distances & Assign Clusters
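Formula: $d = \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}$

| Point | Coordinates | Dist to C1 | Dist to C2 | Assigned To |
|---|---|---|---|---|
| P1 | [1, 5] | 2.151 | 18.125 | C1 |
| P2 | [2, 6] | 0.791 | 16.808 | C1 |
| P3 | [1, 4] | 3.021 | 19.039 | C1 |
| P4 | [8, 20] | 14.444 | 1.581 | C2 |
| P5 | [9, 22] | 16.677 | 0.707 | C2 |
| P6 | [8, 19] | 13.532 | 2.55 | C2 |
| P7 | [5, 12] | 5.927 | 10.124 | C1 |
| P8 | [9, 25] | 19.458 | 3.536 | C2 |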
B. Calculate New Centroids (Means)
Formula: $C_{new} = \left( \dfrac{x_1 + x_2 + \dots + x_n}{N}, \dfrac{y_1 + y_2 + \dots + y_n}{N} \right)$

Cluster C1 (4 points)
Points: P1, P2, P3, P7
Dim 1: (1 + 2 + 1 + 5) / 4 = 2.25
Dim 2: (5 + 6 + 4 + 12) / 4 = 6.75

Cluster C2 (4 points)
Points: P4, P5, P6, P8
Dim 1: (8 + 9 + 8 + 9) / 4 = 8.5
Dim 2: (20 + 22 + 19 + 25) / 4 = 21.5

Step 3: Convergence & Final Result

The algorithm stops when the cluster assignments no longer change between iterations.

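The algorithm converged after 2 iterations: the assignments in Iteration 2 are identical to those in Iteration 1, so the recalculated centroids did not move.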

Final Cluster 1

Centroid: [2.25, 6.75]

Points: P1, P2, P3, P7

Final Cluster 2

Centroid: [8.5, 21.5]

Points: P4, P5, P6, P8

Final Takeaway

Notice how the shifted centroids in Iteration 2 changed every Euclidean distance, yet not a single point traded teams between C1 and C2! Because the cluster assignments stayed the same, the recalculated centroids did not move either, so the algorithm has converged. This illustrates the golden rule: K-Means stops based on group stability, not when distances reach zero.
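Both halves of this takeaway can be checked directly. The sketch below compares the centroids used in Iteration 1 with those used in Iteration 2; the helper names `labels` and `distances` are illustrative, and cluster indices 0 and 1 stand for C1 and C2.

```python
import math

points = [(1, 5), (2, 6), (1, 4), (8, 20), (9, 22), (8, 19), (5, 12), (9, 25)]
before = [(1, 5), (8, 20)]           # centroids used in Iteration 1
after = [(2.25, 6.75), (8.5, 21.5)]  # centroids used in Iteration 2

def labels(points, centroids):
    """Index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]

def distances(points, centroids):
    """One row per point, one distance per centroid."""
    return [[math.dist(p, c) for c in centroids] for p in points]

# Every individual distance differs between the two iterations...
every_distance_changed = all(
    d1 != d2
    for row1, row2 in zip(distances(points, before), distances(points, after))
    for d1, d2 in zip(row1, row2)
)
# ...yet the nearest-centroid labels are identical, which is the exit rule.
converged = labels(points, before) == labels(points, after)
print(every_distance_changed, converged)
```

The stopping test looks only at the labels; the distances themselves never need to reach any particular value.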