Random Forest: A Complete Solved Numerical Example
Scenario: Hospital Readmission Prediction
The Objective: Predict whether a discharged patient will return to the hospital by relying on the combined wisdom of multiple, randomly generated decision trees.
Core Mechanics
- The Wisdom of the Crowd: A Random Forest builds hundreds of independent trees, but gives each one a slightly different, randomly drawn dataset (sampled with replacement). As a result, a typical tree sees only part of the original data!
- Forced Diversity: At every single split, trees are only allowed to look at a random subset of features (usually about the square root of the total number of features for classification). This prevents one dominant feature from creating a forest of identical, repetitive trees.
- The Final Vote: To make a prediction, every tree evaluates the data independently. The forest simply takes the majority vote for Classification tasks, or the exact average (mean) for Regression tasks.
- More Trees, No Overfitting: Unlike growing a single decision tree deeper, adding more trees to a forest does not make it overfit more; the accuracy simply plateaus. The main penalty for adding more trees is slower computation!
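To put a rough number on the first point, here is a small worked calculation. It assumes each bootstrap sample draws n rows with replacement from n original rows (the usual setup, though the sample size is not stated explicitly here). Under that assumption, the chance that a particular row is left out of one tree's sample is:

$$
P(\text{row omitted}) = \left(1 - \frac{1}{n}\right)^{n}, \qquad \text{for } n = 8:\ \left(\frac{7}{8}\right)^{8} \approx 0.344
$$

So with eight rows, a tree misses roughly a third of the distinct rows on average. As n grows, this fraction approaches 1/e, about 0.368.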
Step 1: The Historical Data & Target Point
Random Forest improves upon a single Decision Tree by building an "ensemble" (a collection) of multiple trees. We will use the eight historical records below to predict the Readmitted status for our target patient (the last row).
| Data Point | Age_Group | Diagnosis_Severity | Num_Medications | Has_Support | Readmitted |
|---|---|---|---|---|---|
| P1 | Young | Mild | Few | Yes | No |
| P2 | Young | Severe | Many | No | Yes |
| P3 | Middle | Mild | Few | Yes | No |
| P4 | Middle | Severe | Many | Yes | Yes |
| P5 | Senior | Mild | Many | No | Yes |
| P6 | Senior | Severe | Few | No | Yes |
| P7 | Middle | Mild | Many | No | No |
| P8 | Young | Mild | Few | No | No |
| Target | Senior | Mild | Many | No | ? |
Step 2: Bootstrapping & Feature Selection
To ensure our trees don't all look identical, we give each tree a random sample of the data drawn with replacement (Bootstrapping) and restrict which features it is allowed to split on.
This example grows three trees: Tree 1, Tree 2, and Tree 3.
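The two sources of randomness in this step can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to generate the samples in this walkthrough: the helper name `draw_tree_inputs`, the seed, and the choice of two features per tree are assumptions, and the random draws will not reproduce the exact rows assigned to each tree here. Like this example, it fixes the allowed features once per tree rather than redrawing them at every split.

```python
import random

ROW_IDS = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]
FEATURES = ["Age_Group", "Diagnosis_Severity", "Num_Medications", "Has_Support"]

def draw_tree_inputs(row_ids, features, n_features, rng):
    """Return (bootstrap sample, allowed features) for one tree."""
    # Bootstrapping: sample WITH replacement, same size as the original,
    # so some rows repeat and others are left out.
    sample = rng.choices(row_ids, k=len(row_ids))
    # Feature restriction: sample WITHOUT replacement, so no feature repeats.
    allowed = rng.sample(features, k=n_features)
    return sample, allowed

rng = random.Random(0)  # arbitrary fixed seed so reruns give the same draws
forest_inputs = [draw_tree_inputs(ROW_IDS, FEATURES, 2, rng) for _ in range(3)]

for i, (sample, allowed) in enumerate(forest_inputs, start=1):
    print(f"Tree {i}: rows={sorted(sample)} features={allowed}")
```

Each tree receives eight row IDs (with possible repeats) and two distinct feature names, which is the same shape as the Tree 1 inputs used in Step 3.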
Step 3: Tree-by-Tree Construction
Each individual tree calculates its splits using its assigned data, builds its structure, and casts its vote for the target patient. The full calculation for Tree 1 is worked through here.
Allowed Features: Age_Group, Diagnosis_Severity
Tree 1 Math Breakdown
Iteration 1
Context: Root
| Data Point | Age_Group | Diagnosis_Severity | Readmitted |
|---|---|---|---|
| P1 | Young | Mild | No |
| P1 | Young | Mild | No |
| P3 | Middle | Mild | No |
| P4 | Middle | Severe | Yes |
| P5 | Senior | Mild | Yes |
| P6 | Senior | Severe | Yes |
| P6 | Senior | Severe | Yes |
| P8 | Young | Mild | No |
1. Entropy of Target Class, Entropy(S)
Positives (P) for 'Yes' = 4
Negatives (N) for 'No' = 4
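The entropy value for this step is not shown. Assuming the standard two-class Shannon entropy with base-2 logarithms (which matches the I(Pi, Ni) values tabulated in the next step), it works out as:

$$
\text{Entropy}(S) = -\frac{P}{P+N}\log_2\frac{P}{P+N} - \frac{N}{P+N}\log_2\frac{N}{P+N}
= -\frac{4}{8}\log_2\frac{4}{8} - \frac{4}{8}\log_2\frac{4}{8} = 1
$$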
2. Subset Information Required
Age_Group:

| Value | Pi | Ni | I (Pi, Ni) |
|---|---|---|---|
| Young | 0 | 3 | 0 |
| Middle | 1 | 1 | 1 |
| Senior | 3 | 0 | 0 |

Diagnosis_Severity:

| Value | Pi | Ni | I (Pi, Ni) |
|---|---|---|---|
| Mild | 1 | 4 | 0.722 |
| Severe | 3 | 0 | 0 |
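The two tables above (Age_Group first, then Diagnosis_Severity) can be reproduced from Tree 1's bootstrap sample. The sketch below assumes I(Pi, Ni) is the two-class entropy in bits; the helper names `info` and `subset_table` are illustrative and not part of the original walkthrough.

```python
from collections import defaultdict
from math import log2

# Tree 1 bootstrap sample: (Age_Group, Diagnosis_Severity, Readmitted)
SAMPLE = [
    ("Young", "Mild", "No"),      # P1
    ("Young", "Mild", "No"),      # P1 (drawn twice)
    ("Middle", "Mild", "No"),     # P3
    ("Middle", "Severe", "Yes"),  # P4
    ("Senior", "Mild", "Yes"),    # P5
    ("Senior", "Severe", "Yes"),  # P6
    ("Senior", "Severe", "Yes"),  # P6 (drawn twice)
    ("Young", "Mild", "No"),      # P8
]

def info(pos, neg):
    """Two-class entropy I(pos, neg) in bits; an empty class contributes 0."""
    total = pos + neg
    return sum(-(c / total) * log2(c / total) for c in (pos, neg) if c)

def subset_table(rows, column):
    """Map each value of the given column to (Pi, Ni, I(Pi, Ni))."""
    counts = defaultdict(lambda: [0, 0])  # value -> [positives, negatives]
    for row in rows:
        counts[row[column]][0 if row[-1] == "Yes" else 1] += 1
    return {v: (p, n, round(info(p, n), 3)) for v, (p, n) in counts.items()}

age_table = subset_table(SAMPLE, 0)
severity_table = subset_table(SAMPLE, 1)
print(age_table)
print(severity_table)
```

Note that the repeated P1 and P6 rows are counted twice, exactly as they appear in the bootstrap sample.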
3. Weighted Feature Entropy
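The figures for this step are not shown. Assuming the usual definition, in which each value's I(Pi, Ni) from the step 2 tables is weighted by the share of the 8 rows it covers, the calculation would be:

$$
\begin{aligned}
E(\text{Age\_Group}) &= \tfrac{3}{8}(0) + \tfrac{2}{8}(1) + \tfrac{3}{8}(0) = 0.25 \\
E(\text{Diagnosis\_Severity}) &= \tfrac{5}{8}(0.722) + \tfrac{3}{8}(0) \approx 0.451
\end{aligned}
$$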
4. Feature Information Gain
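The figures for this step are not shown either. Assuming the standard definition Gain(A) = Entropy(S) - E(A), with Entropy(S) = 1 for the 4 'Yes' / 4 'No' sample and the weighted entropies 0.25 and about 0.451 that follow from the step 2 tables, the gains come out as:

$$
\begin{aligned}
\text{Gain}(\text{Age\_Group}) &= 1 - 0.25 = 0.75 \\
\text{Gain}(\text{Diagnosis\_Severity}) &\approx 1 - 0.451 = 0.549
\end{aligned}
$$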
5. Feature Selection Decision
Age_Group generated the highest Information Gain (0.75).
It is selected as the optimal splitting node for this subset.
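The selection can be checked with a short script that starts from the (Pi, Ni) counts in the step 2 tables. It is a sketch under the assumption that the walkthrough uses ID3-style information gain; the function names `info` and `information_gain` are illustrative.

```python
from math import log2

def info(pos, neg):
    """Two-class entropy in bits; an empty class contributes 0."""
    total = pos + neg
    return sum(-(c / total) * log2(c / total) for c in (pos, neg) if c)

def information_gain(value_counts):
    """value_counts: list of (Pi, Ni) pairs, one per feature value."""
    pos = sum(p for p, _ in value_counts)
    neg = sum(n for _, n in value_counts)
    total = pos + neg
    # Weight each value's entropy by the share of rows it covers.
    weighted = sum((p + n) / total * info(p, n) for p, n in value_counts)
    return info(pos, neg) - weighted

# (Pi, Ni) pairs copied from the two tables in step 2.
COUNTS = {
    "Age_Group": [(0, 3), (1, 1), (3, 0)],
    "Diagnosis_Severity": [(1, 4), (3, 0)],
}

gains = {name: information_gain(pairs) for name, pairs in COUNTS.items()}
best_feature = max(gains, key=gains.get)
print(gains, best_feature)
```

Age_Group comes out at 0.75 and Diagnosis_Severity at roughly 0.549, so Age_Group is chosen for the root.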
Iteration 2
Context: Age_Group = Middle
| Data Point | Age_Group | Diagnosis_Severity | Readmitted |
|---|---|---|---|
| P3 | Middle | Mild | No |
| P4 | Middle | Severe | Yes |
1. Entropy of Target Class, Entropy(S)
Positives (P) for 'Yes' = 1
Negatives (N) for 'No' = 1
2. Subset Information Required
| Value | Pi | Ni | I (Pi, Ni) |
|---|---|---|---|
| Mild | 0 | 1 | 0 |
| Severe | 1 | 0 | 0 |
3. Weighted Feature Entropy
4. Feature Information Gain
5. Feature Selection Decision
Diagnosis_Severity generated the highest Information Gain (1).
It is selected as the optimal splitting node for this subset.
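The arithmetic for this iteration is not shown. Under the same entropy and gain definitions assumed for a standard ID3-style split, the two-row Middle subset gives:

$$
\begin{aligned}
\text{Entropy}(S) &= -\tfrac{1}{2}\log_2\tfrac{1}{2} - \tfrac{1}{2}\log_2\tfrac{1}{2} = 1 \\
E(\text{Diagnosis\_Severity}) &= \tfrac{1}{2}(0) + \tfrac{1}{2}(0) = 0 \\
\text{Gain}(\text{Diagnosis\_Severity}) &= 1 - 0 = 1
\end{aligned}
$$

Age_Group is constant within this subset (both rows are Middle), so it cannot separate the two rows and its gain is 0.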
Tree 1 Resulting Decision Tree
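The tree diagram itself is not reproduced here. Based on the two iterations above, and assuming the pure Young and Senior branches become leaves labeled with their only class, Tree 1 can be sketched as a nested dictionary. The dictionary layout and the `predict` helper are illustrative, not part of the original walkthrough.

```python
TREE_1 = {
    "feature": "Age_Group",
    "branches": {
        "Young": "No",    # 0 'Yes', 3 'No' in the bootstrap sample
        "Senior": "Yes",  # 3 'Yes', 0 'No'
        "Middle": {       # mixed (1 'Yes', 1 'No'), so split again
            "feature": "Diagnosis_Severity",
            "branches": {"Mild": "No", "Severe": "Yes"},
        },
    },
}

TARGET = {
    "Age_Group": "Senior",
    "Diagnosis_Severity": "Mild",
    "Num_Medications": "Many",
    "Has_Support": "No",
}

def predict(node, patient):
    """Walk down the tree until a leaf (a plain class label) is reached."""
    while isinstance(node, dict):
        node = node["branches"][patient[node["feature"]]]
    return node

tree_1_vote = predict(TREE_1, TARGET)
print(tree_1_vote)  # Yes
```

The target patient is a Senior, so Tree 1 reaches a leaf at the root split and votes 'Yes' without ever checking Diagnosis_Severity.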
Step 4: Forest Prediction (Majority Vote)
Each tree evaluates the Target Patient and casts a vote. The final prediction is simply the class that receives the most votes!
Voting Results
Class 'Yes': 2 votes
Class 'No': 1 vote
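The tally above can be reproduced with a minimal sketch using Python's `collections.Counter`. The per-tree votes are taken from this walkthrough (Tree 2 votes 'No', the other two vote 'Yes'); the variable names are illustrative.

```python
from collections import Counter

# One vote per tree for the target patient.
votes = {"Tree 1": "Yes", "Tree 2": "No", "Tree 3": "Yes"}

tally = Counter(votes.values())
# most_common(1) returns the (class, count) pair with the highest count.
prediction, count = tally.most_common(1)[0]
print(tally, prediction)
```

With an odd number of trees and two classes a tie cannot occur. With an even number of trees, a tie-breaking rule would need to be chosen, since `most_common` simply returns tied classes in the order they were first seen.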
Final Takeaway
Notice how Tree 2 reaches a completely different conclusion ('Prediction: No') than the other two trees because it was forced to split on a different random subset of features. This illustrates how Random Forests reduce overfitting: even if one individual tree gets tripped up by weird data, the majority vote in Step 4 (2 to 1) overrules it, giving a more robust prediction than any single tree on its own!