Random Forest: A Complete Solved Numerical Example


Scenario: Hospital Readmission Prediction

The Objective: Predict whether a discharged patient will return to the hospital by relying on the combined wisdom of multiple, randomly generated decision trees.

Core Mechanics


The Wisdom of the Crowd: A Random Forest builds many trees (often hundreds), each trained on its own bootstrap sample: a dataset the same size as the original, drawn at random with replacement. Some rows repeat and others are left out, so no single tree sees the whole picture.
Forced Diversity: At every split, a tree may only consider a random subset of the features (commonly about $\sqrt{d}$ of the $d$ features for classification). This prevents one dominant feature from producing a forest of near-identical trees.
The Final Vote: To make a prediction, every tree evaluates the data independently. The forest takes the majority vote for classification tasks, or the mean of the tree outputs for regression tasks.
More Trees, No Added Overfitting: Unlike growing a single decision tree deeper, adding more trees to a forest does not by itself increase overfitting; the accuracy typically plateaus. The main cost of extra trees is slower training and prediction.

Step 1: The Historical Data & Target Point

Random Forest improves upon a single Decision Tree by building an "ensemble" (a collection) of multiple trees. We will use this data to predict the Readmitted status for our target patient.

Data Point	Age_Group	Diagnosis_Severity	Num_Medications	Has_Support	Readmitted
P1	Young	Mild	Few	Yes	No
P2	Young	Severe	Many	No	Yes
P3	Middle	Mild	Few	Yes	No
P4	Middle	Severe	Many	Yes	Yes
P5	Senior	Mild	Many	No	Yes
P6	Senior	Severe	Few	No	Yes
P7	Middle	Mild	Many	No	No
P8	Young	Mild	Few	No	No
Target	Senior	Mild	Many	No	?


Step 2: Bootstrapping & Feature Selection

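To ensure our trees don't all look identical, we give each tree a bootstrap sample of the rows (drawn with replacement, so rows can repeat) and restrict which features it is allowed to split on. With $d = 4$ features, each tree gets $\sqrt{4} = 2$ of them. Note that this example fixes one feature subset per tree, whereas the rule described above redraws the subset at every split.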

Tree 1

Bootstrapped Rows:

1, 1, 3, 4, 5, 6, 6, 8

Allowed Features:

Age_Group, Diagnosis_Severity

Tree 2

Bootstrapped Rows:

2, 3, 5, 6, 7, 7, 7, 8

Allowed Features:

Num_Medications, Has_Support

Tree 3

Bootstrapped Rows:

1, 2, 4, 5, 6, 7, 8, 8

Allowed Features:

Age_Group, Has_Support
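The two random draws in this step can be sketched in a few lines of Python. This is only an illustration: the helper name `draw_tree_inputs` and the seed are assumptions, and a different random generator will not reproduce the exact rows and features listed above.

```python
import math
import random

FEATURES = ["Age_Group", "Diagnosis_Severity", "Num_Medications", "Has_Support"]
N_ROWS = 8  # data points P1..P8

def draw_tree_inputs(rng, n_rows=N_ROWS, features=FEATURES):
    """Return (bootstrapped row numbers, allowed features) for one tree."""
    # Bootstrapping: draw n_rows row numbers WITH replacement, so repeats are expected.
    rows = sorted(rng.choices(range(1, n_rows + 1), k=n_rows))
    # Feature restriction: pick round(sqrt(d)) distinct features WITHOUT replacement.
    k = max(1, round(math.sqrt(len(features))))
    allowed = rng.sample(features, k)
    return rows, allowed

rng = random.Random(0)  # arbitrary seed, chosen only to make the run repeatable
for tree_id in (1, 2, 3):
    rows, allowed = draw_tree_inputs(rng)
    print(f"Tree {tree_id}: rows={rows}, features={allowed}")
```

Each call yields eight row numbers (some repeated, some missing) and two of the four features, which is the same shape as the three tree assignments listed in this step.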

Step 3: Tree-by-Tree Construction

Each tree calculates its splits using its assigned data, builds its structure, and casts its vote for the target patient. The full breakdown for Tree 1 follows.

Using Rows: 1, 1, 3, 4, 5, 6, 6, 8
Allowed Features: Age_Group, Diagnosis_Severity

Tree 1 Math Breakdown

Iteration 1

Context: Root

Current Data: Full Bootstrap Sample (8 Rows)

Data Point	Age_Group	Diagnosis_Severity	Readmitted
P1	Young	Mild	No
P1	Young	Mild	No
P3	Middle	Mild	No
P4	Middle	Severe	Yes
P5	Senior	Mild	Yes
P6	Senior	Severe	Yes
P6	Senior	Severe	Yes
P8	Young	Mild	No

1. Entropy of Target Class, Entropy(S)

Formula:

$\text{Entropy}(S) = - \dfrac{P}{P+N} \log_2\left(\dfrac{P}{P+N}\right) - \dfrac{N}{P+N} \log_2\left(\dfrac{N}{P+N}\right)$

Positives (P) for 'Yes' = 4

Negatives (N) for 'No' = 4

$\text{Entropy}(S) = 1$
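The entropy formula above can be checked numerically. The following is a minimal sketch; the function name `entropy` is an illustrative choice, and the usual convention that a zero count contributes nothing to the sum is assumed.

```python
import math

def entropy(p, n):
    """Binary entropy (in bits) of a node with p 'Yes' rows and n 'No' rows."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count:  # convention: a class with zero rows contributes 0
            frac = count / total
            result -= frac * math.log2(frac)
    return result

print(entropy(4, 4))            # root of Tree 1 (4 Yes, 4 No) -> 1.0
print(round(entropy(1, 4), 3))  # a 1-vs-4 node -> 0.722
```

A perfectly balanced node gives the maximum of 1 bit, while a pure node such as `entropy(3, 0)` gives 0.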

2. Subset Information Required

Formula:

$\text{Entropy}(P_i, N_i) = - \dfrac{P_i}{P_i+N_i} \log_2\left(\dfrac{P_i}{P_i+N_i}\right) - \dfrac{N_i}{P_i+N_i} \log_2\left(\dfrac{N_i}{P_i+N_i}\right)$

Evaluating Feature: Age_Group

Value	Pi	Ni	Entropy(Pi, Ni)
Young	0	3	0
Middle	1	1	1
Senior	3	0	0

Evaluating Feature: Diagnosis_Severity

Value	Pi	Ni	Entropy(Pi, Ni)
Mild	1	4	0.722
Severe	3	0	0
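The two tables above can be rebuilt by grouping Tree 1's bootstrap sample by feature value and counting the Yes/No rows in each group. The sketch below hard-codes that sample; `SAMPLE`, `entropy` and `subset_table` are illustrative names, not part of any library.

```python
import math
from collections import defaultdict

# Tree 1's bootstrap sample as (Age_Group, Diagnosis_Severity, Readmitted).
SAMPLE = [
    ("Young", "Mild", "No"),      # P1
    ("Young", "Mild", "No"),      # P1 (drawn twice)
    ("Middle", "Mild", "No"),     # P3
    ("Middle", "Severe", "Yes"),  # P4
    ("Senior", "Mild", "Yes"),    # P5
    ("Senior", "Severe", "Yes"),  # P6
    ("Senior", "Severe", "Yes"),  # P6 (drawn twice)
    ("Young", "Mild", "No"),      # P8
]

def entropy(p, n):
    """Binary entropy in bits; a zero count contributes 0."""
    total = p + n
    return -sum(c / total * math.log2(c / total) for c in (p, n) if c)

def subset_table(rows, col):
    """Map each value in column `col` to (P_i, N_i, Entropy(P_i, N_i))."""
    counts = defaultdict(lambda: [0, 0])  # value -> [yes_count, no_count]
    for row in rows:
        counts[row[col]][0 if row[-1] == "Yes" else 1] += 1
    return {v: (p, n, round(entropy(p, n), 3)) for v, (p, n) in counts.items()}

print(subset_table(SAMPLE, 0))  # Age_Group
print(subset_table(SAMPLE, 1))  # Diagnosis_Severity
```

The duplicated rows (P1 and P6) are counted twice, which is why the counts sum to 8 rather than to the number of distinct patients.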

3. Weighted Feature Entropy

Formula:

$\text{Entropy}(A) = \sum_i \left[ \dfrac{P_i + N_i}{P + N} \right] \times \text{Entropy}(P_i, N_i)$

Age_Group: Entropy = (3/8)(0) + (2/8)(1) + (3/8)(0) = 0.25

Diagnosis_Severity: Entropy = (5/8)(0.722) + (3/8)(0) = 0.451

4. Feature Information Gain

Formula:

$\text{Gain}(S, A) = \text{Entropy}(S) - \text{Entropy}(A)$

Age_Group: Gain = 1 - 0.25 = 0.75

Diagnosis_Severity: Gain = 1 - 0.451 = 0.549
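The weighted entropy and gain figures can be re-derived directly from the (Pi, Ni) counts in the tables. This is a small sketch under that assumption; `information_gain` is a hypothetical helper that takes one (Pi, Ni) pair per feature value.

```python
import math

def entropy(p, n):
    """Binary entropy in bits; a zero count contributes 0."""
    total = p + n
    return -sum(c / total * math.log2(c / total) for c in (p, n) if c)

def information_gain(splits):
    """splits: list of (P_i, N_i) pairs. Returns (gain, weighted entropy)."""
    P = sum(p for p, _ in splits)
    N = sum(n for _, n in splits)
    # Each subset's entropy is weighted by its share of the rows at this node.
    weighted = sum((p + n) / (P + N) * entropy(p, n) for p, n in splits)
    return entropy(P, N) - weighted, weighted

age_gain, age_weighted = information_gain([(0, 3), (1, 1), (3, 0)])
sev_gain, sev_weighted = information_gain([(1, 4), (3, 0)])
print(round(age_weighted, 3), round(age_gain, 3))  # 0.25 0.75
print(round(sev_weighted, 3), round(sev_gain, 3))  # 0.451 0.549
```

Because Entropy(S) is the same for every feature at a node, picking the highest gain is equivalent to picking the lowest weighted entropy.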

5. Feature Selection Decision

Age_Group generated the highest Information Gain (0.75).
It is selected as the optimal splitting node for this subset.

Resulting Split

Age_Group?
    Young -> Class: No
    Middle -> Class: ?
    Senior -> Class: Yes

Iteration 2

Context: Age_Group = Middle

Current Data: Filtered by Age_Group = Middle (2 Rows)

Data Point	Age_Group	Diagnosis_Severity	Readmitted
P3	Middle	Mild	No
P4	Middle	Severe	Yes

1. Entropy of Target Class, Entropy(S)

Formula:

$\text{Entropy}(S) = - \dfrac{P}{P+N} \log_2\left(\dfrac{P}{P+N}\right) - \dfrac{N}{P+N} \log_2\left(\dfrac{N}{P+N}\right)$

Positives (P) for 'Yes' = 1

Negatives (N) for 'No' = 1

$\text{Entropy}(S) = 1$

2. Subset Information Required

Formula:

$\text{Entropy}(P_i, N_i) = - \dfrac{P_i}{P_i+N_i} \log_2\left(\dfrac{P_i}{P_i+N_i}\right) - \dfrac{N_i}{P_i+N_i} \log_2\left(\dfrac{N_i}{P_i+N_i}\right)$

Evaluating Feature: Diagnosis_Severity

Value	Pi	Ni	Entropy(Pi, Ni)
Mild	0	1	0
Severe	1	0	0

3. Weighted Feature Entropy

Formula:

$\text{Entropy}(A) = \sum_i \left[ \dfrac{P_i + N_i}{P + N} \right] \times \text{Entropy}(P_i, N_i)$

Diagnosis_Severity: Entropy = (1/2)(0) + (1/2)(0) = 0

4. Feature Information Gain

Formula:

$\text{Gain}(S, A) = \text{Entropy}(S) - \text{Entropy}(A)$

Diagnosis_Severity: Gain = 1 - 0 = 1

5. Feature Selection Decision

Diagnosis_Severity is the only allowed feature left for this tree, and its Information Gain (1) separates this subset perfectly.
It is selected as the splitting node for this subset.

Resulting Split

Diagnosis_Severity?
    Mild -> Class: No
    Severe -> Class: Yes

Tree 1 Resulting Decision Tree

Age_Group?
    Young -> Class: No
    Middle -> Diagnosis_Severity?
        Mild -> Class: No
        Severe -> Class: Yes
    Senior -> Class: Yes

What does Tree 1 predict?

Prediction: Yes
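To see how the target patient travels through Tree 1, the finished tree can be written as a nested dictionary and walked from the root. The dictionary layout and the `predict` helper are illustrative assumptions, not the solver's internal representation.

```python
# Tree 1, transcribed from the result above: inner nodes are dicts, leaves are class labels.
TREE_1 = {
    "feature": "Age_Group",
    "branches": {
        "Young": "No",
        "Middle": {
            "feature": "Diagnosis_Severity",
            "branches": {"Mild": "No", "Severe": "Yes"},
        },
        "Senior": "Yes",
    },
}

def predict(node, patient):
    """Follow the branch matching the patient's value until a leaf label is reached."""
    while isinstance(node, dict):
        node = node["branches"][patient[node["feature"]]]
    return node

target = {
    "Age_Group": "Senior",
    "Diagnosis_Severity": "Mild",
    "Num_Medications": "Many",
    "Has_Support": "No",
}
print(predict(TREE_1, target))  # Yes
```

The target is a Senior, so the walk stops at the first split; Diagnosis_Severity is only consulted for Middle patients.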

Step 4: Forest Prediction (Majority Vote)

Each tree evaluates the Target Patient and casts a vote. The final prediction is simply the class that receives the most votes!

Voting Results

Class 'Yes': 2 votes (Tree 1, Tree 3)

Class 'No': 1 vote (Tree 2)

Final Result

Yes
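The whole example can be replayed end to end with a small from-scratch sketch: build one entropy-based tree per bootstrap sample from Step 2, then take the majority vote. Several details here are assumptions rather than the solver's documented behaviour: a node with no allowed features left becomes a majority-class leaf, ties go to the class seen first, and all function names are illustrative.

```python
import math
from collections import Counter

COLS = ["Age_Group", "Diagnosis_Severity", "Num_Medications", "Has_Support", "Readmitted"]
DATA = {
    1: ("Young", "Mild", "Few", "Yes", "No"),
    2: ("Young", "Severe", "Many", "No", "Yes"),
    3: ("Middle", "Mild", "Few", "Yes", "No"),
    4: ("Middle", "Severe", "Many", "Yes", "Yes"),
    5: ("Senior", "Mild", "Many", "No", "Yes"),
    6: ("Senior", "Severe", "Few", "No", "Yes"),
    7: ("Middle", "Mild", "Many", "No", "No"),
    8: ("Young", "Mild", "Few", "No", "No"),
}
TARGET = dict(zip(COLS, ("Senior", "Mild", "Many", "No")))
# (bootstrapped rows, allowed features) copied from Step 2.
FOREST_SPEC = [
    ([1, 1, 3, 4, 5, 6, 6, 8], ["Age_Group", "Diagnosis_Severity"]),
    ([2, 3, 5, 6, 7, 7, 7, 8], ["Num_Medications", "Has_Support"]),
    ([1, 2, 4, 5, 6, 7, 8, 8], ["Age_Group", "Has_Support"]),
]

def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def weighted_entropy(rows, feature):
    groups = {}
    for r in rows:
        groups.setdefault(r[feature], []).append(r["Readmitted"])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

def build(rows, features):
    labels = [r["Readmitted"] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]  # ties: first class seen wins
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: pure node, or no allowed feature left
    # Lowest weighted entropy is the same choice as highest information gain.
    best = min(features, key=lambda f: weighted_entropy(rows, f))
    rest = [f for f in features if f != best]
    branches = {
        value: build([r for r in rows if r[best] == value], rest)
        for value in {r[best] for r in rows}
    }
    return {"feature": best, "branches": branches, "default": majority}

def predict(node, patient):
    while isinstance(node, dict):
        # Unseen feature values fall back to the node's majority class.
        node = node["branches"].get(patient[node["feature"]], node["default"])
    return node

votes = []
for row_ids, features in FOREST_SPEC:
    sample = [dict(zip(COLS, DATA[i])) for i in row_ids]
    votes.append(predict(build(sample, features), TARGET))

print(votes)                                # ['Yes', 'No', 'Yes']
print(Counter(votes).most_common(1)[0][0])  # Yes
```

Under these assumptions the three trees vote Yes, No and Yes, matching the 2-to-1 result above.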

Final Takeaway

Notice how Tree 2 reaches a different conclusion ('Prediction: No') than the other two trees, because it was trained on a different bootstrap sample and restricted to a different random subset of features. This illustrates how a Random Forest reduces overfitting: even if one tree is misled by the quirks of its own sample, the majority vote in Step 4 (2 to 1) overrules it, so the combined prediction is more stable than that of any single tree.