Naive Bayes Classifier: A Complete Solved Numerical Example
Scenario: Medical Symptom Diagnosis
The Objective: Predict whether a patient has a viral infection based on reported clinical symptoms.
Step 1: The Training Data
Before we can classify the new target point, we must analyze our historical data.
| Data Point | Fever | Fatigue | Cough | Headache | Diagnosis |
|---|---|---|---|---|---|
| P1 | High | Yes | Yes | Yes | Viral |
| P2 | High | Yes | No | Yes | Viral |
| P3 | Normal | No | Yes | No | Healthy |
| P4 | Normal | No | No | No | Healthy |
| P5 | High | No | Yes | Yes | Viral |
| P6 | Normal | Yes | No | Yes | Healthy |
| P7 | High | Yes | Yes | No | Viral |
| P8 | Normal | No | Yes | Yes | Healthy |
| P9 | Normal | Yes | Yes | No | Healthy |
| P10 | High | Yes | No | No | Viral |
| Target | High | Yes | Yes | Yes | ? |
Step 2: Prior Probabilities P (Class)
First, we calculate the baseline probability of each class occurring in our entire dataset before looking at any specific symptoms.
= 0.5000
= 0.5000
Step 3: Conditional Probabilities P (Feature | Class)
Next, we look at our target point and calculate the "likelihood" of each specific feature appearing, assuming a specific class is true. We do this by isolating the rows for a specific class and counting how many times the feature matches.
Step 4: Final Probabilities & Conclusion
Finally, we apply Bayes' Theorem (using the naive independence assumption). We multiply the Prior Probability (Step 2) by all the Conditional Probabilities (Step 3) for each class. The class that yields the highest score is our prediction.
= 5/10 * 5/5 * 4/5 * 3/5 * 3/5
= 0.1440
= 5/10 * 0/5 * 2/5 * 3/5 * 2/5
= 0.0000
Final Prediction
Because Viral has the highest final calculated probability, the Naive Bayes classifier predicts that the target point belongs to the Viral class.