Confusion Matrix: A Complete Solved Numerical Example


Scenario: Fraudulent Transaction Detection

The Objective: A bank's fraud detection model was evaluated on 200 recent transactions. The results are recorded below.

Core Mechanics


Decoding the Acronyms: Read them backward! The second letter (P or N) is what your model guessed. The first letter (T or F) tells you if that guess was True (Right) or False (Wrong).
The Accuracy Trap: High accuracy can be dangerously misleading when your classes are imbalanced. If 99% of cases belong to the majority class, a lazy model that guesses the majority class every single time reaches 99% accuracy while missing every single rare case you actually care about.
Precision (Quality Control): Of everything your model claimed was Positive, how many actually were? It measures how much you can trust your model when it screams "Positive!"—too many false alarms destroy this score.
Recall (The Dragnet): Of all the actual Positives hidden in the real data, how many did you catch? Recall is about your model's ability to hunt down every single target; missing real ones destroys this score.
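The accuracy trap is easy to reproduce. The sketch below uses made-up numbers (1,000 transactions, 10 of them fraudulent; these are illustrative and not part of the bank scenario) and a hypothetical "lazy" model that always predicts Legitimate.

```python
# Hypothetical imbalanced data set: 1 = fraud (Positive), 0 = legitimate (Negative).
actual = [1] * 10 + [0] * 990

# A "lazy" model that always guesses the majority class.
predicted = [0] * len(actual)

# Accuracy: share of predictions that match the actual label.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Recall: share of actual frauds that the model flagged.
caught = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
recall = caught / sum(actual)

print(f"accuracy = {accuracy:.1%}")  # accuracy = 99.0%
print(f"recall   = {recall:.1%}")    # recall   = 0.0%
```

The model looks excellent by accuracy alone, yet it never catches a single fraudulent transaction.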

Step 1: The Confusion Matrix

Here are the raw results from evaluating the model.
Positive Class = Fraudulent | Negative Class = Legitimate

                   Predicted Positive          Predicted Negative
Actual Positive    True Positive (TP): 40      False Negative (FN): 10
Actual Negative    False Positive (FP): 15     True Negative (TN): 135
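To see how the four cells relate to individual predictions, here is a minimal sketch that rebuilds 200 (actual, predicted) pairs from the counts that appear in the Step 2 formulas (TP = 40, FN = 10, FP = 15, TN = 135) and then recounts them. The helper `cell_name` is a hypothetical name chosen for illustration; it applies the "read the acronym backward" rule.

```python
from collections import Counter

# 1 = Fraudulent (Positive), 0 = Legitimate (Negative).
# Each pair is (actual label, predicted label).
pairs = (
    [(1, 1)] * 40     # fraud, flagged as fraud
    + [(1, 0)] * 10   # fraud, missed
    + [(0, 1)] * 15   # legitimate, flagged as fraud
    + [(0, 0)] * 135  # legitimate, passed
)

def cell_name(actual, predicted):
    """Second letter = what the model guessed; first letter = was the guess right."""
    second = "P" if predicted == 1 else "N"
    first = "T" if actual == predicted else "F"
    return first + second

counts = Counter(cell_name(a, p) for a, p in pairs)
print(counts["TP"], counts["FN"], counts["FP"], counts["TN"])  # 40 10 15 135
print(sum(counts.values()))                                    # 200
```

The four cells add up to the 200 transactions in the evaluation set, which is a quick sanity check worth running on any confusion matrix.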


Step 2: Step-by-Step Calculation Breakdown

Here is exactly how each metric was derived using the core confusion matrix formulas.

Accuracy

Out of all predictions, how many were correct?

87.5 %

Formula

\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = \frac{40 + 135}{40 + 135 + 15 + 10}

Execution

\frac{175}{200} = 0.875
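The same calculation as a small Python sketch. The function name `accuracy` and the choice to raise an error on an empty matrix are assumptions made for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all predictions that were correct: (TP + TN) / total."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total

print(accuracy(tp=40, tn=135, fp=15, fn=10))  # 0.875
```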

Precision

When the model predicted Positive, how often was it actually right?

72.727 %

Formula

\text{Precision} = \frac{TP}{TP + FP} = \frac{40}{40 + 15}

Execution

\frac{40}{55} \approx 0.727

Recall (Sensitivity)

Out of all the actual Positive cases, how many did the model successfully find?

80 %

Formula

\text{Recall} = \frac{TP}{TP + FN} = \frac{40}{40 + 10}

Execution

\frac{40}{50} = 0.8
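Precision and Recall share the numerator TP and differ only in the denominator, which the sketch below makes explicit. Returning `None` when a denominator is zero is one possible convention, assumed here for illustration; other tools may return 0 or raise an error instead.

```python
def precision(tp, fp):
    """TP / (TP + FP): how trustworthy a Positive prediction is."""
    flagged = tp + fp  # everything the model called Positive
    return tp / flagged if flagged else None

def recall(tp, fn):
    """TP / (TP + FN): how many actual Positives were caught."""
    actual_positives = tp + fn  # everything that really was Positive
    return tp / actual_positives if actual_positives else None

p = precision(tp=40, fp=15)
r = recall(tp=40, fn=10)
print(f"precision = {p:.4f}")  # precision = 0.7273
print(f"recall    = {r:.4f}")  # recall    = 0.8000
```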

F1 Score

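The harmonic mean of Precision and Recall. It forces a balance between the two.

76.19 %

Formula

F_1 = 2 \times \frac{Precision \times Recall}{Precision + Recall} \approx 2 \times \frac{0.727 \times 0.8}{0.727 + 0.8}

Execution

2 \times \frac{0.582}{1.527} \approx 0.762

The 76.19 % shown above comes from carrying unrounded values through the calculation. The equivalent form \frac{2TP}{2TP + FP + FN} = \frac{80}{105} \approx 0.7619 gives the same result directly from the matrix.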
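Rounding Precision to 0.727 before combining it with Recall shifts the last digit of the result. A minimal sketch using Python's `fractions` module shows the exact value and compares it with the hand-rounded arithmetic; the variable names are illustrative.

```python
from fractions import Fraction

tp, fp, fn = 40, 15, 10

precision = Fraction(tp, tp + fp)  # 8/11
recall = Fraction(tp, tp + fn)     # 4/5

# Harmonic mean of Precision and Recall, computed exactly.
f1 = 2 * precision * recall / (precision + recall)
print(f1)                    # 16/21
print(round(float(f1), 4))   # 0.7619

# The exact value matches the shortcut 2TP / (2TP + FP + FN).
assert f1 == Fraction(2 * tp, 2 * tp + fp + fn)

# Same formula with Precision rounded to three decimals first.
f1_rounded = 2 * 0.727 * 0.8 / (0.727 + 0.8)
print(round(f1_rounded, 4))  # 0.7618
```

The difference is tiny, but it explains why a percentage computed from unrounded values can disagree in the last digit with one worked out by hand.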

Step 3: Evaluation Metrics Summary

The four primary performance metrics for this classification model, as calculated above:

Accuracy: 87.5 %

Precision: 72.727 %

Recall: 80 %

F1 Score: 76.19 %
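For reference, all four metrics can be produced from the matrix in one place. This is a minimal sketch: the `ConfusionMatrix` class is a hypothetical helper written for this example, and it assumes every denominator is non-zero.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int
    fn: int
    fp: int
    tn: int

    @property
    def accuracy(self):
        return (self.tp + self.tn) / (self.tp + self.tn + self.fp + self.fn)

    @property
    def precision(self):
        return self.tp / (self.tp + self.fp)

    @property
    def recall(self):
        return self.tp / (self.tp + self.fn)

    @property
    def f1(self):
        # Algebraically equal to 2 * P * R / (P + R).
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)

cm = ConfusionMatrix(tp=40, fn=10, fp=15, tn=135)
summary = {
    "Accuracy": cm.accuracy,
    "Precision": cm.precision,
    "Recall": cm.recall,
    "F1 Score": cm.f1,
}
for name, value in summary.items():
    print(f"{name:<10}{value:7.2%}")
```

Printed to two decimals, the values are 87.50%, 72.73%, 80.00%, and 76.19%.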

Final Takeaway

The positive class represents a Fraudulent transaction. A False Negative (missed fraud) is significantly more costly than a False Positive (a legitimate transaction flagged for review), so Recall (80%) is the metric to watch most closely here: the model still misses 10 of the 50 fraudulent transactions. Notice the tension between the metrics! The F1 Score is the harmonic mean of Precision (72.7%) and Recall (80%), not their arithmetic mean, so you cannot just add them up and divide by two. That shortcut gives 76.36%, while the correct F1 Score is 76.19%. The gap is small here because the two values are close, but it grows as they diverge, which makes this a classic exam trap!
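The difference between the two kinds of mean can be checked with Python's standard `statistics` module. The second pair of values below (Precision 1.0, Recall 0.01) is a made-up extreme case, included only to show how far the two means can drift apart.

```python
from statistics import harmonic_mean, mean

precision = 40 / 55
recall = 40 / 50

arithmetic = mean([precision, recall])
harmonic = harmonic_mean([precision, recall])  # this is the F1 Score

print(f"arithmetic mean = {arithmetic:.2%}")  # arithmetic mean = 76.36%
print(f"harmonic mean   = {harmonic:.2%}")    # harmonic mean   = 76.19%

# Hypothetical extreme case: near-perfect Precision, almost no Recall.
extreme_arithmetic = mean([1.0, 0.01])
extreme_harmonic = harmonic_mean([1.0, 0.01])

print(f"extreme arithmetic = {extreme_arithmetic:.2%}")  # extreme arithmetic = 50.50%
print(f"extreme harmonic   = {extreme_harmonic:.2%}")    # extreme harmonic   = 1.98%
```

The harmonic mean is pulled toward the smaller of the two values, which is why a model cannot earn a good F1 Score by excelling at only one of Precision or Recall.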