Apriori Algorithm: A Complete Solved Numerical Example

Scenario: Market Basket Analysis

The Objective: Discover which book categories are frequently purchased together by ruthlessly filtering out unpopular combinations before generating recommendation rules.

Core Mechanics
  • Support is the Entry Ticket: To survive, an itemset must appear frequently enough to pass your Minimum Support threshold. If it falls short, cross it off immediately—it is permanently discarded.
  • The Pruning Rule (Apriori Property): If a small itemset is infrequent, every larger set containing it is also infrequent. If {A, B} fails, you never waste time scanning the data for {A, B, C}!
  • Level-by-Level Growth: You must build itemsets strictly one step at a time. Frequent singles combine to form pairs; surviving pairs form triplets. You only ever build size k+1 using the winners from size k.
  • Confidence Makes Rules: Finding popular itemsets is only half the job. To create an actual recommendation rule (A ⇒ B), you filter by Confidence: "Given they already have A, how likely are they to also have B?"
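
The two measures named above can be written compactly. The notation is an assumption made for illustration (T for the set of transactions, supp and conf as function names); the article itself does not fix these symbols.

```latex
% Assumed notation: T = set of all transactions, X, A, B = itemsets.
\[
\operatorname{supp}(X) = \frac{\lvert \{\, t \in T : X \subseteq t \,\} \rvert}{\lvert T \rvert},
\qquad
\operatorname{conf}(A \Rightarrow B) = \frac{\operatorname{supp}(A \cup B)}{\operatorname{supp}(A)}
\]
% Apriori property: adding items can only shrink the set of matching transactions.
\[
X \subseteq Y \;\Longrightarrow\; \operatorname{supp}(X) \ge \operatorname{supp}(Y)
\]
```

The second line is what justifies the pruning rule: if supp(X) is already below the threshold, no superset Y can reach it.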

Step 1: The Transaction Data

The Apriori algorithm requires a list of transactions to mine frequent patterns. In this scenario, we use a Minimum Support of 50% and a Minimum Confidence of 70%. With six transactions, 50% support corresponds to an absolute count of 3.

Transaction | Items
----------- | ----------
P1          | A, B, C
P2          | A, C, D
P3          | B, C, E
P4          | A, B, C, D
P5          | A, C, E
P6          | B, D, E

Legend: A = AI & ML, B = Business, C = Coding, D = Data Science, E = Economics
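
As a minimal sketch, the table can be encoded as Python sets and the percentage threshold converted into an absolute count. The variable names (`transactions`, `MIN_SUPPORT`, `MIN_CONFIDENCE`, `min_support_count`) are illustrative choices, not part of the original example.

```python
import math

# Transactions from the table, encoded as sets of category letters.
transactions = {
    "P1": {"A", "B", "C"},
    "P2": {"A", "C", "D"},
    "P3": {"B", "C", "E"},
    "P4": {"A", "B", "C", "D"},
    "P5": {"A", "C", "E"},
    "P6": {"B", "D", "E"},
}

MIN_SUPPORT = 0.50     # relative threshold from the scenario
MIN_CONFIDENCE = 0.70  # used later when generating rules

# Convert the relative threshold to an absolute count: 50% of 6 transactions.
min_support_count = math.ceil(MIN_SUPPORT * len(transactions))
print(min_support_count)  # 3
```

Rounding up with `math.ceil` is one reasonable convention for thresholds that do not divide evenly; here 50% of 6 is exactly 3, so the choice does not matter.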

Step 2: Frequent Itemset Generation

We iteratively generate Candidate Itemsets (C_k) and filter them by the absolute minimum support count (3) to find Frequent Itemsets (L_k).

Iteration 1: Finding 1-Itemsets

1. Generate Candidates (C_1)

Itemset | Count
------- | -----
{A}     | 4
{B}     | 4
{C}     | 5
{D}     | 3
{E}     | 3

2. Filter by Min Support (L_1), count ≥ 3

Itemset | Count
------- | -----
{A}     | 4
{B}     | 4
{C}     | 5
{D}     | 3
{E}     | 3
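
The first pass can be sketched with a `Counter`. This is only an illustration of the counting step, with the six transactions re-declared so the snippet runs on its own; the names `c1` and `l1` mirror C_1 and L_1 but are otherwise arbitrary.

```python
from collections import Counter

transactions = [
    {"A", "B", "C"},       # P1
    {"A", "C", "D"},       # P2
    {"B", "C", "E"},       # P3
    {"A", "B", "C", "D"},  # P4
    {"A", "C", "E"},       # P5
    {"B", "D", "E"},       # P6
]
min_support_count = 3

# C1: count how many transactions contain each single item.
c1 = Counter(item for t in transactions for item in t)

# L1: keep only the items that meet the minimum support count.
l1 = {item: n for item, n in c1.items() if n >= min_support_count}

print(sorted(l1.items()))
# [('A', 4), ('B', 4), ('C', 5), ('D', 3), ('E', 3)]
```

Every item survives this round, which is why C_1 and L_1 are identical.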

Iteration 2: Finding 2-Itemsets

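1. Generate Candidates (C_2)

Itemset | Count | Status
------- | ----- | ------
{A, B}  | 2     | Drop
{A, C}  | 4     | Keep
{A, D}  | 2     | Drop
{A, E}  | 1     | Drop
{B, C}  | 3     | Keep
{B, D}  | 2     | Drop
{B, E}  | 2     | Drop
{C, D}  | 2     | Drop
{C, E}  | 2     | Drop
{D, E}  | 1     | Drop

2. Filter by Min Support (L_2), count ≥ 3

Itemset | Count
------- | -----
{A, C}  | 4
{B, C}  | 3

There is no Iteration 3. The only 3-itemset that could be built from these two pairs is {A, B, C}, and it is pruned without scanning the data because its subset {A, B} is infrequent.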
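
The second pass, plus the check that no 3-itemset survives, might look like the sketch below. The helper `support_count` and the simple "union of two pairs" join are assumptions made for brevity; textbook Apriori usually joins on a shared prefix, which gives the same outcome on this data.

```python
from itertools import combinations

transactions = [
    {"A", "B", "C"},       # P1
    {"A", "C", "D"},       # P2
    {"B", "C", "E"},       # P3
    {"A", "B", "C", "D"},  # P4
    {"A", "C", "E"},       # P5
    {"B", "D", "E"},       # P6
]
min_support_count = 3
l1 = ["A", "B", "C", "D", "E"]  # frequent single items from Iteration 1


def support_count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


# C2: every pair of frequent single items, with its count.
c2_counts = {frozenset(p): support_count(frozenset(p)) for p in combinations(l1, 2)}

# L2: pairs that meet the minimum support count.
l2 = {pair: n for pair, n in c2_counts.items() if n >= min_support_count}

# C3: join frequent pairs into triples, then prune any triple
# that has an infrequent 2-item subset (the Apriori property).
joined = {a | b for a, b in combinations(l2, 2) if len(a | b) == 3}
c3 = [t for t in joined if all(frozenset(s) in l2 for s in combinations(t, 2))]

print({tuple(sorted(p)): n for p, n in l2.items()})  # {('A', 'C'): 4, ('B', 'C'): 3}
print(c3)  # []
```

The empty `c3` is the pruning rule in action: {A, B, C} is rejected because {A, B} is not in `l2`, so the data is never scanned for it.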

Step 3: Association Rules Generation

For every frequent itemset of size 2 or more, we generate all possible rules (X → Y) and calculate their confidence. If the confidence is ≥ 70%, the rule is kept.

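Rule (X → Y) | Sup (X ∪ Y) | Sup (X) | Confidence | Status
------------ | ----------- | ------- | ---------- | ------
{A} → {C}    | 4           | 4       | 100%       | Keep
{C} → {A}    | 4           | 5       | 80%        | Keep
{B} → {C}    | 3           | 4       | 75%        | Keep
{C} → {B}    | 3           | 5       | 60%        | Drop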
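
The confidence column can be reproduced with a few lines of Python. This is a sketch limited to rules from 2-itemsets, which is all this example needs; `frequent_pairs`, `support_count`, and `rules` are illustrative names.

```python
transactions = [
    {"A", "B", "C"},       # P1
    {"A", "C", "D"},       # P2
    {"B", "C", "E"},       # P3
    {"A", "B", "C", "D"},  # P4
    {"A", "C", "E"},       # P5
    {"B", "D", "E"},       # P6
]
MIN_CONFIDENCE = 0.70
frequent_pairs = [{"A", "C"}, {"B", "C"}]  # L2 from Step 2


def support_count(itemset):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


rules = []
for pair in frequent_pairs:
    for x in sorted(pair):
        (y,) = pair - {x}  # the single remaining item is the consequent
        conf = support_count(pair) / support_count({x})
        rules.append((x, y, conf, conf >= MIN_CONFIDENCE))

for x, y, conf, keep in rules:
    print(f"{{{x}}} -> {{{y}}}: {conf:.0%} {'Keep' if keep else 'Drop'}")
# {A} -> {C}: 100% Keep
# {C} -> {A}: 80% Keep
# {B} -> {C}: 75% Keep
# {C} -> {B}: 60% Drop
```

Both directions of each pair share the same numerator; only the denominator, the support of the antecedent, changes.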

Final Takeaway

Look at the final two rows in Step 3 to see the ultimate Apriori exam trap: rule asymmetry! Even though {B} → {C} passes the 70% confidence threshold and is kept, the exact reverse rule {C} → {B} scores only 60% and gets dropped. Both rules share the same joint support count (3), but the denominators differ: Sup({B}) = 4 versus Sup({C}) = 5. Confidence is directional, so a rule and its reverse must always be checked separately.