Implement Apriori Algorithm for Association Rule Mining #21

@noahgift

Problem Statement

Apriori discovers frequent itemsets and association rules in transactional data and is commonly used for market basket analysis. It is currently missing from aprender.

Use Cases:

  • Market basket analysis ("customers who bought X also bought Y")
  • Recommendation systems
  • Cross-selling strategies
  • Web usage mining

Example Rules:

  • {milk, bread} → {butter} (support=0.3, confidence=0.8)
  • Support: "30% of transactions contain milk, bread, and butter"
  • Confidence: "80% of transactions with milk and bread also contain butter"
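
To make the three metrics concrete, here is a minimal sketch of how support, confidence, and lift could be computed directly from transactions. The free functions `support`, `confidence`, and `lift` are hypothetical helpers for illustration, not part of aprender, and items are encoded as `usize` ids (for example milk=0, bread=1, butter=2) to match the proposed API.

```rust
/// Fraction of transactions that contain every item in `items`.
/// (Hypothetical helper; returns 0.0 for an empty dataset.)
fn support(transactions: &[Vec<usize>], items: &[usize]) -> f32 {
    if transactions.is_empty() {
        return 0.0;
    }
    let hits = transactions
        .iter()
        .filter(|t| items.iter().all(|i| t.contains(i)))
        .count();
    hits as f32 / transactions.len() as f32
}

/// confidence(A -> C) = support(A ∪ C) / support(A)
fn confidence(transactions: &[Vec<usize>], antecedent: &[usize], consequent: &[usize]) -> f32 {
    let both: Vec<usize> = antecedent.iter().chain(consequent).copied().collect();
    let s_antecedent = support(transactions, antecedent);
    if s_antecedent == 0.0 {
        0.0
    } else {
        support(transactions, &both) / s_antecedent
    }
}

/// lift(A -> C) = confidence(A -> C) / support(C)
/// Values above 1.0 suggest A and C co-occur more often than if independent.
fn lift(transactions: &[Vec<usize>], antecedent: &[usize], consequent: &[usize]) -> f32 {
    let s_consequent = support(transactions, consequent);
    if s_consequent == 0.0 {
        0.0
    } else {
        confidence(transactions, antecedent, consequent) / s_consequent
    }
}
```

For the rule above, `support(&tx, &[0, 1, 2])` would correspond to the 0.3 figure and `confidence(&tx, &[0, 1], &[2])` to the 0.8 figure, given a suitable `tx`.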

Proposed Solution

Implement the Apriori algorithm following EXTREME TDD.

Algorithm

Steps:

  1. Find frequent 1-itemsets (items whose support meets min_support)
  2. Generate candidate k-itemsets from frequent (k-1)-itemsets
  3. Prune candidates using Apriori principle:
    • If an itemset is infrequent, all of its supersets are infrequent
  4. Generate association rules from frequent itemsets
  5. Filter rules by min_confidence
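
Steps 1-3 could be sketched roughly as below. This is a minimal, unoptimized illustration, not the final implementation: `mine_frequent_itemsets` and `generate_candidates` are hypothetical names, itemsets are kept as sorted `Vec<usize>`, and treating "frequent" as `support >= min_support` is an assumption.

```rust
use std::collections::BTreeSet;

/// Join step + prune step: build candidate (k+1)-itemsets from frequent k-itemsets.
/// Assumes each itemset is sorted and `frequent` is in lexicographic order.
fn generate_candidates(frequent: &[Vec<usize>]) -> Vec<Vec<usize>> {
    let mut out = Vec::new();
    for (i, a) in frequent.iter().enumerate() {
        for b in &frequent[i + 1..] {
            let k = a.len();
            // Join: only merge itemsets that share their first k-1 items.
            if a[..k - 1] != b[..k - 1] {
                continue;
            }
            let mut cand = a.clone();
            cand.push(b[k - 1]);
            // Prune (Apriori principle): every k-subset of the candidate must be frequent.
            // A linear scan keeps the sketch short; a HashSet lookup would be faster.
            let all_frequent = (0..cand.len()).all(|skip| {
                let sub: Vec<usize> = cand
                    .iter()
                    .enumerate()
                    .filter(|(j, _)| *j != skip)
                    .map(|(_, x)| *x)
                    .collect();
                frequent.contains(&sub)
            });
            if all_frequent {
                out.push(cand);
            }
        }
    }
    out
}

/// Level-wise search: returns every frequent itemset together with its support.
fn mine_frequent_itemsets(transactions: &[Vec<usize>], min_support: f32) -> Vec<(Vec<usize>, f32)> {
    let mut result = Vec::new();
    if transactions.is_empty() {
        return result;
    }
    let n = transactions.len() as f32;
    // Step 1: candidate 1-itemsets are the distinct items, in ascending order.
    let mut candidates: Vec<Vec<usize>> = transactions
        .iter()
        .flatten()
        .copied()
        .collect::<BTreeSet<usize>>()
        .into_iter()
        .map(|item| vec![item])
        .collect();
    while !candidates.is_empty() {
        let mut frequent = Vec::new();
        for c in candidates {
            // Count the transactions that contain every item of the candidate.
            let count = transactions
                .iter()
                .filter(|t| c.iter().all(|i| t.contains(i)))
                .count();
            let s = count as f32 / n;
            if s >= min_support {
                result.push((c.clone(), s));
                frequent.push(c);
            }
        }
        // Steps 2-3: generate and prune the next level.
        candidates = generate_candidates(&frequent);
    }
    result
}
```

The pruning check is what separates Apriori from brute-force enumeration: a candidate is only counted against the data if all of its immediate subsets already passed the support threshold.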

Implementation

API Design:

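/// Apriori miner; results are populated by `fit`.
pub struct Apriori {
    min_support: f32,     // minimum fraction of transactions containing an itemset
    min_confidence: f32,  // minimum confidence for a rule to be kept
    frequent_itemsets: Option<Vec<ItemSet>>,
    rules: Option<Vec<AssociationRule>>,
}

pub struct ItemSet {
    items: Vec<usize>,
    support: f32,
}

pub struct AssociationRule {
    antecedent: Vec<usize>,  // If
    consequent: Vec<usize>,  // Then
    support: f32,            // support(antecedent ∪ consequent)
    confidence: f32,         // support / support(antecedent)
    lift: f32,               // confidence / support(consequent)
}

// Bodies are placeholders; only the signatures are proposed here.
impl Apriori {
    pub fn fit(&mut self, transactions: &[Vec<usize>]) -> Result<(), &'static str> { todo!() }
    pub fn frequent_itemsets(&self) -> &[ItemSet] { todo!() }
    pub fn association_rules(&self) -> &[AssociationRule] { todo!() }
}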
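
As an illustration of how the `AssociationRule` fields might be filled in (steps 4-5 of the algorithm), here is a hedged sketch of rule generation. The function `generate_rules` and the `(Vec<usize>, f32)` input shape are assumptions made for this example; the struct below mirrors the proposed one with public fields so the snippet is self-contained.

```rust
use std::collections::HashMap;

/// Mirror of the proposed struct, with public fields for the sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct AssociationRule {
    pub antecedent: Vec<usize>,
    pub consequent: Vec<usize>,
    pub support: f32,
    pub confidence: f32,
    pub lift: f32,
}

/// Builds rules from frequent itemsets given as (sorted items, support) pairs.
/// Assumes every subset of a frequent itemset is also present in `frequent`
/// (which Apriori guarantees) and that itemsets have fewer than 32 items.
pub fn generate_rules(frequent: &[(Vec<usize>, f32)], min_confidence: f32) -> Vec<AssociationRule> {
    let support_of: HashMap<Vec<usize>, f32> = frequent.iter().cloned().collect();
    let mut rules = Vec::new();
    for (items, s) in frequent {
        let support = *s;
        let k = items.len();
        if k < 2 {
            continue; // a rule needs a non-empty antecedent and consequent
        }
        // Every bitmask except 0 and all-ones selects a non-empty proper subset
        // of the itemset as the antecedent; the remaining items form the consequent.
        for mask in 1u32..(1u32 << k) - 1 {
            let mut antecedent = Vec::new();
            let mut consequent = Vec::new();
            for (bit, &item) in items.iter().enumerate() {
                if (mask >> bit) & 1 == 1 {
                    antecedent.push(item);
                } else {
                    consequent.push(item);
                }
            }
            if let (Some(&s_a), Some(&s_c)) =
                (support_of.get(&antecedent), support_of.get(&consequent))
            {
                let confidence = support / s_a;
                // Step 5: keep only rules that meet min_confidence.
                if confidence >= min_confidence {
                    rules.push(AssociationRule {
                        antecedent,
                        consequent,
                        support,
                        confidence,
                        lift: confidence / s_c,
                    });
                }
            }
        }
    }
    rules
}
```

Enumerating all splits is exponential in the itemset size, which is usually acceptable because frequent itemsets tend to be small; a production version might prune consequents level by level instead.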

Success Criteria

  • ✅ Apriori with frequent itemset mining
  • ✅ Association rule generation
  • ✅ Support, confidence, lift metrics
  • ✅ 10+ tests (including retail dataset)
  • ✅ Zero clippy warnings
  • ✅ Example: examples/market_basket.rs

Estimated Effort

Timeline: 3-4 days
Complexity: Medium (combinatorial enumeration, pruning)

Labels: enhancement (New feature or request)