Add Graph and Statistics Examples (cargo examples + book chapters) #10

@noahgift

Problem Statement

The newly implemented graph and statistics modules (Issue #9) lack user-facing examples and documentation:

Missing Components:

  • ❌ No examples/ directory demos for graph algorithms
  • ❌ No examples/ directory demos for descriptive statistics
  • ❌ No book chapters explaining graph theory
  • ❌ No book chapters explaining descriptive statistics
  • ❌ No real-world use cases documented

Impact: Users have no way to discover or learn the graph and statistics features added in v0.3.1.

Proposed Solution

Add comprehensive examples and documentation following the established pattern from other modules.

Cargo Examples (examples/ directory)

  1. examples/graph_social_network.rs - Social network analysis

    • Build friend network graph
    • Calculate degree centrality (most connected people)
    • Run PageRank (influence scores)
    • Compute betweenness centrality (bridges between communities)
    • Visualize results with clear output
  2. examples/descriptive_statistics.rs - Data analysis workflow

    • Load sample dataset (e.g., test scores, sensor readings)
    • Compute quantiles, percentiles, five-number summary
    • Generate histograms with different binning methods
    • Compare FreedmanDiaconis vs Sturges vs Scott rules
    • Interpret results for outlier detection
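The statistics workflow above could be prototyped along these lines. This is a minimal sketch using only the standard library: `quantile_r7`, `five_number_summary`, `sturges_bins`, and `bins_from_width` are hypothetical helper names chosen for illustration, not the crate's actual API, and the scores are made-up data with one deliberately low value.

```rust
// Hypothetical helpers for illustration; not the library's API.

/// R-7 quantile (linear interpolation) of an already sorted, non-empty slice.
fn quantile_r7(sorted: &[f64], p: f64) -> f64 {
    assert!(!sorted.is_empty() && (0.0..=1.0).contains(&p));
    let h = (sorted.len() - 1) as f64 * p;
    let lo = h.floor() as usize;
    let hi = h.ceil() as usize;
    sorted[lo] + (h - lo as f64) * (sorted[hi] - sorted[lo])
}

/// (min, Q1, median, Q3, max) of a non-empty slice without NaN values.
fn five_number_summary(data: &[f64]) -> (f64, f64, f64, f64, f64) {
    let mut sorted = data.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    (
        sorted[0],
        quantile_r7(&sorted, 0.25),
        quantile_r7(&sorted, 0.5),
        quantile_r7(&sorted, 0.75),
        sorted[sorted.len() - 1],
    )
}

/// Sample standard deviation (n - 1 denominator).
fn sample_std(data: &[f64]) -> f64 {
    let n = data.len() as f64;
    let mean = data.iter().sum::<f64>() / n;
    let var = data.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    var.sqrt()
}

/// Sturges rule: k = ceil(log2(n)) + 1.
fn sturges_bins(n: usize) -> usize {
    (n as f64).log2().ceil() as usize + 1
}

/// Number of bins needed to cover [min, max] with a given bin width.
fn bins_from_width(min: f64, max: f64, width: f64) -> usize {
    (((max - min) / width).ceil() as usize).max(1)
}

fn main() {
    // Made-up test scores; 12.0 is a deliberately low value.
    let scores = [
        12.0, 55.0, 61.0, 64.0, 68.0, 70.0, 72.0, 75.0, 78.0, 81.0, 85.0, 90.0,
    ];
    let n = scores.len() as f64;
    let (min, q1, median, q3, max) = five_number_summary(&scores);
    let iqr = q3 - q1;
    println!("five-number summary: {min} {q1} {median} {q3} {max}");

    // Scott: width = 3.49 * s * n^(-1/3); Freedman-Diaconis: width = 2 * IQR * n^(-1/3).
    let scott = bins_from_width(min, max, 3.49 * sample_std(&scores) * n.powf(-1.0 / 3.0));
    let fd = bins_from_width(min, max, 2.0 * iqr * n.powf(-1.0 / 3.0));
    println!("bins: Sturges={} Scott={} Freedman-Diaconis={}", sturges_bins(scores.len()), scott, fd);

    // Tukey fences: values beyond 1.5 * IQR from the quartiles are flagged.
    let (lower, upper) = (q1 - 1.5 * iqr, q3 + 1.5 * iqr);
    let outliers: Vec<&f64> = scores.iter().filter(|&&x| x < lower || x > upper).collect();
    println!("outliers outside [{lower}, {upper}]: {outliers:?}");
}
```

The Freedman-Diaconis rule depends on the IQR rather than the standard deviation, so a single extreme value moves it less than it moves the Scott rule. That is the kind of contrast the example is meant to surface.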

Book Chapters

  1. book/src/ml-fundamentals/graph-algorithms.md - Graph theory chapter

    • CSR representation benefits
    • Degree centrality (Freeman normalization)
    • PageRank theory (power iteration, Kahan summation)
    • Betweenness centrality (Brandes algorithm)
    • Real-world applications (social networks, web crawling, supply chains)
    • Code examples with explanations
  2. book/src/ml-fundamentals/descriptive-statistics.md - Statistics theory chapter

    • Quantile methods (R-7 interpolation)
    • Five-number summary and IQR
    • Histogram binning strategies
    • QuickSelect optimization (O(n) vs O(n log n))
    • Use cases (EDA, outlier detection, data profiling)
    • Code examples with explanations
  3. book/src/examples/graph-social-network.md - Case study

    • Full walkthrough of social network example
    • Interpretation of centrality scores
    • Performance notes (CSR benefits, parallel betweenness)
  4. book/src/examples/descriptive-statistics.md - Case study

    • Full walkthrough of statistics example
    • Histogram method selection guidance
    • When to use each binning strategy
  5. Update book/src/SUMMARY.md - Add new chapters to TOC

    • Add to "Machine Learning Fundamentals" section
    • Add to "Real-World Examples" section
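For the PageRank section of the graph chapter, a code block might look like the sketch below. It assumes a plain adjacency-list input rather than the CSR layout the module uses, and the `pagerank` and `kahan_sum` names and signatures are hypothetical. Kahan summation is applied only to the two reductions (dangling mass and convergence delta).

```rust
// Illustrative sketch only; the function names and adjacency-list input
// are assumptions, not the crate's API.

/// Compensated (Kahan) summation: tracks the rounding error of each addition.
fn kahan_sum(values: impl Iterator<Item = f64>) -> f64 {
    let mut sum = 0.0;
    let mut c = 0.0; // running compensation for lost low-order bits
    for v in values {
        let y = v - c;
        let t = sum + y;
        c = (t - sum) - y;
        sum = t;
    }
    sum
}

/// PageRank by power iteration. `out_edges[u]` lists the targets of node `u`.
fn pagerank(out_edges: &[Vec<usize>], damping: f64, max_iter: usize, tol: f64) -> Vec<f64> {
    let n = out_edges.len();
    if n == 0 {
        return Vec::new();
    }
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..max_iter {
        // Rank held by dangling nodes (no out-edges) is spread uniformly.
        let dangling = kahan_sum(
            (0..n)
                .filter(|&u| out_edges[u].is_empty())
                .map(|u| rank[u]),
        );
        let base = (1.0 - damping) / n as f64 + damping * dangling / n as f64;
        let mut next = vec![base; n];
        for (u, targets) in out_edges.iter().enumerate() {
            if targets.is_empty() {
                continue;
            }
            // Each node splits its damped rank evenly among its targets.
            let share = damping * rank[u] / targets.len() as f64;
            for &v in targets {
                next[v] += share;
            }
        }
        // L1 distance between successive iterates as the stopping criterion.
        let delta = kahan_sum(next.iter().zip(&rank).map(|(a, b)| (a - b).abs()));
        rank = next;
        if delta < tol {
            break;
        }
    }
    rank
}

fn main() {
    // Edges: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0, 3 -> 2
    let graph = vec![vec![1, 2], vec![2], vec![0], vec![2]];
    let ranks = pagerank(&graph, 0.85, 100, 1e-10);
    for (node, r) in ranks.iter().enumerate() {
        println!("node {node}: {r:.4}");
    }
}
```

In this toy graph node 2 receives links from every other node and node 3 receives none, so the scores should rank node 2 highest and node 3 lowest.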

Technical Requirements

Cargo Examples:

  • Runnable with cargo run --example <name>
  • Clear console output with labeled sections
  • Real-world datasets (embedded or generated)
  • Comments explaining key steps
  • Performance notes where relevant
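A skeleton that meets these requirements might look like the following sketch of `examples/graph_social_network.rs`. It uses only the standard library so it stands alone: the `degree_centrality` helper is a hypothetical stand-in for the crate's graph API, the friend network is invented, and the graph is assumed to be simple and undirected (no self-loops or duplicate edges).

```rust
//! Sketch of `examples/graph_social_network.rs`.
//! Run with: cargo run --example graph_social_network
//!
//! The helper below is a stand-in written for illustration; a real example
//! would call the crate's graph module instead.

use std::collections::BTreeMap;

/// Freeman-normalized degree centrality: degree / (n - 1).
/// Assumes a simple undirected graph given as index pairs into `names`.
fn degree_centrality(names: &[&str], edges: &[(usize, usize)]) -> BTreeMap<String, f64> {
    let n = names.len();
    let mut degree = vec![0usize; n];
    for &(a, b) in edges {
        degree[a] += 1;
        degree[b] += 1;
    }
    // Guard against division by zero for graphs with fewer than two nodes.
    let denom = n.saturating_sub(1).max(1) as f64;
    names
        .iter()
        .zip(degree)
        .map(|(name, d)| (name.to_string(), d as f64 / denom))
        .collect()
}

fn main() {
    // Embedded, made-up dataset: five people and their friendships.
    let people = ["Alice", "Bob", "Carol", "Dave", "Eve"];
    let friendships = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)];

    println!("=== Friend network ===");
    println!("{} people, {} friendships", people.len(), friendships.len());

    println!("\n=== Degree centrality (most connected people) ===");
    let scores = degree_centrality(&people, &friendships);
    let mut ranked: Vec<_> = scores.iter().collect();
    // Highest score first; the stable sort keeps alphabetical order among ties.
    ranked.sort_by(|a, b| b.1.total_cmp(a.1));
    for (name, score) in ranked {
        println!("{name:<6} {score:.2}");
    }
}
```

Each labeled section prints one result, so PageRank and betweenness centrality can be added as further sections without changing the layout.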

Book Chapters:

  • Theory sections with mathematical notation
  • Code blocks must be tested (mdbook test)
  • Real-world applications highlighted
  • Links to API documentation
  • References to peer-reviewed papers (from spec)

Success Criteria

  • ✅ 2 runnable cargo examples (graph_social_network, descriptive_statistics)
  • ✅ 4 new book chapters (2 theory + 2 case studies)
  • ✅ SUMMARY.md updated with new chapters
  • ✅ All code blocks in book pass mdbook test
  • ✅ Examples tested in CI
  • ✅ Zero clippy warnings
  • ✅ Clear, educational content following EXTREME TDD book style

Benefits

User Experience:

  • Discoverability of graph/stats features
  • Learn-by-example approach
  • Clear real-world use cases

Completeness:

  • Graph/stats modules at parity with other modules (all have examples + book chapters)
  • Professional documentation matching production code quality

Education:

  • Teach graph theory and statistics fundamentals
  • Demonstrate Toyota Way optimizations in practice

Estimated Effort

Timeline: 1-2 days

  • ~150 lines per cargo example (2 examples)
  • ~400-600 lines per theory chapter (2 chapters)
  • ~300-400 lines per case study (2 chapters)
  • Testing and validation

Complexity: Medium (requires clear explanations and real datasets)

References

  • Existing examples: examples/iris_clustering.rs, examples/boston_housing.rs
  • Existing book chapters: book/src/examples/kmeans-clustering.md, book/src/ml-fundamentals/linear-regression.md
  • Implementation: src/graph/mod.rs, src/stats/mod.rs
  • Specification: docs/specifications/graph-traditional-descriptive-statistics-spec.md v1.1.0

Acceptance Criteria

  • examples/graph_social_network.rs implemented and tested
  • examples/descriptive_statistics.rs implemented and tested
  • book/src/ml-fundamentals/graph-algorithms.md written
  • book/src/ml-fundamentals/descriptive-statistics.md written
  • book/src/examples/graph-social-network.md written
  • book/src/examples/descriptive-statistics.md written
  • book/src/SUMMARY.md updated
  • cargo run --example graph_social_network works
  • cargo run --example descriptive_statistics works
  • mdbook test passes
  • All examples in CI
  • Zero clippy warnings

Labels: documentation, enhancement
