Overview
Implement Phase 1 of the collaborative filtering recommendation system as specified in docs/specs/collab-filter-spec.md v1.1.
Scope
Components to Implement:
-
HNSW Index (src/index/hnsw.rs)
- Hierarchical Navigable Small World graph structure
- O(log n × d) approximate nearest neighbor search
- Multi-layer skip-list navigation
- Configurable M (max connections) and efConstruction parameters
-
Incremental IDF Tracker (src/text/incremental_idf.rs)
- Streaming IDF updates with exponential decay
- Avoids IDF drift in production (Toyota Way Jidoka requirement)
- Integration with existing TF-IDF vectorizer
-
Content-Based Recommender (src/recommend/content_based.rs)
- Item-to-item similarity using HNSW index
- Integration with incremental IDF for TF-IDF features
- <100ms query latency for 1M items (benchmark requirement)
Quality Requirements
- Test Coverage: ≥95% (property tests + unit tests)
- Property Tests:
- HNSW graph connectivity invariants
- Search quality (recall ≥95% vs brute force)
- IDF monotonicity (document frequency increases → IDF decreases)
- Benchmarks:
benches/recommend.rs
- Query time <100ms for 1M items
- Indexing time O(n log n × d)
- Example:
examples/recommend_content.rs
- Documentation: Book chapter
book/src/examples/content-recommender.md
Timeline
Estimate: 3-5 days (HIGH priority, Week 1)
References
- [11] Indyk & Motwani (1998) - Locality-Sensitive Hashing
- [12] Malkov & Yashunin (2018) - Efficient and robust approximate nearest neighbor search using HNSW
- [15] Sculley et al. (2015) - Hidden Technical Debt in Machine Learning Systems
Acceptance Criteria
Overview
Implement Phase 1 of the collaborative filtering recommendation system as specified in
docs/specs/collab-filter-spec.mdv1.1.Scope
Components to Implement:
HNSW Index (
src/index/hnsw.rs)Incremental IDF Tracker (
src/text/incremental_idf.rs)Content-Based Recommender (
src/recommend/content_based.rs)Quality Requirements
benches/recommend.rsexamples/recommend_content.rsbook/src/examples/content-recommender.mdTimeline
Estimate: 3-5 days (HIGH priority, Week 1)
References
Acceptance Criteria
make tier3)