Summary
Add Reciprocal Rank Fusion (RRF) as a rank aggregation option in memory/ranking.py for merging ranked lists from multiple retrieval rounds or sources.
Motivation
Two independent sources (NVIDIA NeMo Retriever, Agentic RAG/TDS) recommend RRF over linear weighted combination for fusing results from different retrieval strategies. Qdrant's own analysis explicitly rejects linear fusion (H = (1-a)*K + a*V) because "relevant and non-relevant items are mixed" in 2D score space. RRF is the de facto standard for hybrid search fusion.
Current ranking uses a linear combination (0.7*relevance + 0.3*recency). This works for single-signal ranking but is insufficient when:
- BM25/sparse search is added alongside vector search
- Multiple retrieval rounds produce separate ranked lists (tool-based injection)
Design
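~30-line algorithm: RRF_score(doc) = sum(1 / (k + rank_i)) across all ranking sources, where rank_i is the document's rank in source i (1-based in the standard formulation) and k is a smoothing constant (60 is the commonly used default). A source that does not return the document contributes nothing to the sum.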
Add as an alternative to the current compute_combined_score in memory/ranking.py. Use RRF when multiple ranked lists need merging; keep linear combination for single-source relevance+recency scoring.
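A minimal sketch of the fusion step described above, for discussion only. The function name rrf_fuse, its signature, the use of bare document IDs as list entries, and the default k=60 are assumptions for illustration; none of them exist in memory/ranking.py today.

```python
from collections import defaultdict
from collections.abc import Hashable, Iterable, Sequence


def rrf_fuse(
    ranked_lists: Iterable[Sequence[Hashable]],
    k: int = 60,  # assumed default; the issue does not fix a value for k
) -> list[tuple[Hashable, float]]:
    """Merge ranked lists of document IDs using Reciprocal Rank Fusion.

    Each list is ordered best-first. A document's fused score is the sum of
    1 / (k + rank) over every list that contains it, with 1-based ranks.
    """
    if k < 0:
        raise ValueError("k must be non-negative")

    scores: dict[Hashable, float] = defaultdict(float)
    for ranking in ranked_lists:
        seen: set[Hashable] = set()
        for rank, doc_id in enumerate(ranking, start=1):
            if doc_id in seen:
                # Count only the first (best) position of a repeated ID.
                continue
            seen.add(doc_id)
            scores[doc_id] += 1.0 / (k + rank)

    # Highest fused score first; ties keep first-seen order (stable sort).
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical inputs: one list from vector search, one from BM25.
vector_hits = ["a", "b", "c"]
bm25_hits = ["b", "c", "d"]
fused = rrf_fuse([vector_hits, bm25_hits])
```

With these inputs the fused order is b, c, a, d: documents that appear in both lists outrank documents that lead only one. Only ranks enter the sum, so the two sources do not need comparable score scales.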
Affected Files
src/ai_company/memory/ranking.py (add RRF function)
src/ai_company/memory/retrieval_config.py (config for fusion strategy)
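One possible shape for the fusion-strategy config, sketched with the standard library only. FusionStrategy, FusionConfig, and the rrf_k field are hypothetical names; the real retrieval_config.py may use a different config framework and different fields.

```python
from dataclasses import dataclass
from enum import Enum


class FusionStrategy(str, Enum):
    """Hypothetical selector for how ranked results are combined."""

    LINEAR = "linear"  # current behaviour: weighted relevance + recency
    RRF = "rrf"        # Reciprocal Rank Fusion across multiple ranked lists


@dataclass(frozen=True)
class FusionConfig:
    """Hypothetical config block; field names are assumptions."""

    strategy: FusionStrategy = FusionStrategy.LINEAR
    rrf_k: int = 60  # smoothing constant, only used when strategy is RRF

    def __post_init__(self) -> None:
        if self.rrf_k < 1:
            raise ValueError("rrf_k must be a positive integer")


# The default keeps today's linear scoring; RRF is opt-in.
default_config = FusionConfig()
rrf_config = FusionConfig(strategy=FusionStrategy("rrf"))
```

Defaulting to the linear strategy would leave existing single-source ranking unchanged unless RRF is explicitly selected.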
Research