feat(retrieve): add RetrievalObserver for retrieval quality metrics #622

Merged
MaojiaSheng merged 1 commit into volcengine:main from mvanhorn:osc/feat-retrieval-observer on Mar 15, 2026

@mvanhorn
Contributor

Summary

Add retrieval quality observability following the existing observer pattern. A new RetrievalObserver tracks query metrics (result counts, scores, latency, rerank usage) and reports health via the observer API.

Problem Statement

The observer infrastructure covers queue, VikingDB, VLM, and transaction subsystems, but the retrieval pipeline has no observability beyond debug logs. When retrieval quality degrades, users have no metrics to diagnose the problem. The HierarchicalRetriever logs individual queries at DEBUG level, but there are no aggregate stats for hit rates, score distributions, or latency.

Evidence

| Source | Evidence | Engagement |
| --- | --- | --- |
| #578 | Users report poor retrieval results, no way to measure quality | 1 thumbs up, 3 comments |
| Codebase | 5 observers exist (queue, vikingdb, vlm, transaction, base) but none for retrieval | - |
| RFC: Trace | Tracing RFC covers per-request traces, not aggregate quality metrics | - |

The observer pattern is well-established in the codebase. This PR follows it exactly.

Changes

New files:

  • openviking/retrieve/retrieval_stats.py (165 lines) - Thread-safe RetrievalStatsCollector singleton that accumulates per-query metrics
  • openviking/storage/observers/retrieval_observer.py (99 lines) - RetrievalObserver implementing BaseObserver
  • tests/unit/retrieve/test_retrieval_stats.py (175 lines) - 17 unit tests
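The collector is described as a thread-safe singleton. A minimal sketch of that pattern follows, assuming a lock-guarded counter pair and a module-level accessor. The names `StatsCollectorSketch`, `record_query`, `snapshot`, and `get_stats_collector_sketch` are hypothetical and not taken from `retrieval_stats.py`.

```python
import threading
from typing import Dict, Optional


class StatsCollectorSketch:
    """Hypothetical stand-in for RetrievalStatsCollector; fields are assumed."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._total_queries = 0
        self._zero_result_queries = 0

    def record_query(self, result_count: int) -> None:
        # Guard every mutation so concurrent retrieve() calls cannot lose updates.
        with self._lock:
            self._total_queries += 1
            if result_count == 0:
                self._zero_result_queries += 1

    def snapshot(self) -> Dict[str, int]:
        # Copy under the lock so readers see a consistent pair of counters.
        with self._lock:
            return {
                "total_queries": self._total_queries,
                "zero_result_queries": self._zero_result_queries,
            }


_collector: Optional[StatsCollectorSketch] = None
_collector_lock = threading.Lock()


def get_stats_collector_sketch() -> StatsCollectorSketch:
    """Return the process-wide collector, creating it on first use."""
    global _collector
    with _collector_lock:
        if _collector is None:
            _collector = StatsCollectorSketch()
        return _collector
```

Taking the lock in the accessor keeps first-use creation safe when several threads ask for the collector at the same moment.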

Modified files:

  • openviking/retrieve/hierarchical_retriever.py - Record stats after each retrieve() call (3 lines added)
  • openviking/storage/observers/__init__.py - Export RetrievalObserver
  • openviking/service/debug_service.py - Register retrieval in observer service + system health
  • openviking/server/routers/observer.py - Add GET /api/v1/observer/retrieval endpoint
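The instrumentation in `hierarchical_retriever.py` is described only as recording stats after each `retrieve()` call. One way such a hook could look is sketched below. The wrapper `retrieve_with_stats` and the `record(result_count, latency_ms)` callback signature are assumptions for illustration, not the PR's actual code.

```python
import time
from typing import Callable, List


def retrieve_with_stats(
    retrieve: Callable[[str], List[float]],
    record: Callable[[int, float], None],
    query: str,
) -> List[float]:
    """Run a retrieval call, then report its result count and latency.

    `retrieve` returns a list of result scores here; the real retriever
    returns richer objects. Both callables are hypothetical stand-ins.
    """
    start = time.perf_counter()
    results = retrieve(query)
    # perf_counter() is monotonic, so the difference is a safe elapsed time.
    latency_ms = (time.perf_counter() - start) * 1000.0
    record(len(results), latency_ms)
    return results
```

Recording after the call returns means a retrieval that raises is not counted in this sketch; whether the PR counts failed calls is not stated.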

Metrics tracked

| Metric | Description |
| --- | --- |
| Total Queries | Cumulative retrieval count |
| Zero-Result Rate | Fraction of queries returning no results |
| Avg/Min/Max Score | Score distribution across results |
| Queries by Type | Breakdown by context_type (memory, resource, skill) |
| Rerank Used/Fallback | How often reranking is applied vs falls back |
| Avg/Max Latency | End-to-end retrieval timing in ms |
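As a rough illustration of how these metrics can be derived from running totals, here is a sketch of an accumulator. The class `RetrievalStatsSketch` and all of its field names are assumptions; the real `retrieval_stats.py` may store and compute them differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RetrievalStatsSketch:
    """Hypothetical accumulator for the metrics listed above."""

    total_queries: int = 0
    zero_result_queries: int = 0
    total_results: int = 0
    score_sum: float = 0.0
    min_score: float = float("inf")
    max_score: float = float("-inf")
    latency_sum_ms: float = 0.0
    max_latency_ms: float = 0.0
    rerank_used: int = 0
    rerank_fallback: int = 0
    queries_by_type: Dict[str, int] = field(default_factory=dict)

    def record(
        self,
        context_type: str,
        scores: List[float],
        latency_ms: float,
        rerank_used: bool = False,
        rerank_fallback: bool = False,
    ) -> None:
        self.total_queries += 1
        self.queries_by_type[context_type] = (
            self.queries_by_type.get(context_type, 0) + 1
        )
        if not scores:
            self.zero_result_queries += 1
        for score in scores:
            self.total_results += 1
            self.score_sum += score
            self.min_score = min(self.min_score, score)
            self.max_score = max(self.max_score, score)
        self.latency_sum_ms += latency_ms
        self.max_latency_ms = max(self.max_latency_ms, latency_ms)
        self.rerank_used += int(rerank_used)
        self.rerank_fallback += int(rerank_fallback)

    @property
    def zero_result_rate(self) -> float:
        # Fraction of queries with no results; 0.0 before any query is seen.
        if self.total_queries == 0:
            return 0.0
        return self.zero_result_queries / self.total_queries

    @property
    def avg_score(self) -> float:
        # Averaged over individual results, not over queries.
        if self.total_results == 0:
            return 0.0
        return self.score_sum / self.total_results

    @property
    def avg_latency_ms(self) -> float:
        if self.total_queries == 0:
            return 0.0
        return self.latency_sum_ms / self.total_queries
```

Keeping sums and counts rather than full history keeps memory constant regardless of query volume, at the cost of not supporting percentiles.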

Health logic

  • Healthy: zero-result rate < 50%
  • Unhealthy: zero-result rate >= 50% (after at least 5 queries)
  • No errors flagged with fewer than 5 queries
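The health rule above can be written as a small predicate. This is a sketch of the stated thresholds only; the function name `is_healthy` and the constant names are hypothetical.

```python
MIN_QUERIES_FOR_HEALTH = 5      # below this, the sample is too small to judge
ZERO_RESULT_THRESHOLD = 0.5     # unhealthy at or above this zero-result rate


def is_healthy(total_queries: int, zero_result_queries: int) -> bool:
    """Return True unless at least half of a sufficient sample came back empty."""
    if total_queries < MIN_QUERIES_FOR_HEALTH:
        return True
    return zero_result_queries / total_queries < ZERO_RESULT_THRESHOLD
```

For example, 4 empty results out of 4 queries still reports healthy, while 5 out of 10 reports unhealthy because the rate equals the 50% threshold.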

Testing

All 17 unit tests pass:

======================== 17 passed, 1 warning in 0.03s =========================

The RetrievalObserver uses a lazy import of get_stats_collector() to avoid a circular dependency with the storage module. This follows the same pattern as tabulate imports in other observers.
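The lazy-import pattern mentioned here defers the import to the body of the method that needs it, so loading the observer module never triggers loading the other module. The sketch below uses the standard-library `json` module as a stand-in for the stats module; the class `LazyObserverSketch` and its method are hypothetical.

```python
class LazyObserverSketch:
    """Hypothetical observer that resolves its dependency on first use."""

    def get_status(self) -> str:
        # Function-level import: resolved when get_status() runs, not when
        # this module is imported. In the PR, the deferred name is
        # get_stats_collector; `json` is only a stand-in here.
        import json

        return json.dumps({"name": "retrieval", "healthy": True})
```

After the first call the module is cached in `sys.modules`, so repeated calls pay only a dictionary lookup.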

This contribution was developed with AI assistance (Claude Code).

Relates to #578

Add retrieval quality observability following the existing observer
pattern (QueueObserver, VLMObserver, VikingDBObserver).

- RetrievalStatsCollector: thread-safe singleton that accumulates
  per-query metrics (result counts, scores, latency, rerank usage)
- RetrievalObserver: reads accumulated stats, reports health based on
  zero-result rate, formats status table with tabulate
- Instrumented HierarchicalRetriever.retrieve() to record stats
- Added /api/v1/observer/retrieval endpoint
- Included in system-wide observer health check
- 17 unit tests covering stats, collector, and observer

This contribution was developed with AI assistance (Claude Code).
MaojiaSheng merged commit c44e614 into volcengine:main on Mar 15, 2026
5 checks passed
github-project-automation bot moved this from Backlog to Done in OpenViking project on Mar 15, 2026
@mvanhorn
Contributor Author

Thanks for reviewing and merging all of these, @MaojiaSheng - great working with you on OpenViking.
