feat(memory): log memory retrieval failures for OmniMem self-improvement loop #3597
Merged
…ent loop

Add `RetrievalFailureLogger` to `zeph-memory`: an async fire-and-forget subsystem that records no-hit turns, low-confidence recalls, timeouts, and errors into a new `memory_retrieval_failures` SQLite table (migration 083).

The write path uses a bounded mpsc channel (256 cap) with a background batch writer (16 records / 100 ms flush). `try_send` on the hot path adds zero latency to the retrieval critical path (INV-1 satisfied).

Integration points:
- `fetch_graph_facts_raw`: logs no-hit and all error paths (Bfs/AStar/WaterCircles/BeamSearch/Hybrid) before propagating errors (B1 fix)
- `fetch_semantic_recall_raw`: logs no-hit and low-confidence paths

Config via `[memory.retrieval_failures]`:
- `enabled` (default `false`): privacy-safe opt-in
- `low_confidence_threshold` (default `0.3`)
- `retention_days` (default `90`) with automatic cleanup every 500 flushes

`RetrievalFailureLogger` wraps `tx` and `handle` in `Option<_>` so `shutdown()` drains the channel cleanly and `Drop` aborts the background task if shutdown is skipped. The `flush_batch` span uses `.instrument()` to capture actual SQLite INSERT latency in traces.

This provides the minimum viable failure dataset required by the OmniMem self-improvement loop (arXiv:2604.01007) for future SYNAPSE parameter tuning.

Closes #3576
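The write path described in this commit message (bounded queue, non-blocking `try_send`, background batch writer, `Option`-wrapped sender and handle) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's code: it uses `std::sync::mpsc::sync_channel` and a plain thread instead of an async runtime, it collects batches in memory where the real writer would run SQLite INSERTs, and the names `FailureLogger`, `FailureRecord`, and `log` are hypothetical.

```rust
use std::sync::mpsc::{sync_channel, RecvTimeoutError, SyncSender};
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Hypothetical record type; the real table schema is not shown in the PR text.
#[derive(Debug, Clone, PartialEq)]
pub struct FailureRecord {
    pub kind: &'static str, // e.g. "no_hit", "low_confidence", "timeout", "error"
    pub query: String,
}

/// Hypothetical stand-in for the logger: sender and handle live in `Option`
/// so `shutdown` can take them and let the writer drain the queue.
pub struct FailureLogger {
    tx: Option<SyncSender<FailureRecord>>,
    handle: Option<JoinHandle<Vec<Vec<FailureRecord>>>>,
}

impl FailureLogger {
    pub fn new(capacity: usize, batch_size: usize, flush_every: Duration) -> Self {
        let (tx, rx) = sync_channel::<FailureRecord>(capacity);
        let handle = thread::spawn(move || {
            let mut flushed = Vec::new(); // stands in for batched INSERTs
            let mut batch = Vec::with_capacity(batch_size);
            let mut deadline = Instant::now() + flush_every;
            loop {
                let wait = deadline.saturating_duration_since(Instant::now());
                match rx.recv_timeout(wait) {
                    Ok(rec) => {
                        batch.push(rec);
                        if batch.len() < batch_size {
                            continue; // keep filling until size or deadline
                        }
                    }
                    Err(RecvTimeoutError::Timeout) => {}
                    Err(RecvTimeoutError::Disconnected) => {
                        // Sender dropped and queue drained: final flush.
                        if !batch.is_empty() {
                            flushed.push(batch);
                        }
                        return flushed;
                    }
                }
                if !batch.is_empty() {
                    let full = std::mem::replace(&mut batch, Vec::with_capacity(batch_size));
                    flushed.push(full);
                }
                deadline = Instant::now() + flush_every;
            }
        });
        Self { tx: Some(tx), handle: Some(handle) }
    }

    /// Never blocks: returns false if the queue is full or the logger is shut down.
    pub fn log(&self, rec: FailureRecord) -> bool {
        match &self.tx {
            Some(tx) => tx.try_send(rec).is_ok(),
            None => false,
        }
    }

    /// Drops the sender so the writer drains what is queued, then joins it.
    pub fn shutdown(&mut self) -> Vec<Vec<FailureRecord>> {
        self.tx.take();
        self.handle
            .take()
            .map(|h| h.join().unwrap_or_default())
            .unwrap_or_default()
    }
}
```

With the numbers from the commit message, a caller would construct this as `FailureLogger::new(256, 16, Duration::from_millis(100))`. One difference worth noting: a std thread cannot be aborted, so in this sketch dropping the logger without `shutdown()` just drops the sender and lets the thread exit after draining, whereas the PR states that `Drop` aborts the background task.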
Summary
- Add `RetrievalFailureLogger`: async fire-and-forget subsystem that records memory retrieval failures (no-hit, low-confidence, timeout, error) into a new `memory_retrieval_failures` SQLite table (migration 083)
- Integration points: `fetch_graph_facts_raw` and `fetch_semantic_recall_raw` in `zeph-agent-context`

Design
- `try_send` on hot path: zero latency impact
- Failure types: `no_hit`, `low_confidence`, `timeout`, `error`
- Config (`[memory.retrieval_failures]`): `enabled = false` by default (privacy-safe opt-in), `low_confidence_threshold = 0.3`, `retention_days = 90`
- Automatic cleanup by `retention_days` every 500 flushes
- `tx`/`handle` wrapped in `Option<_>`; `shutdown()` drains cleanly; `Drop` aborts background task

Test plan
- Unit tests: `no_hit_failure_is_persisted`, `low_confidence_failure_is_persisted`, `log_does_not_block_when_channel_is_full`, `query_text_truncated_to_512_chars`, `logger_disabled_when_option_is_none`, `multiple_records_batch_flushed`
- Set `[memory.retrieval_failures] enabled = true` and verify >100 rows after standard test session (see playbook `.local/testing/playbooks/memory-retrieval-failures.md`)

References
- `/specs/004-memory/004-7-memory-apex-magma.md` §14.2
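
Two behaviours named in this description, the `low_confidence_threshold` check and the 512-character query truncation implied by the test name `query_text_truncated_to_512_chars`, might look roughly like the sketch below. The function names, the `FailureKind` enum, the reading of "low confidence" as "best score below the threshold", and the choice to count Unicode characters instead of bytes are all assumptions made for illustration; the PR text does not specify them.

```rust
/// Hypothetical subset of the failure types listed in the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailureKind {
    NoHit,
    LowConfidence,
}

/// Assumed rule: no results is a no-hit; otherwise the recall is low-confidence
/// when the best score falls below the threshold. Returns None when the
/// retrieval would not be logged as a failure.
pub fn classify(scores: &[f32], low_confidence_threshold: f32) -> Option<FailureKind> {
    if scores.is_empty() {
        return Some(FailureKind::NoHit);
    }
    let best = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    if best < low_confidence_threshold {
        Some(FailureKind::LowConfidence)
    } else {
        None
    }
}

/// Truncates to at most `max_chars` characters without splitting a
/// multi-byte UTF-8 sequence (assumes the limit counts characters, not bytes).
pub fn truncate_query(text: &str, max_chars: usize) -> &str {
    match text.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &text[..byte_idx],
        None => text,
    }
}
```

With the default threshold of `0.3`, `classify(&[0.1, 0.25], 0.3)` would be treated as a low-confidence recall under this assumed rule, while any result set containing a score of `0.3` or higher would not be logged.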