feat(memory): SQLite metadata, BM25 fallback, context-manager injection#254
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Adds Phase 2 capabilities to the CAO memory system by mirroring memory metadata into SQLite, introducing BM25-based content search (with graceful fallback), and enabling curated memory injection via a memory_manager sidecar terminal (plus HTTP request timeouts).
Changes:
- Introduces
MemoryMetadataModeland DB migrations/indexes to mirror wiki memory metadata into SQLite. - Adds
search_mode(metadata/bm25/hybrid) to memory recall and integrates BM25 fallback search. - Adds curated first-message memory injection via a
memory_managersidecar launched with--memory/memory_manager=true, and moves sidecar spawn off the request path via FastAPI background tasks.
Reviewed changes
Copilot reviewed 15 out of 17 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
uv.lock |
Locks rank-bm25 (and transitive deps) for BM25 search support. |
pyproject.toml |
Adds rank-bm25 as a dependency. |
.gitignore |
Ignores aidlc-docs output directory. |
src/cli_agent_orchestrator/constants.py |
Adds MCP_REQUEST_TIMEOUT constant for HTTP calls. |
src/cli_agent_orchestrator/clients/database.py |
Adds MemoryMetadataModel + migration function to create metadata indexes. |
src/cli_agent_orchestrator/services/memory_service.py |
Mirrors store/forget metadata to SQLite; adds BM25 + search_mode; adds context-manager curated injection path. |
src/cli_agent_orchestrator/services/terminal_service.py |
Makes first-message injection thread-safe and switches injection to curated context. |
src/cli_agent_orchestrator/mcp_server/server.py |
Adds request timeouts and passes search_mode through memory_recall. |
src/cli_agent_orchestrator/cli/commands/launch.py |
Adds --memory flag and applies request timeouts. |
src/cli_agent_orchestrator/api/main.py |
Spawns memory_manager sidecar via BackgroundTasks when requested. |
src/cli_agent_orchestrator/agent_store/memory_manager.md |
New agent profile for curated memory injection. |
test/services/test_memory_service_phase2.py |
Phase 2 tests covering SQLite mirroring, hybrid recall, and curated workflows. |
test/services/test_bm25_search.py |
Tests BM25 search modes, validation, and graceful fallback. |
test/services/test_context_manager_agent.py |
Tests agent profile presence, launch flag wiring, and curated/fallback injection behavior. |
test/providers/test_memory_injection.py |
Updates injection tests to use curated injection method. |
test/mcp_server/test_load_skill.py |
Updates skill-load test for request timeout. |
test/mcp_server/test_assign.py |
Updates assign tests to include MCP_REQUEST_TIMEOUT. |
Comments suppressed due to low confidence (1)
src/cli_agent_orchestrator/services/memory_service.py:635
- Phase 2 introduces a SQLite metadata mirror, but _metadata_recall() still walks index.md and reads each wiki file for filtering. This defeats the stated goal of fast filtered lookup via SQLite and also contradicts the database model docstring that SQLite is the source of truth for metadata queries. Consider querying MemoryMetadataModel first (scope/scope_id/memory_type/key/tags/updated_at) to select candidate file_paths and only then loading wiki bodies as needed for content matching.
async def _metadata_recall(
self,
query: Optional[str] = None,
scope: Optional[str] = None,
memory_type: Optional[str] = None,
limit: int = 10,
terminal_context: Optional[dict] = None,
scan_all: bool = False,
) -> list[Memory]:
"""Substring-match recall via index.md walk (Phase 1 path)."""
results: list[Memory] = []
# Determine which project dirs to search
search_dirs = self._get_search_dirs(scope, terminal_context, scan_all=scan_all)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
63
to
71
| @@ -68,14 +70,14 @@ def inject_memory_context(first_message: str, terminal_id: str) -> str: | |||
| a formatted <cao-memory>...</cao-memory> block (or empty string if | |||
| no memories exist). Stateless — no file mutation, no backup/restore. | |||
Comment on lines
+1149
to
+1154
| # Poll up to ~15s for the curator to finish responding. Without | ||
| # the sleep this loop spins in microseconds and we always read | ||
| # stale output. | ||
| for _ in range(30): | ||
| if provider.get_status() in ( | ||
| TerminalStatus.COMPLETED, |
Comment on lines
+316
to
+317
| def test_direct_db_insert_reflected_in_index(self, svc, db_engine, tmp_path): | ||
| """Insert directly to DB, then store via service — index should be consistent.""" |
Comment on lines
+393
to
+397
| start = time.time() | ||
| results = _run(svc.recall(query="banana50", scope="global", terminal_context=ctx)) | ||
| elapsed = time.time() - start | ||
|
|
||
| assert elapsed < 0.5, f"BM25 search took {elapsed:.3f}s, exceeds 500ms budget" |
…on (Phase 2) Adds U1 (SQLite metadata mirror with symmetric upsert), U6 (BM25 ranked search with token-set match gate), and U9 (memory_manager sidecar for curated injection, falls back to Phase 1 on any failure). Sidecar spawn runs as a BackgroundTask so /sessions returns immediately.
7f9bf20 to
e7bbffe
Compare
accept the suggestion Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Contributor
Author
|
no |
…lite # Conflicts: # src/cli_agent_orchestrator/api/main.py # src/cli_agent_orchestrator/cli/commands/launch.py
4 tasks
call-me-ram
added a commit
to call-me-ram/cli-agent-orchestrator
that referenced
this pull request
Jun 3, 2026
Bring the event-driven architecture branch up to date with main (98 commits) and reconcile the rewrite with features that landed after it forked: eager inbox delivery (awslabs#251), the OpenCode poller, env-var forwarding (awslabs#259), memory curation (awslabs#254/awslabs#262), CORS auto-derive (awslabs#261), DNS host validation (awslabs#124), and the self-send guard (awslabs#24). Highlights: - Providers adopt the async initialize() + get_status(buffer) contract; copilot_cli/opencode_cli converted; kiro keeps colour-only ANSI stripping so carriage-return-redraw permission prompts aren't misread as idle. - Event-driven InboxService.deliver_pending with the awslabs#251 eager gate and message-sender attribution; OpenCode poller retained as a status-driven method; the watchdog (PollingObserver/LogFileHandler) is removed. - terminal_service.create_terminal is async (FIFO + StatusMonitor wiring); session_service.create_session, flow_service.execute_flow, the API endpoints, and `cao flow run` updated to await. - memory_service curated path and the flow CLI fixed to the new contract. Full unit suite green (1908 passed); black + isort clean.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR #2 in the 8-PR memory-system split. Builds on Phase 1 (PR #245) to add the headline trio:
MemoryMetadataModelwithUniqueConstraint(key, scope, scope_id)+ indexes on(scope, scope_id),updated_at,memory_type. Symmetric upsert (every field set on insert is also updated). Wiki files remain the source of truth; SQLite mirrors metadata for fast filtered lookup. Mirror failures log a warning but never block stores.search_mode: "metadata" | "bm25" | "hybrid", strict validation (raisesValueError). Graceful import fallback whenrank_bm25is unavailable. Token-set intersection gates ranking (BM25 IDF can go negative on tiny corpora — score>0 alone isn't safe).memory_manageragent profile +--memorylaunch flag spawns a sidecar terminal in the same tmux session.get_curated_memory_contextis session-scoped, per-curator-locked, polls withtime.sleep(0.5), and falls back to deterministic Phase 1 on any failure (no manager, busy, timeout, parse error). Sidecar creation runs as a FastAPIBackgroundTaskso/sessionsreturns immediately.Copilot pre-flight
Three cleanup rounds preemptively addressed bug classes Copilot has flagged on prior memory PRs:
requests.*timeouts (MCP_REQUEST_TIMEOUT=30) on every call site inlaunch.py+mcp_server/server.py_memory_injected_terminalsrace fixed withthreading.Lock_curator_locksregistry cleaned up indelete_terminaldatetime.now(timezone.utc), noutcnow(){"success": ...}consistent acrossmemory_store/recall/forgetsearch_modesys.modules['rank_bm25']only)Test plan
uv run pytest test/ --ignore=test/e2e --ignore=test/providers/test_q_cli_integration.py --ignore=test/providers/test_kiro_cli_integration.py— 1759 passed, 1 skippeduv run black --check src/ test/ && uv run isort --check-only src/ test/— cleanrequests.*calls inlaunch.py+mcp_server/server.pycarrytimeout=🤖 Generated with Claude Code