Skip to content

feat(memory): SQLite metadata, BM25 fallback, context-manager injection#254

Merged
haofeif merged 3 commits into
mainfrom
feat/memory-phase2-sqlite
May 28, 2026
Merged

feat(memory): SQLite metadata, BM25 fallback, context-manager injection#254
haofeif merged 3 commits into
mainfrom
feat/memory-phase2-sqlite

Conversation

@fanhongy

Copy link
Copy Markdown
Contributor

Summary

PR #2 in the 8-PR memory-system split. Builds on Phase 1 (PR #245) to add the headline trio:

  • U1 — SQLite metadata mirror. MemoryMetadataModel with UniqueConstraint(key, scope, scope_id) + indexes on (scope, scope_id), updated_at, memory_type. Symmetric upsert (every field set on insert is also updated). Wiki files remain the source of truth; SQLite mirrors metadata for fast filtered lookup. Mirror failures log a warning but never block stores.
  • U6 — BM25 fallback search. search_mode: "metadata" | "bm25" | "hybrid", strict validation (raises ValueError). Graceful import fallback when rank_bm25 is unavailable. Token-set intersection gates ranking (BM25 IDF can go negative on tiny corpora — score>0 alone isn't safe).
  • U9 — Curated injection. memory_manager agent profile + --memory launch flag spawns a sidecar terminal in the same tmux session. get_curated_memory_context is session-scoped, per-curator-locked, polls with time.sleep(0.5), and falls back to deterministic Phase 1 on any failure (no manager, busy, timeout, parse error). Sidecar creation runs as a FastAPI BackgroundTask so /sessions returns immediately.

Copilot pre-flight

Three cleanup rounds preemptively addressed bug classes Copilot has flagged on prior memory PRs:

  • requests.* timeouts (MCP_REQUEST_TIMEOUT=30) on every call site in launch.py + mcp_server/server.py
  • _memory_injected_terminals race fixed with threading.Lock
  • _curator_locks registry cleaned up in delete_terminal
  • Sidecar spawn moved off the request path
  • tz-aware datetime.now(timezone.utc), no utcnow()
  • MCP envelope {"success": ...} consistent across memory_store/recall/forget
  • No silent fallback on invalid search_mode
  • No mocking of the SUT in tests (mocks sys.modules['rank_bm25'] only)

Test plan

  • uv run pytest test/ --ignore=test/e2e --ignore=test/providers/test_q_cli_integration.py --ignore=test/providers/test_kiro_cli_integration.py — 1759 passed, 1 skipped
  • uv run black --check src/ test/ && uv run isort --check-only src/ test/ — clean
  • Verified all requests.* calls in launch.py + mcp_server/server.py carry timeout=
  • Verified BM25 path returns results for matching queries (token-set gate)
  • Verified curator falls back cleanly when memory_manager terminal is absent / busy / produces no envelope

🤖 Generated with Claude Code

@fanhongy fanhongy changed the title feat(memory): SQLite metadata, BM25 fallback, context-manager injection (Phase 2) feat(memory): SQLite metadata, BM25 fallback, context-manager injection May 22, 2026
@haofeif haofeif requested a review from Copilot May 22, 2026 12:18
@haofeif haofeif added the enhancement New feature or request label May 22, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Phase 2 capabilities to the CAO memory system by mirroring memory metadata into SQLite, introducing BM25-based content search (with graceful fallback), and enabling curated memory injection via a memory_manager sidecar terminal (plus HTTP request timeouts).

Changes:

  • Introduces MemoryMetadataModel and DB migrations/indexes to mirror wiki memory metadata into SQLite.
  • Adds search_mode (metadata/bm25/hybrid) to memory recall and integrates BM25 fallback search.
  • Adds curated first-message memory injection via a memory_manager sidecar launched with --memory / memory_manager=true, and moves sidecar spawn off the request path via FastAPI background tasks.

Reviewed changes

Copilot reviewed 15 out of 17 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
uv.lock Locks rank-bm25 (and transitive deps) for BM25 search support.
pyproject.toml Adds rank-bm25 as a dependency.
.gitignore Ignores aidlc-docs output directory.
src/cli_agent_orchestrator/constants.py Adds MCP_REQUEST_TIMEOUT constant for HTTP calls.
src/cli_agent_orchestrator/clients/database.py Adds MemoryMetadataModel + migration function to create metadata indexes.
src/cli_agent_orchestrator/services/memory_service.py Mirrors store/forget metadata to SQLite; adds BM25 + search_mode; adds context-manager curated injection path.
src/cli_agent_orchestrator/services/terminal_service.py Makes first-message injection thread-safe and switches injection to curated context.
src/cli_agent_orchestrator/mcp_server/server.py Adds request timeouts and passes search_mode through memory_recall.
src/cli_agent_orchestrator/cli/commands/launch.py Adds --memory flag and applies request timeouts.
src/cli_agent_orchestrator/api/main.py Spawns memory_manager sidecar via BackgroundTasks when requested.
src/cli_agent_orchestrator/agent_store/memory_manager.md New agent profile for curated memory injection.
test/services/test_memory_service_phase2.py Phase 2 tests covering SQLite mirroring, hybrid recall, and curated workflows.
test/services/test_bm25_search.py Tests BM25 search modes, validation, and graceful fallback.
test/services/test_context_manager_agent.py Tests agent profile presence, launch flag wiring, and curated/fallback injection behavior.
test/providers/test_memory_injection.py Updates injection tests to use curated injection method.
test/mcp_server/test_load_skill.py Updates skill-load test for request timeout.
test/mcp_server/test_assign.py Updates assign tests to include MCP_REQUEST_TIMEOUT.
Comments suppressed due to low confidence (1)

src/cli_agent_orchestrator/services/memory_service.py:635

  • Phase 2 introduces a SQLite metadata mirror, but _metadata_recall() still walks index.md and reads each wiki file for filtering. This defeats the stated goal of fast filtered lookup via SQLite and also contradicts the database model docstring that SQLite is the source of truth for metadata queries. Consider querying MemoryMetadataModel first (scope/scope_id/memory_type/key/tags/updated_at) to select candidate file_paths and only then loading wiki bodies as needed for content matching.
    async def _metadata_recall(
        self,
        query: Optional[str] = None,
        scope: Optional[str] = None,
        memory_type: Optional[str] = None,
        limit: int = 10,
        terminal_context: Optional[dict] = None,
        scan_all: bool = False,
    ) -> list[Memory]:
        """Substring-match recall via index.md walk (Phase 1 path)."""
        results: list[Memory] = []

        # Determine which project dirs to search
        search_dirs = self._get_search_dirs(scope, terminal_context, scan_all=scan_all)


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 63 to 71
@@ -68,14 +70,14 @@ def inject_memory_context(first_message: str, terminal_id: str) -> str:
a formatted <cao-memory>...</cao-memory> block (or empty string if
no memories exist). Stateless — no file mutation, no backup/restore.
Comment on lines +1149 to +1154
# Poll up to ~15s for the curator to finish responding. Without
# the sleep this loop spins in microseconds and we always read
# stale output.
for _ in range(30):
if provider.get_status() in (
TerminalStatus.COMPLETED,
Comment thread src/cli_agent_orchestrator/clients/database.py Outdated
Comment on lines +316 to +317
def test_direct_db_insert_reflected_in_index(self, svc, db_engine, tmp_path):
"""Insert directly to DB, then store via service — index should be consistent."""
Comment on lines +393 to +397
start = time.time()
results = _run(svc.recall(query="banana50", scope="global", terminal_context=ctx))
elapsed = time.time() - start

assert elapsed < 0.5, f"BM25 search took {elapsed:.3f}s, exceeds 500ms budget"
…on (Phase 2)

Adds U1 (SQLite metadata mirror with symmetric upsert), U6 (BM25 ranked
search with token-set match gate), and U9 (memory_manager sidecar for
curated injection, falls back to Phase 1 on any failure). Sidecar
spawn runs as a BackgroundTask so /sessions returns immediately.
@fanhongy fanhongy force-pushed the feat/memory-phase2-sqlite branch from 7f9bf20 to e7bbffe Compare May 24, 2026 08:40
accept the suggestion

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@fanhongy

Copy link
Copy Markdown
Contributor Author

no high issue, should be good to merge.

…lite

# Conflicts:
#	src/cli_agent_orchestrator/api/main.py
#	src/cli_agent_orchestrator/cli/commands/launch.py

@haofeif haofeif left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve

@haofeif haofeif merged commit 775af0c into main May 28, 2026
33 checks passed
@haofeif haofeif deleted the feat/memory-phase2-sqlite branch May 28, 2026 11:53
call-me-ram added a commit to call-me-ram/cli-agent-orchestrator that referenced this pull request Jun 3, 2026
Bring the event-driven architecture branch up to date with main (98
commits) and reconcile the rewrite with features that landed after it
forked: eager inbox delivery (awslabs#251), the OpenCode poller, env-var
forwarding (awslabs#259), memory curation (awslabs#254/awslabs#262), CORS auto-derive (awslabs#261),
DNS host validation (awslabs#124), and the self-send guard (awslabs#24).

Highlights:
- Providers adopt the async initialize() + get_status(buffer) contract;
  copilot_cli/opencode_cli converted; kiro keeps colour-only ANSI
  stripping so carriage-return-redraw permission prompts aren't misread
  as idle.
- Event-driven InboxService.deliver_pending with the awslabs#251 eager gate and
  message-sender attribution; OpenCode poller retained as a status-driven
  method; the watchdog (PollingObserver/LogFileHandler) is removed.
- terminal_service.create_terminal is async (FIFO + StatusMonitor wiring);
  session_service.create_session, flow_service.execute_flow, the API
  endpoints, and `cao flow run` updated to await.
- memory_service curated path and the flow CLI fixed to the new contract.

Full unit suite green (1908 passed); black + isort clean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants