feat(convo_miner): auto-route AI tool sessions to wing_api #1236
When mempalace mine --mode convos is invoked against a directory inside
a known AI-tool storage path (Claude Code, Codex CLI, Gemini CLI), the
destination wing now auto-defaults to wing_api rather than the directory
basename. Conversations from external API-keyed tools land grouped under
a single dedicated wing for visibility.
Detected paths (exact-segment match — substrings like .gemini-backup or
.codex-archive do NOT match):
- any segment .codex (Codex CLI sessions / archives)
- any segment .gemini (Gemini CLI sessions under ~/.gemini/tmp/...)
- the consecutive segment pair .claude/projects (Claude Code).
.claude alone is NOT matched - that is the settings/config dir,
not a conversation source.
Wing-resolution precedence (first match wins):
1. Explicit --wing argument from the user - always wins
2. AI-tool path detection -> wing_api
3. Basename fallback (existing behavior, unchanged)
Two new helpers split out of mine_convos for unit-test coverage:
- _is_ai_tool_path(path: Path) -> bool
- _resolve_wing(convo_path: Path, wing: Optional[str]) -> str
mine_convos now calls _resolve_wing in place of its inline basename
logic. No other call sites or downstream consumers change.
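The precedence order can be sketched roughly as follows. The real `_resolve_wing` takes only `convo_path` and `wing`; in this sketch the detector and the basename normalizer are passed in as hypothetical parameters purely to keep the example self-contained, and returning the explicit wing unmodified is an assumption.

```python
from pathlib import Path
from typing import Callable, Optional

AI_TOOL_DEFAULT_WING = "wing_api"  # hypothetical constant name


def resolve_wing_sketch(
    convo_path: Path,
    wing: Optional[str],
    is_ai_tool_path: Callable[[Path], bool],
    normalize: Callable[[str], str],
) -> str:
    """Pick the destination wing; the first matching rule wins."""
    # 1. Explicit --wing wins. An empty string is falsy, so it counts as "no wing".
    if wing:
        return wing  # assumption: returned as given, without normalization
    # 2. Paths inside a known AI-tool store route to the dedicated wing.
    if is_ai_tool_path(convo_path):
        return AI_TOOL_DEFAULT_WING
    # 3. Otherwise fall back to the normalized directory basename.
    return normalize(convo_path.name)
```

Because each rule returns early, adding a further rule later means inserting one more check at the right position without touching the others.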
Test coverage:
- 15 unit tests covering positive matches (Claude Code subdir + root,
Codex root + sessions, Gemini root + chats), negative cases
(.claude alone is settings dir, unrelated paths, substring no-match
on .gemini-backup / .codex-archive), explicit --wing override,
auto-route trio, basename fallback, empty-string-as-no-wing.
- End-to-end smoke test (manual): real-shape Claude Code JSONL fixture
mined via the actual CLI; sqlite read-back of /tmp palace confirms
drawers landed with wing='wing_api' and verbatim content preserved;
mempalace search --wing wing_api returns expected content ranked.
- Full pytest sweep: 1388 baseline + 15 new = 1403 passed, zero
regressions.
Design context:
This change reflects Aya's product call that conversations from
API-keyed AI tools should land in a structural wing_api rather than be
scattered across topical wings derived from directory basenames. Igor's
ADR-0017 in mempalace-ts proposes the alternative of source-prefix
metadata (source LIKE 'api/%') with topical wing assignment instead;
that approach has architectural merit (wings stay topical) but does not
deliver the single-wing visibility users get here. Open for review
discussion - explicit --wing flag and basename fallback both unchanged,
so this is additive and reversible.
Closes part of #59 for the auto-routing UX.
bensig
left a comment
Approve. Clean separation of concerns, correct path-matching, full test coverage, and the wing-resolution precedence is exactly right.
Full pytest run on this branch: 1456 passed, 1 skipped. Of those, 19 are in `test_convo_miner.py`, which is consistent with the 15 new tests promised in the PR body.
What I checked
Path matching is right and defensive
- `path.resolve().parts` handles symlinks and relative paths correctly. A user who symlinks a Claude transcript dir somewhere else still gets routed to `wing_api` because `resolve()` surfaces the original `.claude/projects/` path.
- `try/except (OSError, RuntimeError)` around `resolve()` catches the rare case of a broken symlink or path-too-long without crashing the mine.
- Exact-segment match is the key correctness detail. `.gemini-backup` and `.codex-archive` correctly do NOT match — that would have been an easy mistake. The negative tests cover both.
- `.claude/projects` requires the consecutive-segment pair, not bare `.claude` (which is the settings dir, not conversations). Also tested.
Wing-resolution precedence is correct
- Explicit `--wing` always wins. User intent sacrosanct.
- AI-tool path → `wing_api` when no explicit wing.
- `normalize_wing_name(basename)` fallback uses the shared helper from `config.py` — same source of truth as `cmd_init`, `room_detector_local`, and `miner.load_config`. Slots cleanly into the #1194 consolidation work that landed yesterday.
The empty-string handling (`if wing:` is falsy on `""`) matches the #1097 "empty-string as no filter" pattern that's now consistent across the codebase.
Tests cover the right surface
The 15 new test cases hit:
- Positive matches: Claude Code subdir + root, Codex root + sessions, Gemini root + chats
- Negative cases: `.claude` alone (settings, NOT conversations), unrelated paths, substring no-match on `.gemini-backup` / `.codex-archive`
- Override paths: explicit `--wing` beats auto-route, basename fallback on non-AI paths, empty-string treated as no-wing
The negative cases are the ones that prove the matching is exact rather than fuzzy. Good discipline.
Architecturally sound
Routing API-driven conversations to a dedicated `wing_api` (separate from project wings) is the right default. They're a different kind of content — general LLM exchanges that can span any topic — so segregating them into one wing makes search and graph traversal cleaner. Users who want them in a specific project wing pass `--wing`; users who do nothing get something semantically reasonable.
Minor observations (not blockers)
- `wing_api` is hardcoded in `_resolve_wing`. Could be promoted to a module-level constant (`_AI_TOOL_DEFAULT_WING = "wing_api"`) for visibility, but not material.
- The detection list is closed. As more AI-tool ecosystems land (Cursor, Continue, Aider, Zed AI, etc.), this set needs extension. Could become config-driven later (env var or `config.json` key like `ai_tool_path_segments`). Out of scope for this PR; worth a follow-up issue if the list grows.
- Edge case worth knowing: if a user mines `~/.claude/projects/-Users-me-Projects-MyProject/` and wants those Claude conversations specifically in `wing_myproject`, they need `--wing myproject`. The default (`wing_api`) is more useful for the majority case where Claude conversations are general-purpose, but worth a doc note that "explicit per-project routing of AI-tool conversations is one flag away."
Closes part of #59. Ship it.
Approved.
Bumps version 3.3.5 → 3.3.6 across pyproject.toml, version.py, plugin manifests (.claude-plugin/plugin.json, .claude-plugin/marketplace.json, .codex-plugin/plugin.json), README badge, and uv.lock.
Flips CHANGELOG.md from ``[Unreleased]`` to ``[3.3.6] — 2026-05-24`` and backfills the major user-facing entries that landed without changelog entries during the cycle:
Features:
- MemPalace#1555 office-document mining via --mode extract + virtual line numbers
- MemPalace#1584 surgical closet pointers with date+line locators (Tier 6a)
- MemPalace#1558 + MemPalace#1560 within-wing hallways (entity co-occurrence graph)
- MemPalace#1565 cross-wing tunnels auto-promoted from hallways
- MemPalace#1578 Hebbian potentiation + Ebbinghaus decay on hallways/tunnels
- MemPalace#1236 API-tool transcripts auto-route to wing_api
- MemPalace#711 hooks.auto_save toggle for silent-mode sessions
- MemPalace#1605 COCA content-word filter for entity detection
- MemPalace#1557 case-insensitive entity matching at mine time
- MemPalace#1483 multilingual embeddings (embeddinggemma-300m) by default
Bug Fixes (selected, user-visible):
- MemPalace#1540 silent data loss in three unchunked upsert sites
- MemPalace#1538 paragraph chunker oversized chunks
- MemPalace#1554 per-file chunk cap too low for transcripts
- MemPalace#1562 Windows hook subprocess/ChromaDB deadlock
- MemPalace#1529 create_tunnel corrupted hyphenated wing names
- MemPalace#1424 save-hook truncated hyphenated project folders
- MemPalace#1383 KG cache duplicated graphs for symlinked/cased paths
- MemPalace#1466 silent symlink skip now logged
- MemPalace#1441 macOS stock-bash 3.2 hook compatibility
- MemPalace#1500 / MemPalace#1513 structured JSON-RPC errors on bad MCP input
- MemPalace#1523 VACUUM + FTS5 rebuild after repair
- MemPalace#1548 FTS5 validation at end of mine
- plus MemPalace#1216, MemPalace#1408, MemPalace#1438, MemPalace#1439, MemPalace#1445, MemPalace#1452, MemPalace#1459, MemPalace#1461, MemPalace#1466, MemPalace#1470, MemPalace#1477, MemPalace#1485, MemPalace#1500, MemPalace#1513, MemPalace#1528, MemPalace#1532, MemPalace#1543, MemPalace#1546, MemPalace#1585
Performance:
- MemPalace#1474 convo miner pre-fetches mined-set
- MemPalace#1487 rebuild_index progress callback
- MemPalace#1530 MCP cold-start diagnostics + opt-in warmup
Lint passes (ruff 0.15.14); mempalace-mcp entry point alignment verified per RELEASING.md.
