
Commit 7ba28dc

refactor(searcher): retire kind= filter — structural split made it inert
The `kind=` post-filter and the `max(n*20, 100)` over-fetch hack were transitional safety nets put in place 2026-04-25, when the main collection still carried Stop-hook auto-save checkpoint drawers that dominated vector top-N. Phases A–E of the checkpoint collection split (2026-04-25 → 2026-04-26) moved all checkpoints to `mempalace_session_recovery`; an empirical check on the canonical 150,891-drawer production palace finds 0 drawers with `topic=checkpoint` and 0 with `topic=auto-save` in the main collection (763 in `mempalace_session_recovery`). The filter was filtering nothing.

Deletes:

- `_CHECKPOINT_TOPICS` from `searcher.py` (moved to `palace.py` next to `_SESSION_RECOVERY_COLLECTION`; write-side routing in `tool_diary_write` still needs it, read-side does not)
- `_is_checkpoint_drawer` and `_apply_kind_text_filter` post-filter
- `kind=` parameter on `search_memories`, `build_where_filter`, `search` (CLI delegate), and `mempalace_search` MCP tool (input_schema entry removed)
- The `max(n*20, 100)` over-fetch hack (back to standard `n_results * 3`)
- `TestCheckpointFilter` (9 tests) in `tests/test_searcher.py`

`migrate.py` now imports `_CHECKPOINT_TOPICS` from `palace.py` instead of carrying its own duplicate. `layers.py` calls drop their now-unused `kind="all"` argument.

Tests: 1500 passing (was 1510: TestCheckpointFilter's 9 + 1 misc).

Companion change in palace-daemon strips `kind=` from `/search` and `/context` HTTP routes. Production verified 0 checkpoints in main before deletion.
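
The empirical check described above could be sketched roughly as follows. This is an illustration, not the script that was actually run: `InMemoryCollection` and `count_by_topic` are hypothetical names, and the stand-in class only mimics the `get(where=..., include=...)` call shape of a ChromaDB collection so the example runs without a palace. It issues one plain equality query per topic, which sidesteps `$in`/`$nin` operators.

```python
from typing import Dict, Iterable


class InMemoryCollection:
    """Hypothetical stand-in mimicking a collection's get(where=..., include=...) shape."""

    def __init__(self, name, drawers):
        self.name = name
        self._drawers = drawers  # {drawer_id: metadata dict}

    def get(self, where=None, include=None):
        where = where or {}
        ids = [
            drawer_id
            for drawer_id, meta in self._drawers.items()
            if all(meta.get(key) == value for key, value in where.items())
        ]
        return {"ids": ids}


def count_by_topic(collection, topics: Iterable[str]) -> Dict[str, int]:
    """Count drawers per topic using one equality filter per topic."""
    return {
        topic: len(collection.get(where={"topic": topic}, include=[])["ids"])
        for topic in topics
    }


# Toy data only; the real palace holds roughly 151K drawers.
main = InMemoryCollection(
    "mempalace_drawers",
    {"d1": {"topic": "notes"}, "d2": {"topic": "design"}},
)
recovery = InMemoryCollection(
    "mempalace_session_recovery",
    {"c1": {"topic": "checkpoint"}, "c2": {"topic": "auto-save"}},
)

print(count_by_topic(main, ["checkpoint", "auto-save"]))
print(count_by_topic(recovery, ["checkpoint", "auto-save"]))
```

A result of zero for every checkpoint topic in the main collection is the condition under which the read-side filter has nothing left to remove.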
1 parent b4cf56c commit 7ba28dc

8 files changed

Lines changed: 27 additions & 319 deletions


CLAUDE.md

Lines changed: 2 additions & 2 deletions
@@ -50,10 +50,10 @@ Ruff for linting (`ruff check`), line length 100, target Python 3.9.
18. ~~**fix: PID file guard prevents stacking mine processes**~~**merged upstream via #1023 in v3.3.2.** Includes the Windows `os.kill` → `OpenProcess` cross-platform fix. No longer fork-ahead.
19. **fix: `.claude-plugin/` venv-aware Python resolution** — hooks (`mempal-stop-hook.sh`, `mempal-precompact-hook.sh`) and `.mcp.json` resolve Python in this order: `MEMPALACE_PYTHON` env → `$PLUGIN_ROOT/venv/bin/python3` → system `python3`. Upstream's `5fe0c1c` + `be9214a` (fatkobra) and `9f5b8f5` (Pim) regressed to PATH-only lookups and bare `"mempalace-mcp"` command, which break editable dev installs where `mempalace`/`mempalace-mcp` only live in the repo venv. Documented here so future `upstream/develop` merges surface the conflict rather than silently re-regress. Attempted via #1115 on 2026-04-22; withdrew 2026-04-23 as premature pending #1069 arbitration — CI correctly caught the #942 PATH-only contract violation. Re-submit after bensig's direction on #1069.
20. ~~**fix: `_tokenize` None-document guard**~~**merged upstream via #1198 on 2026-04-26.** No longer fork-ahead.
53-
21. **feat: `kind` filter on `search_memories` excludes Stop-hook checkpoints by default** (commits `8d02835` → `3d85739` → `398f42f` → `f9f5cc4`, 2026-04-25) — Stop-hook auto-save diary entries (topic=checkpoint, text starting `"CHECKPOINT:"`) were dominating MCP search results because they're short, word-dense, and outrank substantive content under cosine similarity. New `kind` parameter on `search_memories` and `mempalace_search` MCP tool: `"content"` (default, excludes checkpoints), `"checkpoint"` (only checkpoints, recovery/audit), `"all"` (no filter, pre-2026-04-25 behavior). **Two architecture corrections during the same day:** (a) the where-clause filter (`topic $nin [...]`) tripped a ChromaDB 1.5.x filter-planner bug — `Internal error: Error finding id` on every kind=content vector query — so the exclusion moved to post-filter only (`398f42f`); (b) vector top-N is dominated by checkpoints on this palace (top-10 hits all CHECKPOINT entries on probe queries), so post-filter alone empties the result set without aggressive over-fetch — pull size raised to `max(n*20, 100)` for kind != "all" (`f9f5cc4`). Post-filter checks both `topic` metadata and text-prefix shape; coverage equivalent to the original belt-and-suspenders without the chromadb bug. Result dicts now surface `topic`. 9 tests in `TestCheckpointFilter`. Companion fix in [`jphein/palace-daemon`](https://github.com/jphein/palace-daemon) commit `dd8894c` standardizes all hook clients on `topic="checkpoint"` (was `topic="auto-save"` in `clients/hook.py`). Structural fix still pending: stop indexing checkpoints as searchable drawers (separate session-recovery table). Upstream PR pending.
53+
21. ~~**feat: `kind` filter on `search_memories` excludes Stop-hook checkpoints by default**~~**deleted 2026-04-27 as transitional/inert.** The structural split (Phases A–E, see row 23) moved all checkpoints to `mempalace_session_recovery`; production has 0 checkpoints in `mempalace_drawers`, so the filter was filtering nothing. Removed `_CHECKPOINT_TOPICS` from `searcher.py`, `_is_checkpoint_drawer`, `_apply_kind_text_filter`, the `max(n*20, 100)` over-fetch hack (back to `n_results * 3`), and the `kind=` parameter on `search_memories` / `mempalace_search` / palace-daemon `/search` & `/context`. Write-side `_CHECKPOINT_TOPICS` (topic→collection routing in `tool_diary_write`) lives in `palace.py` now alongside `_SESSION_RECOVERY_COLLECTION`. `TestCheckpointFilter` (9 tests) deleted.
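
A minimal sketch of the write-side routing that survives this deletion, under stated assumptions: the contents of `_CHECKPOINT_TOPICS` are inferred from the two topics named in the commit message, and `_MAIN_COLLECTION` and `collection_for_topic` are hypothetical names used only for illustration. The real logic lives inside `tool_diary_write` and may differ.

```python
# Assumed contents: the commit message names both topics but does not show the constant.
_CHECKPOINT_TOPICS = frozenset({"checkpoint", "auto-save"})
_SESSION_RECOVERY_COLLECTION = "mempalace_session_recovery"
_MAIN_COLLECTION = "mempalace_drawers"  # hypothetical constant name


def collection_for_topic(topic):
    """Hypothetical helper: choose the collection a diary entry is written to."""
    if topic in _CHECKPOINT_TOPICS:
        return _SESSION_RECOVERY_COLLECTION
    return _MAIN_COLLECTION


print(collection_for_topic("checkpoint"))  # mempalace_session_recovery
print(collection_for_topic("design"))      # mempalace_drawers
```

Because the decision is made once at write time, the search path needs no knowledge of checkpoint topics, which is why the constant could move out of `searcher.py`.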
22. ~~**fix: `palace_graph.build_graph` skips None metadata**~~**merged upstream via #1201 on 2026-04-26.** No longer fork-ahead.

56-
23. **feat: checkpoint collection split — phases A–C** (commit `e266365`, 2026-04-25) — Promoted from "future work" to "necessary" by 2026-04-25 Cat 9 A/B (`kind=all` 632 tokens/Q vs `kind=content` 3 tokens/Q on the canonical 151K palace; over-fetch=100 inadequate, structural fix non-optional). **Phase A:** new `_SESSION_RECOVERY_COLLECTION` constant + `get_session_recovery_collection()` in `palace.py` (mirrors `get_collection`'s shape — cosine, num_threads=1). **Phase B:** `tool_diary_write` routes `topic in _CHECKPOINT_TOPICS` to the dedicated `mempalace_session_recovery` collection, everything else stays in `mempalace_drawers`; new `_get_session_recovery_collection()` in `mcp_server.py` with parallel cache. **Phase C:** new `tool_session_recovery_read` MCP handler reads recovery collection only with optional filters `session_id`, `agent`, `since`, `until`, `wing`, `limit`; `session_id` added as optional metadata field on `tool_diary_write` so the new tool can filter by Claude Code session. Registered in `TOOLS` dict, documented in `website/reference/mcp-tools.md`. 12 new tests across `tests/test_session_recovery.py` + `TestCheckpointRouting` + `TestSessionRecoveryRead`. Design + plan at `docs/superpowers/specs/2026-04-25-checkpoint-collection-split.md` and `docs/superpowers/plans/2026-04-25-checkpoint-collection-split-impl.md`. **Phases D (data migration of ~640 existing checkpoints out of main collection) and E (palace-daemon `lifespan` auto-migrate + `mempalace repair --mode reorganize`) deferred** — multi-day work, gated on a separate go-ahead. Once D lands and the canonical-palace re-run shows the predicted `kind=all` ≈ `kind=content` token convergence, the `kind=` post-filter and over-fetch hack become deletable. **Update 2026-04-26:** phase D shipped — `migrate_checkpoints_to_recovery()` in `mempalace/migrate.py`, idempotent walk that moves topic in `_CHECKPOINT_TOPICS` drawers from main → recovery while preserving IDs and metadata. 
Wired into `mempalace repair --mode reorganize` (CLI dispatch in `cli.py` chooses between `rebuild` (HNSW from sqlite) and `reorganize` (this new path)). PreCompact hook also incorporated — `hook_precompact` now writes a recovery marker via `_save_diary_direct` mirroring Stop, so a context-compaction event leaves a queryable timestamp in the recovery collection. 6 new migration tests in `test_migrate.py::TestMigrateCheckpointsToRecovery`. **Phase E shipped** in palace-daemon commit [`034023c`](https://github.com/jphein/palace-daemon/commit/034023c) on 2026-04-26 — `lifespan` calls `migrate_checkpoints_to_recovery()` in an executor on startup, gated behind `PALACE_AUTO_MIGRATE_CHECKPOINTS=1` (default on), with `ImportError` fallthrough so upstream-shaped installs without `mempalace.migrate` still start cleanly. Canonical 151K palace migrated 667 checkpoints on 2026-04-26 10:24:09 PDT. **Cleanup phase pending** — once Cat 9 convergence (currently 974/1267 tokens/Q kind=all vs kind=content) is judged acceptable, delete `_CHECKPOINT_TOPICS`, `_apply_kind_text_filter`, the `max(n*20, 100)` over-fetch hack, and the `kind=` parameter on `search_memories` / `mempalace_search` / daemon `/search` & `/context` routes.
56+
23. **feat: checkpoint collection split — phases A–C** (commit `e266365`, 2026-04-25) — Promoted from "future work" to "necessary" by 2026-04-25 Cat 9 A/B (`kind=all` 632 tokens/Q vs `kind=content` 3 tokens/Q on the canonical 151K palace; over-fetch=100 inadequate, structural fix non-optional). **Phase A:** new `_SESSION_RECOVERY_COLLECTION` constant + `get_session_recovery_collection()` in `palace.py` (mirrors `get_collection`'s shape — cosine, num_threads=1). **Phase B:** `tool_diary_write` routes `topic in _CHECKPOINT_TOPICS` to the dedicated `mempalace_session_recovery` collection, everything else stays in `mempalace_drawers`; new `_get_session_recovery_collection()` in `mcp_server.py` with parallel cache. **Phase C:** new `tool_session_recovery_read` MCP handler reads recovery collection only with optional filters `session_id`, `agent`, `since`, `until`, `wing`, `limit`; `session_id` added as optional metadata field on `tool_diary_write` so the new tool can filter by Claude Code session. Registered in `TOOLS` dict, documented in `website/reference/mcp-tools.md`. 12 new tests across `tests/test_session_recovery.py` + `TestCheckpointRouting` + `TestSessionRecoveryRead`. Design + plan at `docs/superpowers/specs/2026-04-25-checkpoint-collection-split.md` and `docs/superpowers/plans/2026-04-25-checkpoint-collection-split-impl.md`. **Phases D (data migration of ~640 existing checkpoints out of main collection) and E (palace-daemon `lifespan` auto-migrate + `mempalace repair --mode reorganize`) deferred** — multi-day work, gated on a separate go-ahead. Once D lands and the canonical-palace re-run shows the predicted `kind=all` ≈ `kind=content` token convergence, the `kind=` post-filter and over-fetch hack become deletable. **Update 2026-04-26:** phase D shipped — `migrate_checkpoints_to_recovery()` in `mempalace/migrate.py`, idempotent walk that moves topic in `_CHECKPOINT_TOPICS` drawers from main → recovery while preserving IDs and metadata. 
Wired into `mempalace repair --mode reorganize` (CLI dispatch in `cli.py` chooses between `rebuild` (HNSW from sqlite) and `reorganize` (this new path)). PreCompact hook also incorporated — `hook_precompact` now writes a recovery marker via `_save_diary_direct` mirroring Stop, so a context-compaction event leaves a queryable timestamp in the recovery collection. 6 new migration tests in `test_migrate.py::TestMigrateCheckpointsToRecovery`. **Phase E shipped** in palace-daemon commit [`034023c`](https://github.com/jphein/palace-daemon/commit/034023c) on 2026-04-26 — `lifespan` calls `migrate_checkpoints_to_recovery()` in an executor on startup, gated behind `PALACE_AUTO_MIGRATE_CHECKPOINTS=1` (default on), with `ImportError` fallthrough so upstream-shaped installs without `mempalace.migrate` still start cleanly. Canonical 151K palace migrated 667 checkpoints on 2026-04-26 10:24:09 PDT. **Cleanup phase shipped 2026-04-27** — empirical check on production showed 0 checkpoints in `mempalace_drawers` (763 in `mempalace_session_recovery`), so the kind= filter was provably inert. Deleted in row 21 above.
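
The idempotent, ID-preserving walk described for phase D can be illustrated with plain dictionaries standing in for the two collections. This is a sketch of the idea only, not the body of `migrate_checkpoints_to_recovery()`: the function name `migrate_checkpoints`, the drawer shape, and the toy data are all assumptions.

```python
from typing import Any, Dict, Set

Drawer = Dict[str, Any]


def migrate_checkpoints(
    main: Dict[str, Drawer],
    recovery: Dict[str, Drawer],
    checkpoint_topics: Set[str],
) -> int:
    """Move checkpoint drawers from main to recovery, keeping IDs and metadata."""
    moved = 0
    for drawer_id in list(main):  # snapshot the keys because we delete while walking
        drawer = main[drawer_id]
        metadata = drawer.get("metadata") or {}  # tolerate None metadata
        if metadata.get("topic") not in checkpoint_topics:
            continue
        # Write first, delete second: an interruption leaves a duplicate, not a loss.
        recovery[drawer_id] = drawer
        del main[drawer_id]
        moved += 1
    return moved


main = {
    "d1": {"document": "design notes", "metadata": {"topic": "design"}},
    "c1": {"document": "CHECKPOINT: a", "metadata": {"topic": "checkpoint", "session_id": "s-1"}},
    "c2": {"document": "CHECKPOINT: b", "metadata": {"topic": "auto-save"}},
    "d2": {"document": "no metadata", "metadata": None},
}
recovery: Dict[str, Drawer] = {}
topics = {"checkpoint", "auto-save"}

first = migrate_checkpoints(main, recovery, topics)
second = migrate_checkpoints(main, recovery, topics)  # nothing left to move
print(first, second)  # 2 0
```

The second call returning zero is what makes it reasonable to run such a migration on every daemon start.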
27. **perf: batch ChromaDB inserts in miner (cherry-pick of upstream #1085)** (commit `6be6fff`, 2026-04-26) — Cherry-picked @midweste's [#1085](https://github.com/MemPalace/mempalace/pull/1085) "batch ChromaDB inserts in miner — 10-30x faster mining". Upstream PR #1085 is still **OPEN** as of 2026-04-26 (created 2026-04-21, base=develop, not yet merged) — verified via `gh pr view 1085 --repo MemPalace/mempalace`. We cherry-picked the commit ahead of merge so the fork can use it now; this row clears when #1085 merges into develop and we next sync. We don't file a competing fork-side PR — the proposal is @midweste's. New `_build_drawer()` helper builds id+document+metadata in one shot; new `add_drawers()` batch-insert function takes the full chunk list and sub-batches at `DRAWER_UPSERT_BATCH_SIZE` (one chromadb upsert + one ONNX embedding forward-pass per sub-batch instead of per-chunk). `process_file` now calls `add_drawers` directly. Hoists `datetime.now()` and `os.path.getmtime()` to file-level (2 syscalls per file instead of 2N). **Conflict resolution:** fork already had a fork-only `_build_drawer_metadata` + an outer batch loop in `process_file`; upstream's clean structure supersedes both. Kept fork's `DRAWER_UPSERT_BATCH_SIZE=1000` (more conservative than upstream's 5000 for embedding-pass memory headroom); aliased upstream's `CHROMA_BATCH_LIMIT` to point at it so any code/test referencing either name sees the same value. 74/74 miner+convo_miner tests pass; full suite 1366/1366. Becomes a no-op when #1085 merges into upstream develop and we next sync develop→main.

README.md

Lines changed: 4 additions & 5 deletions
@@ -8,7 +8,7 @@

---

11-
This fork tracks `upstream/develop` through the 2026-04-27 sync and runs in production on a 151,478-drawer palace behind [palace-daemon](https://github.com/jphein/palace-daemon) at `disks.jphe.in:8085`. It carries 17 fork-ahead changes that compose with — not replace — bensig's release direction; four landed upstream on 2026-04-26 (#1173, #1177, #1198, #1201). 1,510 tests pass on `main`. The new things here are *what we've learned*, not just what we've fixed.
11+
This fork tracks `upstream/develop` through the 2026-04-27 sync and runs in production on a 151,478-drawer palace behind [palace-daemon](https://github.com/jphein/palace-daemon) at `disks.jphe.in:8085`. It carries 16 fork-ahead changes that compose with — not replace — bensig's release direction; four landed upstream on 2026-04-26 (#1173, #1177, #1198, #1201). 1,500 tests pass on `main`. The new things here are *what we've learned*, not just what we've fixed.
## What just shipped

@@ -54,7 +54,7 @@ The deeper read on local-first AI memory: the sovereignty argument lands in cour
Three bands of work, all instances of the principles above. Detail rows in the [appendix](#fork-change-inventory) at the bottom.

57-
- **Structural retrieval fixes (Principle 1, Principle 2).** Multi-collection split moves Stop-hook checkpoints to a dedicated `mempalace_session_recovery` collection — physically absent from `mempalace_search`, queryable via the new `mempalace_session_recovery_read` MCP tool. PreCompact incorporated. Auto-migrates on first daemon restart. The transitional `kind=` filter and over-fetch hack become deletable next release. `drawer_id` surfacing on every search/diary/recovery hit so callers can build citation popovers and follow-ups.
57+
- **Structural retrieval fixes (Principle 1, Principle 2).** Multi-collection split moves Stop-hook checkpoints to a dedicated `mempalace_session_recovery` collection — physically absent from `mempalace_search`, queryable via the new `mempalace_session_recovery_read` MCP tool. PreCompact incorporated. Auto-migrates on first daemon restart. The transitional `kind=` filter and over-fetch hack are gone (2026-04-27) — the structural fix made them inert. `drawer_id` surfacing on every search/diary/recovery hit so callers can build citation popovers and follow-ups.
- **Single-writer architecture (Principle 3).** [palace-daemon](https://github.com/jphein/palace-daemon) is the only process that opens the palace; clients connect over HTTP. ChromaDB 1.5.x's HNSW concurrency hazards (`#974`/`#965`/`#823` family) become structurally impossible. Cold-start integrity sniff-test on segment metadata files prevents `quarantine_stale_hnsw` from destroying healthy indexes during async-flush lag. Cherry-pick of upstream [#1085](https://github.com/MemPalace/mempalace/pull/1085) for 10–30× mining speedup; cherry-pick of upstream-PR-#1094 for boundary-level None-metadata coercion that closes a per-site-guard family.
- **Deterministic hook saves (Principles 1+2+3 compose).** Silent saves bypass auto-memory conflicts entirely — the LLM is no longer in the save path, so `decision: "block"` race conditions and Claude's auto-memory winning over MCP tools both go away. Save marker advances only after confirmed write. `systemMessage` notification surfaces results. PreCompact writes a recovery-collection marker before mining + compaction so context-boundary events leave a queryable timestamp.
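
For a sense of scale, the retired over-fetch and the restored pull size mentioned in the first bullet can be compared directly. The two formulas come from the commit message; the function names are hypothetical, and how the searcher applies the result is not shown here.

```python
def retired_pull_size(n_results):
    """Over-fetch used while checkpoints still dominated vector top-N."""
    return max(n_results * 20, 100)


def standard_pull_size(n_results):
    """Pull size restored once checkpoints left the main collection."""
    return n_results * 3


for n in (3, 5, 10):
    print(n, retired_pull_size(n), standard_pull_size(n))
# 3 100 9
# 5 100 15
# 10 200 30
```

For a typical five-result query the vector pull drops from 100 candidates to 15.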

@@ -100,7 +100,7 @@ A Stop hook fires every 15 messages in Claude Code, writes directly to `mempalac
When the HNSW index is genuinely degraded (rare, post-fix), the same call returns `warnings: ["vector search returned 0 of 5 requested; filled 5 from sqlite+BM25 keyword match"]` with hits tagged `"matched_via": "sqlite_bm25_fallback"` — data is never silently hidden.

103-
After the 2026-04-26 migration, the example queries from a week ago all return content rather than checkpoint word-soup. `kind=all` is now equivalent to `kind=content` in practice; the parameter survives one more release as a safety net, then retires.
103+
After the 2026-04-26 migration, the example queries from a week ago all return content rather than checkpoint word-soup. The `kind=` parameter retired 2026-04-27 — the structural split made it inert.
## Architectural principles

@@ -296,7 +296,7 @@ python -m venv venv && source venv/bin/activate
pip install -e ".[dev]"
# Develop
299-
python -m pytest tests/ -q # 1510 tests (benchmarks deselected)
299+
python -m pytest tests/ -q # 1500 tests (benchmarks deselected)
mempalace status # palace health
ruff check . && ruff format --check . # lint + format

@@ -321,7 +321,6 @@ The canonical source is [`docs/fork-changes.yaml`](docs/fork-changes.yaml); [`FO
| **Search** | Move Stop-hook auto-save checkpoints to dedicated `mempalace_session_recovery` ChromaDB collection (Principle 1+2). **Phases A–E shipped 2026-04-25 → 2026-04-26**: collection adapter, write routing, new `mempalace_session_recovery_read` MCP tool, migration (idempotent, ID/metadata-preserving), PreCompact incorporation, palace-daemon `lifespan` auto-migrate. Canonical 151K palace migrated 667 checkpoints on 2026-04-26 10:24:09 PDT. Cat 9 A/B re-run shows **632/3 → 974/1267 token convergence**. | PR pending — fork commits [`e266365`](https://github.com/jphein/mempalace/commit/e266365) (A–C) → [`42817d7`](https://github.com/jphein/mempalace/commit/42817d7) (D + PreCompact); palace-daemon [`034023c`](https://github.com/jphein/palace-daemon/commit/034023c) (E); 18 new tests | `palace.py`, `mcp_server.py`, `migrate.py`, `cli.py`, `hooks_cli.py` |
| **Search** | Surface `drawer_id` in `mempalace_search` results, `mempalace_diary_read` entries, and `mempalace_session_recovery_read` payload. ChromaDB primary key was returned but never plumbed into the result-building loop. Defensive zip-with-id-pad for test mocks. | PR pending — fork commit [`9a8bb77`](https://github.com/jphein/mempalace/commit/9a8bb77); upstream [#1219](https://github.com/MemPalace/mempalace/pull/1219) (@pepo72) is the narrower searcher-only equivalent. | `searcher.py`, `mcp_server.py`, `tests/...`, `website/reference/mcp-tools.md` |
| **Reliability** | `hook_precompact` writes a session-recovery checkpoint marker before mining + compaction. Mirrors `hook_stop`'s `_save_diary_direct` call; same routing path (recovery collection, queryable by `session_id`). | Bundled with phase D in [`42817d7`](https://github.com/jphein/mempalace/commit/42817d7) | `mempalace/hooks_cli.py` |
324-
| **Search** | `kind=` filter on `search_memories` excludes Stop-hook checkpoints by default. Three values: `"content"` (default), `"checkpoint"`, `"all"`. Post-filter only (chromadb 1.5.x `$nin`/`$in` filter-planner bug); over-fetch `max(n*20, 100)` for non-`"all"`. **Transitional**: becomes deletable next release after the structural split (above) ships. | PR pending; fork commits `8d02835` → `f9f5cc4` | `searcher.py`, `mcp_server.py` |
| **Performance** | Cherry-picked upstream [#1085](https://github.com/MemPalace/mempalace/pull/1085) (@midweste) — batch ChromaDB inserts in miner. New `_build_drawer()` + `add_drawers()`. Reported 10–30× mining speedup. | Cherry-pick of open #1085 — fork commit [`6be6fff`](https://github.com/jphein/mempalace/commit/6be6fff). Becomes a no-op when #1085 merges. | `mempalace/miner.py` |
| **Reliability** | Cherry-picked upstream [#1094](https://github.com/MemPalace/mempalace/pull/1094) — coerce None metadatas at chromadb boundary. Closes the per-site-guard family of None-metadata bugs (#999, #1198, #1201) at one site instead of N. | Cherry-pick of open #1094 — fork commit [`43d728d`](https://github.com/jphein/mempalace/commit/43d728d) | `backends/chroma.py`, `tests/test_backends.py` |
| **CLI** | `mempalace purge --wing/--room` via `collection.delete(where=...)`. Earlier nuke-and-rebuild draft predicated on #521's race; @igorls's review traced the stack — race is on the upsert path, not delete-by-where. Simpler version preserves embedding fn, no rmtree window, routes through `ChromaBackend`. | [#1087](https://github.com/MemPalace/mempalace/pull/1087), rewritten 2026-04-26 per review | `cli.py`, `tests/test_cli.py` |
