Skip to content

Commit 69768fc

Browse files
jpheinclaude
andauthored
fix(hooks): drop checkpoint write path (verbatim-only, part 1) (#6)
* fix(hooks): drop checkpoint write path from Stop + PreCompact The Stop hook used to do two things on each fire: (a) write a 1-KB checkpoint summary diary entry into the mempalace_session_recovery collection (b) auto-mine the verbatim transcript into mempalace_drawers The summary in (a) is redundant once (b) is searchable, AND the recovery collection had no semantic-search MCP surface — so checkpoints were structurally invisible to mempalace_search. Net effect from a user's seat: agents (and JP) couldn't find recent session content via search even though everything was on disk. Drop the summary half. Verbatim transcript chunks already contain every word a checkpoint summary would have surfaced. Leaving only the transcript-ingest path matches the verbatim-only architecture described in docs/superpowers/specs/2026-05-05-verbatim-only-design.md. Changes: * hook_stop silent path: remove _save_diary_direct call. Save marker advances unconditionally (fire-and-forget mining doesn't have a "confirmed save" signal to gate on; failure detection moves to daemon-side observability — hook.log + systemd journal). * hook_precompact: remove the recovery-marker write. Mine + compaction proceed as before. * systemMessage shape changes from "✦ N memories woven into the palace — themes" to "✦ Transcript ingest triggered (wing=...)". Lighter, accurate, no false count. * Delete the now-unused _save_diary_direct function (~120 LOC) and its dependencies _extract_themes + _THEME_STOPWORDS (~30 LOC). * Update 4 hook tests + 1 OSError test to mock _ingest_transcript rather than _save_diary_direct, and to expect the new systemMessage shape. Scope deliberately limited: this PR removes hook-side WRITES. mempalace_session_recovery collection still exists; the mempalace_session_recovery_read MCP tool still works against the existing 763 archived entries; mempalace.migrate is untouched. A follow-up PR will dispose of the collection itself once this change has run in production for a verification window. Net diff: −230 / +53 LOC. 1558 tests pass, 1 skipped, lint clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(spec): verbatim-only architecture design Captures the architectural shift driving this PR family — drop checkpoint summaries, expose only the verbatim transcripts that mempalace_search can already reach. Documents: * Why: the recovery collection had no semantic-search MCP surface, so checkpoints in it were structurally invisible to search * What: hook write-path deletion (this PR) + collection disposal (follow-up PR) * How: code-deletion checklist for both repos, migration plan with tarball safety net, test plan * Open questions queued for review at merge time Spec gates the destructive parts on the answers to those open questions; the write-path removal in this PR is scoped narrow enough to land without them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(hooks): clarify hook_precompact docstring scope Copilot review on #3 caught that the docstring claimed the recovery collection was "now-deleted," but this PR only removes the write call site — collection retirement is the follow-up PR #5. Reword to match actual scope: the recovery-marker write is removed here; the collection still exists and gets disposed in #5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 09d2ca6 commit 69768fc

3 files changed

Lines changed: 217 additions & 230 deletions

File tree

Lines changed: 163 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,163 @@
1+
# Verbatim-only mempalace — design
2+
3+
- **Project:** `mempalace` (jphein fork) + `palace-daemon` (jphein fork)
4+
- **Date:** 2026-05-05
5+
- **Status:** Spec — awaiting JP review. Phase 1 (transcript-ingest restoration) shipped same day as PRs jphein/mempalace#2 and jphein/palace-daemon#1; this Phase 2 spec describes the architectural shift that follows.
6+
- **Decided by:** JP, 2026-05-05, in response to "agents don't know how to search both parts." `mempalace_session_recovery` had no semantic-search MCP surface, so checkpoints in it were structurally invisible to `mempalace_search`. Rather than build the missing surface, eliminate the bifurcation.
7+
8+
## One-line summary
9+
10+
Drop the checkpoint summary write path entirely. Stop hook produces only verbatim transcript ingest (Phase 1) — no more 1-KB summary diaries. Retire the `mempalace_session_recovery` collection and the migration that moved checkpoints into it. The palace contains verbatim conversations + manually mined files. Period.
11+
12+
## Goals
13+
14+
1. **One collection, one search path.** `mempalace_drawers` is the only searchable store. No bifurcation, no filter-out-the-other-half logic.
15+
2. **No information lost vs status quo.** Verbatim transcripts already contain the words a checkpoint summary would have surfaced. `mempalace_search` over the transcript IS the recovery query.
16+
3. **Composable with the existing wing taxonomy.** Transcripts continue routing to `wing_<project>` (already done in Phase 1). No new metadata schema, no tags layer, no `kind=` filter.
17+
4. **Clean upstream story.** Phase 2 changes are fork-only; upstream's recovery model still serves users who want summaries. Document the divergence in CLAUDE.md and `fork-changes.yaml`; don't push the abandonment upstream.
18+
19+
## Non-goals
20+
21+
- Replacing transcript ingest. Phase 1 already restored it. Phase 2 only removes the *summary* half of the stop hook.
22+
- Building a tags system. JP raised tags as a future feature, but Phase 2 is "remove the bifurcation," not "build a richer one." Tags are a Phase 4+ design.
23+
- Reverting the wing-derivation behavior change from Phase 1 — `wing_<project>` for transcripts stays.
24+
- Touching upstream PRs that ship checkpoint routing (`rboarescu/palace-daemon` #8, #14, #18). They serve users on the upstream summary model and are no longer on JP's path; leave open.
25+
26+
## Context — what exists, what changes
27+
28+
| Thing | State after Phase 1 | Changes in Phase 2 |
29+
|---|---|---|
30+
| Stop hook → `_save_diary_direct``tool_diary_write``mempalace_session_recovery` collection | Writes a 1-KB checkpoint summary on every fire | **Deleted.** Stop hook no longer calls `_save_diary_direct`. |
31+
| Stop hook → `_ingest_transcript` → daemon `/mine` | Writes verbatim transcript chunks to `mempalace_drawers` (per-project wing) | **Unchanged.** This becomes the only Stop-hook write path. |
32+
| `mempalace_session_recovery` collection (763 drawers as of 2026-05-05) | Active; receives all checkpoints | **Emptied** — production data is verbatim summaries of conversations whose verbatim text already lives in `mempalace_drawers`. JP confirms (this spec gates on his ack) that he wants hard delete, not archive-rename. |
33+
| `tool_session_recovery_read` MCP tool | Filter-only reader (session_id / agent / since / wing) | **Deleted.** No collection to read. |
34+
| `tool_diary_write` topic-routing branch (`if topic in _CHECKPOINT_TOPICS:`) | Routes to recovery collection | **Deleted branch.** Diary writes always go to `mempalace_drawers`. The remaining diary-write callers (agent journals, decision logs) keep working. |
35+
| `mempalace.migrate.migrate_checkpoints_to_recovery` | Idempotent walk that moves checkpoints from `mempalace_drawers``mempalace_session_recovery` | **Deleted.** Production already migrated; Phase 2 reverses the migration via a one-shot delete-then-purge of the recovery collection contents. |
36+
| `mempalace repair --mode reorganize` (CLI dispatch) | Calls `migrate_checkpoints_to_recovery` | **Deleted mode.** `mempalace repair` keeps `--mode rebuild` (HNSW-from-sqlite). |
37+
| palace-daemon `lifespan` auto-migrate (`PALACE_AUTO_MIGRATE_CHECKPOINTS=1`) | Calls migrate function on startup | **Deleted.** Lifespan no longer migrates. |
38+
| `tests/test_session_recovery.py` (12 tests) | Active | **Deleted.** Tests assume the parallel-collection design. |
39+
| `tests/test_migrate.py::TestMigrateCheckpointsToRecovery` (6 tests) | Active | **Deleted.** Module being removed. |
40+
| PreCompact hook → `_save_diary_direct` recovery marker | Writes checkpoint marker on context-compaction event | **Deleted.** PreCompact still triggers transcript ingest via `_ingest_transcript`; no diary marker. |
41+
| CLAUDE.md row 23 (checkpoint collection split — phases A–E) | Records the build-up | **Updated** — row marked "reverted in Phase 2 / 2026-05-05" with cross-reference to this spec. The architectural narrative is preserved as historical record of the round trip. |
42+
| `docs/fork-changes.yaml` | Has the row 23 entry | **New entry** for Phase 2 reversion + replaces row 23 status. |
43+
44+
## Architecture
45+
46+
### Hook write paths (after Phase 2)
47+
48+
```
49+
Stop hook fire:
50+
└── _ingest_transcript(transcript_path)
51+
└── (daemon-strict) POST /mine {dir, wing=wing_<project>, mode=convos}
52+
└── (local) subprocess.Popen([mempalace, mine, dir, --mode=convos, --wing=wing_<project>])
53+
54+
PreCompact hook fire:
55+
└── _ingest_transcript(transcript_path) # same as Stop, sync version via _mine_sync if MEMPAL_DIR is set
56+
```
57+
58+
That's it. No diary write, no checkpoint summary, no recovery marker.
59+
60+
### Collection layout
61+
62+
- `mempalace_drawers` — single collection. Contains every searchable drawer: manually-mined project files (`wing_techempower`, `wing_realmwatch`, etc.), verbatim conversation chunks (`wing_<project>` per Phase 1), agent diary entries (non-checkpoint topics), knowledge-graph drawers.
63+
- `mempalace_session_recovery`**deleted at Phase 2 deploy time.** Hard-delete via the daemon's `/repair --mode purge-recovery` (new mode, see below) or a one-shot `chromadb.PersistentClient.delete_collection("mempalace_session_recovery")`.
64+
65+
### MCP surface (after Phase 2)
66+
67+
| Tool | Status | Notes |
68+
|---|---|---|
69+
| `mempalace_search` | Unchanged | Already only reads `mempalace_drawers`. The deleted `kind=` filter is already gone (CLAUDE.md row 21). |
70+
| `mempalace_diary_read` | Unchanged | Reads from `mempalace_drawers`; topic filter still works for non-checkpoint topics. |
71+
| `mempalace_diary_write` | **Simplified** | Topic-routing branch deleted. Always writes to `mempalace_drawers`. |
72+
| `mempalace_session_recovery_read` | **Deleted** | No corresponding collection. |
73+
| All other tools | Unchanged | Knowledge graph, tunnels, status, etc. — none touched the recovery collection. |
74+
75+
## Migration
76+
77+
### Production data (canonical 151K palace)
78+
79+
- `mempalace_drawers`: ~150K drawers. **Untouched.**
80+
- `mempalace_session_recovery`: 763 drawers (per CLAUDE.md row 23 cleanup observation, 2026-04-27). **Hard-deleted.**
81+
82+
JP confirmed during the design conversation (this spec gates on the explicit ack at review): the 763 checkpoint summaries don't carry information that isn't also in `mempalace_drawers` via verbatim transcript chunks. Hard-delete, no archive.
83+
84+
### One-shot migration script
85+
86+
```python
87+
# scripts/phase2_purge_recovery.py — runs once, on disks, after Phase 2 code is deployed.
88+
import chromadb
89+
client = chromadb.PersistentClient(path="/mnt/raid/projects/mempalace-data/palace")
90+
try:
91+
client.delete_collection("mempalace_session_recovery")
92+
print("Deleted mempalace_session_recovery")
93+
except ValueError:
94+
print("Already absent")
95+
```
96+
97+
Run after the daemon is updated, before clients reconnect. Reversible by running the migration backward — but with the migration code deleted, that requires checkout of pre-Phase-2 code. JP signs off on irreversibility at review.
98+
99+
### Code deletion checklist
100+
101+
**In `~/Projects/memorypalace`:**
102+
- `mempalace/migrate.py` — delete (or keep as empty module if downstream imports `from mempalace.migrate import ...` exist; check first)
103+
- `mempalace/mcp_server.py::tool_session_recovery_read` — delete handler + remove from `TOOLS` dict
104+
- `mempalace/mcp_server.py::_get_session_recovery_collection` — delete
105+
- `mempalace/palace.py::get_session_recovery_collection` — delete
106+
- `mempalace/palace.py::_SESSION_RECOVERY_COLLECTION` constant — delete
107+
- `mempalace/palace.py::_CHECKPOINT_TOPICS` and topic-routing branch in `tool_diary_write` — delete
108+
- `mempalace/cli.py` `repair` dispatch — remove `reorganize` mode
109+
- `mempalace/hooks_cli.py::_save_diary_direct` — delete the function (keep the imports it depended on if used elsewhere; audit callers first)
110+
- `mempalace/hooks_cli.py::hook_stop` — remove the `_save_diary_direct` call
111+
- `mempalace/hooks_cli.py::hook_precompact` — remove the diary-write branch
112+
- `tests/test_session_recovery.py` — delete file
113+
- `tests/test_migrate.py::TestMigrateCheckpointsToRecovery` — delete class (keep file if other migrate tests exist)
114+
- `website/reference/mcp-tools.md` — remove `mempalace_session_recovery_read` doc entry
115+
116+
**In `~/Projects/palace-daemon`:**
117+
- `main.py::lifespan` — remove the `migrate_checkpoints_to_recovery` block + `PALACE_AUTO_MIGRATE_CHECKPOINTS` env var
118+
- `main.py::repair` `reorganize` mode — remove
119+
- Add new mode `/repair?mode=purge-recovery` (one-shot, idempotent: deletes the `mempalace_session_recovery` collection if present, returns 200 either way) — for the deploy script to call cleanly
120+
121+
**In `~/Projects/memorypalace/CLAUDE.md`:**
122+
- Row 23 status updated to "reverted in Phase 2 (2026-05-05) — see specs/2026-05-05-verbatim-only-design.md"
123+
- New row entry for Phase 2 in the fork-change-queue table
124+
- `docs/fork-changes.yaml` entry
125+
126+
## Test plan
127+
128+
After deletions:
129+
- `python -m pytest tests/ -q` — full suite green; expect 12 tests removed from `test_session_recovery.py` and 6 from `test_migrate.py`. Net change: -18 tests, 0 new tests (all the surviving paths already had coverage).
130+
- Manual verify on dev palace: `mempalace_diary_write` of a checkpoint-topic entry lands in `mempalace_drawers`, not in any other collection.
131+
- Manual verify on dev palace: `mempalace_search` of recent session content returns hits (not the empty-result symptom that drove this work).
132+
133+
After deploy on disks:
134+
- `curl /repair -d '{"mode":"purge-recovery"}'` returns 200.
135+
- `curl /stats` shows only `mempalace_drawers`.
136+
- A live Stop hook fire writes only the transcript chunks to `mempalace_drawers`; nothing to recovery (because there's no collection).
137+
- `mempalace_session_recovery_read` MCP call returns "tool not found" or a clean error.
138+
139+
## Risks + mitigations
140+
141+
| Risk | Mitigation |
142+
|---|---|
143+
| Hard-delete of 763 checkpoints is irreversible | Pre-deploy: dump the recovery collection to a tarball at `/mnt/raid/backups/recovery-2026-05-05.tar.gz`. If JP later wants something back, it's recoverable from that tarball. Tarball is gitignored, manual cleanup. |
144+
| Upstream may merge the checkpoint-routing PRs (#8, #14, #18) and create merge conflicts on next sync | Document in CLAUDE.md row entry that fork is intentionally diverged. Future syncs handle conflicts by keeping fork-side deletion. |
145+
| Other code paths depend on `_CHECKPOINT_TOPICS` constant | Grep first; the constant only appears in the topic-routing branch and tests, both of which are being deleted. |
146+
| Future-self forgets the architecture and re-introduces summary writing | Memory note saved (`memory/project_verbatim_only_shift.md`). Spec preserved here. CLAUDE.md row updated. |
147+
148+
## Sequencing (post-Phase-1 verification)
149+
150+
1. **Land Phase 1 first.** PRs jphein/mempalace#2 + jphein/palace-daemon#1 reviewed by Copilot and merged. Daemon redeployed. Verify transcripts mining for ~24 hours of normal use; backfill the 11-day gap with `phase1_backfill.sh` (Task #16).
151+
2. **Open Phase 2 PRs.** Two PRs, mirroring Phase 1:
152+
- `jphein/mempalace fix/drop-checkpoint-write-path` — code + test deletions
153+
- `jphein/palace-daemon fix/drop-recovery-migration` — lifespan + repair-mode changes
154+
3. **Copilot review + JP approval, then merge.**
155+
4. **Deploy + run purge.** `pip install --upgrade /mnt/raid/projects/memorypalace`, `systemctl --user restart palace-daemon`, then `curl /repair -d '{"mode":"purge-recovery"}'`.
156+
5. **Verify post-deploy.** A few Stop hook fires, then `mempalace_search` over recent content; assert non-empty + non-checkpoint results. `curl /stats` confirms one collection only.
157+
6. **Update docs.** CLAUDE.md row 23 status, fork-changes.yaml, MEMORY.md.
158+
159+
## Open questions for JP at review
160+
161+
1. **Hard-delete confirm.** This spec assumes "delete the 763 checkpoints, no archive." Tarball-then-delete is the proposed safety net. OK?
162+
2. **Upstream PR posture.** Leave #8, #14, #18 open silently? Comment on them with the divergence reason? Close them? Recommend: leave open, no comment until/unless upstream review activity makes it timely.
163+
3. **Pre-Phase-2 verify duration.** Is "24 hours of normal Phase 1 use" enough before pulling Phase 2 trigger, or want longer? Recommend 48 hours minimum so we see at least one PreCompact event in the wild.

0 commit comments

Comments
 (0)