Skip to content

feat(mcp): persist + return drawer metadata#1

Merged
rglaubitz merged 23 commits into
mainfrom
codex/metadata-passthrough
May 26, 2026
Merged

feat(mcp): persist + return drawer metadata#1
rglaubitz merged 23 commits into
mainfrom
codex/metadata-passthrough

Conversation

@rglaubitz

Copy link
Copy Markdown
Owner

Summary

  • add optional metadata to mempalace_add_drawer
  • persist caller metadata as one JSON blob in drawer metadata storage
  • return decoded metadata on mempalace_search hits, with legacy/no-metadata drawers returning metadata: null
  • bump package/version badge/lock to 3.4.0 and tag v3.4.0

Audax Track C context

This unblocks rglaubitz/audax-apex Track C: MempalaceIngestSink needs to pass planner provenance through the MemPalace boundary so the Phase 9.5 retrieval driver can read fields like source_class, sensitivity, injectable, reference_only, and derived_from from search hits.

Migration / storage approach

MemPalace currently stores drawers in ChromaDB per-document metadata. This PR stores the entire caller-owned metadata dict as a single JSON string under the metadata key. Chroma metadata is schema-less, so there is no SQLite ALTER TABLE; the lazy migration behavior is effectively read-compatible: existing drawers simply lack the key and decode to metadata: null on search.

Base branch note

Per the dispatch brief, this branch starts from audax-apex's pinned MemPalace SHA 1888b671e2051565480c70e1bab699c1796eef7f (origin/develop in this fork) and targets main.

Verification

  • RED: new metadata passthrough tests failed before implementation (unexpected keyword argument 'metadata', missing search metadata key)
  • uv run pytest tests/test_mcp_server.py::TestWriteTools::test_add_drawer_with_metadata_round_trips_to_search tests/test_mcp_server.py::TestWriteTools::test_add_drawer_without_metadata_round_trips_none tests/test_mcp_server.py::TestWriteTools::test_legacy_drawer_without_metadata_key_searches_with_metadata_none -q
  • uv run pytest tests/test_mcp_server.py tests/test_searcher.py -q -> 101 passed
  • uv run pytest tests/test_version_consistency.py -q -> 2 passed
  • uv run pytest -q -> 1492 passed, 1 skipped, 106 deselected
  • uv run ruff check .
  • manual temp-palace smoke confirmed audax-apex-style metadata round-trips through add + search

igorls and others added 21 commits May 26, 2026 01:59
…lace#1299)

`mcp_server._get_collection` bypassed `ChromaBackend.get_collection`
and called `client.get_collection` / `client.create_collection` without
`embedding_function=`. ChromaDB 1.x does not persist the EF identity
with the collection, so the MCP server's reopen silently bound
chromadb's built-in `DefaultEmbeddingFunction` while the miner / Stop
hook ingest path bound `mempalace.embedding.get_embedding_function()`.

On bleeding-edge interpreters (python 3.14 + chromadb 1.5.x on Apple
Silicon, per MemPalace#1299) the default EF's lazy ONNX provider selection could
SIGSEGV the host process on first `col.add()`, killing the MCP stdio
server and leaving every subsequent tool call returning
`Connection closed` until Claude Code was relaunched. Reads worked
because `col.get(ids=...)` and metadata fetches don't invoke the EF;
the auto-ingest path worked because mining routes through the backend
abstraction. Diary writes were the consistent failure surface.

Resolve the EF up front (matching `ChromaBackend._resolve_embedding_function`)
and pass it into both reopen branches. Falls back to the chromadb default
only if `mempalace.embedding.get_embedding_function` itself raises.

Regression test patches the chromadb client class to capture
`embedding_function=` on every `get_collection` / `create_collection`
call from `_get_collection(create=True)` and `_get_collection()`, and
fails if any call omits it.

Follow-up to MemPalace#1262 / MemPalace#1289 (which fixed the metadata-mismatch SIGSEGV
path); this addresses the EF-mismatch SIGSEGV path on the same surface.
- Resolve the EF inside the two reopen branches that actually call
  `client.get_collection` / `client.create_collection`, so warm-cache
  reads stay zero-cost (no `MempalaceConfig()` / `_resolve_providers`
  on every tool call).
- Reuse `ChromaBackend._resolve_embedding_function()` instead of
  duplicating its try/except + log message + None-fallback.
- Reword the inline + CHANGELOG explanation to clarify that ChromaDB 1.x
  persists the EF *identity* (its `name()`) but not the *instance/
  configuration* — `mempalace.embedding` documents this and spoofs
  `name()` to `"default"` precisely so the identity check passes; the
  bug was the *provider list* (lazy ONNX selection) silently differing.
MemPalace#1076) (MemPalace#1077)

Shell splits hook command on whitespace after variable expansion, breaking
paths with spaces (e.g. C:\Users\Richard M on Windows). Wrapping the path
in double quotes preserves the token boundary.

Fixes the reported Stop/PreCompact pair in .claude-plugin/hooks/hooks.json
and applies the same fix to .codex-plugin/hooks.json (SessionStart/Stop/
PreCompact), which carries the identical bug.
…ad them (MemPalace#1244)

`cmd_compress` was writing AAAK-compressed drawers to a `mempalace_compressed`
collection, but every read path (`palace.get_closets_collection`,
`searcher.py`, `repair.py`) reads from `mempalace_closets`. Result: for
non-mined palaces (or any palace where the user ran `mempalace compress`
expecting to backfill the closet/index layer), the compressed output was
silently invisible — written to a collection nothing else opens.

Fix the writer rather than renaming the readers: "closets" is the
user-visible feature name baked into the public API
(`get_closets_collection`), the searcher hybrid path, repair/HNSW
diagnostics, and docs. Renaming the readers would churn 15+ call sites
and the README for no benefit. The compressed AAAK strings are exactly
what closets are conceptually — compact pointers scanned by an LLM to
locate the right drawer — so they belong in `mempalace_closets`.

Tests:
- Update `test_cmd_compress_stores_results` to assert the collection
  name passed to `get_or_create_collection` is `mempalace_closets`.
- Add `test_cmd_compress_output_readable_via_get_closets_collection`:
  end-to-end with a real ChromaBackend, seed a drawer, run cmd_compress,
  then read back via the same `get_closets_collection` helper that
  palace.py / searcher use. Regression test for the wrong-collection
  bug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI requires ruff format --check on the whole touched file. Pre-existing drift, no logic change.
…alace#1243)

`tool_diary_write` stored the `agent` metadata verbatim after `sanitize_name`
(which preserves case), while `tool_diary_read` filtered by exact match —
so writing as "Claude" and reading as "claude" silently returned zero rows.

Both endpoints now lowercase `agent_name` immediately after sanitization.
The default per-agent wing slug is also stable across casings since it's
derived from the same normalized form.

Behavior change: entries written prior to this fix under mixed-case agent
names will not match the new lowercase filter; documented under v3.3.5
in CHANGELOG with a `mempalace repair` pointer.

Adds a regression test (`test_diary_read_case_insensitive_agent`) and
updates the existing `test_diary_write_and_read` to assert the new
lowercase agent identity.

Closes MemPalace#1243

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…event SIGSEGV on stale HNSW (MemPalace#1121, MemPalace#1132, MemPalace#1263)

PR MemPalace#1173 wired quarantine_stale_hnsw into the static make_client() helper
but not into the instance _client() method. As a result every non-MCP
entry point (CLI mining, search, repair, status) — which all use
get_collection / _get_or_create_collection / _client() — skipped the
cold-start quarantine pass and could SIGSEGV on a stale HNSW segment
left over from a partial flush, replicated palace, or crashed-mid-write.

Refactor: extract the (_fix_blob_seq_ids + gated quarantine_stale_hnsw)
pre-open pass into a single private static helper
ChromaBackend._prepare_palace_for_open(). Both make_client() and
_client() now route through it, so the _quarantined_paths once-per-
palace-per-process gate is preserved (no runtime thrash on hot paths)
and behaviour stays identical — the fix is purely about extending the
existing protection to the path that was missing it.

Tests:

- test_client_quarantines_corrupt_segment_on_first_open mirrors the
  existing make_client test and verifies _client() actually renames a
  corrupt segment on first open.
- test_client_quarantines_only_on_first_call_per_palace verifies the
  cache gate prevents re-running quarantine across repeated _client()
  calls — important because _client() is hit on every backend op.

Closes MemPalace#1121. Closes MemPalace#1132. Closes MemPalace#1263.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI requires whole-file format on touched files; pre-existing drift only.
The MCP `mempalace_status` tool was returning the server's absolute
`_config.palace_path` to any connected client on both the main
(ChromaDB-backed) path and the sqlite fallback path that runs when
HNSW divergence is detected (MemPalace#1222). On a single-user local deployment
this is self-disclosure, but in nested-agent or multi-server MCP
topologies the client is a separate trust domain and the absolute
path has no documented client-side use.

Clients that legitimately need the palace path continue to have three
documented channels: the `MEMPALACE_PALACE_PATH` env var (primary) or
its legacy `MEMPAL_PALACE_PATH` alias, the `~/.mempalace/config.json`
file, and the `--palace` CLI flag on most subcommands.

Also corrects stale docs that claimed `mempalace_reconnect` returned a
`palace_path` field; the code returns `{success, message, drawers,
vector_disabled[, vector_disabled_reason]}` on success, plus a no-palace
shape and an exception shape.

- mempalace/mcp_server.py: drop palace_path from tool_status() and
  _tool_status_via_sqlite() result dicts
- website/reference/mcp-tools.md: update documented return shapes for
  mempalace_status (fix) and mempalace_reconnect (stale-docs correction)

Authored-by: Aaron Salsitz (ICCI LLC, @icciaaron). Claude Code was used
as an authoring and review-orchestration tool, with human-in-the-loop
oversight at every step: Aaron wrote the prompts, reviewed each draft,
called for three independent review passes (drafting / post-rebase
technical / CISA-aligned disclosure-leak), and verified the final patch
behavior before commit.
The MCP `mempalace_get_drawer` tool returned the entire raw drawer
metadata blob to any connected client, and the `source_file` field
in that blob is the absolute filesystem path written by the miners
(`miner.py`, `convo_miner.py` — `source_file = str(filepath)`). On
a single-user local deployment this is self-disclosure, but in
nested-agent or multi-server MCP topologies the client is a separate
trust domain and the host's directory layout has no documented
client-side use.

Mirror the mitigation that `searcher.search_memories()` already applies
on its own return path: reduce `source_file` to its basename via
`Path(source_file).name` before handing the metadata to the client.
Citations still work — the directory layout does not leak.

Companion to #1 (omit palace_path from tool_status). Same threat class,
different surface:

- mempalace_status — palace dir path     → fixed in #1
- mempalace_get_drawer — per-drawer source_file path → this PR

Other read tools were audited and do not leak host paths:
- mempalace_search    — already basenames source_file
- mempalace_list_drawers — returns wing/room/preview only
- mempalace_diary_read   — date/timestamp/topic/content only
- mempalace_reconnect    — success/message/drawers only
- mempalace_kg_*         — entity/predicate strings, counts
- mempalace_check_duplicate — wing/room/preview only

Changes:
- mempalace/mcp_server.py: tool_get_drawer() now basenames metadata.source_file
- tests/test_mcp_server.py: regression test asserting the absolute path
  and its parent directory do not appear anywhere in the response
- website/reference/mcp-tools.md: clarify the documented return shape
…ing pool

Default search behavior is unchanged. Opt-in candidate_strategy="union"
also pulls top-K BM25-only candidates from sqlite FTS5 and merges them
into the rerank pool, catching docs with strong BM25 signal that the
vector index didn't surface in the over-fetch window.

Motivation
----------
The current hybrid path gathers candidates from the vector index only
(n_results * 3 over-fetch), then BM25-reranks within them. When the
query embeds close to the wrong content semantically, the right doc
never enters the rerank pool — *no matter how wide the over-fetch*.
Tested on a ~6K-document mixed corpus (knowledge prose + short structured
records): at *30x* over-fetch (~5% of the corpus) the target doc still
didn't surface for narrative-shaped queries targeting terminology guides.
Wider over-fetch isn't the answer; widening the pool's *source* is.

Concrete failure mode: a narrative-shaped query embeds close to records
sharing the same operational vocabulary (other narrative entries in the
corpus). A terminology / style guide is BM25-strong for the query
(rare keywords the guide repeats) but vector-distant. Vector-only
candidates don't include it; BM25 never gets to rerank it. The hybrid
path produces 0.00 recall on a probe that pure BM25 alone scores 1.00 —
the hybrid is worse than its component on the same input.

Behavior change
---------------
* New parameter ``candidate_strategy: str = "vector"`` on ``search_memories``.
  - ``"vector"`` (default): historical behavior, no change.
  - ``"union"``: also fetch top ``n_results * 3`` candidates via the
    existing ``_bm25_only_via_sqlite`` helper, dedupe by source_file,
    merge into the rerank pool. BM25-only candidates carry
    ``distance=None`` so they're scored on BM25 contribution alone
    (vec_sim coerces to 0).
* ``_hybrid_rank`` now handles ``distance=None`` explicitly, scoring
  such candidates as vector-unknown (vec_sim=0) rather than treating
  it as max-distance via shim.
* New strategies register via ``_CANDIDATE_MERGERS``; dispatch is in
  ``_apply_candidate_strategy`` so ``search_memories`` stays under the
  C901 complexity ceiling.

Bench numbers (~6K-doc internal mixed corpus, recall@10, 5 probes spanning
policy-exception lookup, temporal-decay, style retrieval, set-difference,
and pattern-recognition):

                              baseline ("vector")   "union"
  policy-exception probe        0.00                  0.50    +0.50
  temporal-decay probe          0.17                  0.50    +0.33
  style-retrieval probe         0.00                  1.00    +1.00 (PASSES)
  set-difference probe          0.00–0.06             0.06–0.09  ~
  pattern-recog probe           0.64 (stable)         0.50–0.71  variance, typ. +0.07
  macro recall                  0.16–0.17             0.51–0.56  +0.34 to +0.40

The pattern-recog variance points at a related issue worth a separate PR:
``_hybrid_rank`` computes BM25 IDF over the candidate set. Adding new
candidates re-normalizes BM25 for *existing* candidates non-monotonically.
Stable corpus-wide BM25 would remove this. Out of scope here.

Tests
-----
``tests/test_hybrid_candidate_union.py`` (6 tests, all pass):
- default behavior unchanged (explicit ``"vector"`` matches default)
- ``"union"`` surfaces a BM25-strong vector-distant doc
- ``"union"`` doesn't drop docs ``"vector"`` would have found
- empty-palace handling
- invalid ``candidate_strategy`` raises
- ``_hybrid_rank`` tolerates ``distance=None``

Existing ``test_hybrid_search.py`` (5) and ``test_searcher.py`` (27) pass.

Performance note
----------------
Each ``"union"`` query adds one sqlite open + FTS5 MATCH + metadata
fetch (via the existing ``_bm25_only_via_sqlite`` helper, which already
runs as the ``vector_disabled`` fallback path so the code is
well-trodden). Per-query overhead is small but unmeasured at corpus
scale. Default stays ``"vector"`` until a maintainer characterizes the
cost.
- Dedup union candidates by (full_path, chunk_index), not basename —
  two files sharing a basename in different dirs no longer collide,
  and a vector hit on chunk N of a file no longer blocks BM25 from
  contributing chunk M of the same file.
- Validate candidate_strategy at the top of search_memories so invalid
  values fail consistently, not only when the call routes through the
  vector path.
- Trim hits back to n_results after the union+rerank pool grows;
  preserves the existing search_memories size contract that the MCP
  limit parameter is built on.
- Skip BM25-only injection when max_distance > 0.0; BM25-only
  candidates carry distance=None and would silently bypass the
  caller's strict vector-distance threshold.

Adds 4 tests covering: validation under vector_disabled, n_results
trim, max_distance honoring, and basename-collision dedup.
…emPalace#1314)

`tool_kg_add` previously accepted only `valid_from` and `source_closet`,
silently dropping `valid_to`, `source_file`, and `source_drawer_id` at
the MCP boundary. Backfilling already-ended historical facts therefore
collapsed to "still current," and adapter provenance never reached
the SQLite layer even though `KnowledgeGraph.add_triple` already
supported every column.

`tool_kg_invalidate` returned the literal string `"today"` whenever the
caller omitted `ended`, hiding the actual stamped date from anyone trying
to verify what got persisted.

Changes:
- Extend `tool_kg_add` signature + MCP input_schema with `valid_to`,
  `source_file`, `source_drawer_id`; forward all of them to
  `_kg.add_triple` and to the WAL log.
- Resolve `ended` to `date.today().isoformat()` in `tool_kg_invalidate`
  before logging / returning, so the response always reports the actual
  date stored in `valid_to`.
- Add regression tests for valid_to round-trip, source_file /
  source_drawer_id provenance, and the resolved-ended-date contract.
- Leave TODO(MemPalace#1283) markers so the open ISO-8601 validation PR can drop
  `validate_iso_date` over `valid_from` / `valid_to` / `ended` cleanly.

The underlying `KnowledgeGraph.add_triple` already accepted these
kwargs (RFC 002 §5.5) — only the MCP edge needed wiring up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The repo's anti-jargon meta-test bans §N markers outside the
sources/backends allowlist. mcp_server.py isn't allowlisted, so the
"RFC 002 §5.5" references added in this PR turned the test red.
Trim to "RFC 002" — section number isn't load-bearing for the description.
cmd_init was instantiating MempalaceConfig() unconditionally, ignoring
args.palace and always writing the palace under ~/.mempalace. Mirror
the env-var pattern used by mcp_server.py (and consistent with how
cmd_mine / cmd_status / cmd_search resolve --palace) so every
downstream read of cfg.palace_path inside cmd_init — Pass 0,
cfg.init(), and the post-init mine — routes to the user-specified
location.

Adds tests/test_cli.py::test_cmd_init_honors_palace_flag covering the
regression: asserts Pass 0 receives the --palace value (not
~/.mempalace) and that MEMPALACE_PALACE_PATH is set in os.environ.

Closes MemPalace#1313.
monkeypatch.delenv(name, raising=False) on a missing key registers no
undo entry, so the env var cmd_init writes leaked into test_config_from_file
on Python 3.13 / Windows / macOS.

Prime the slot with setenv before delenv so teardown rolls back the write.
@rglaubitz rglaubitz force-pushed the codex/metadata-passthrough branch from 0d68ac4 to 7a926ae Compare May 26, 2026 09:03
@rglaubitz rglaubitz marked this pull request as ready for review May 26, 2026 09:03

@rglaubitz rglaubitz left a comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — request changes

Three blockers before merge, all narrowly scoped and patchable on this branch:

Contract correctness

  • F01tool_get_drawer.metadata and tool_search.results[i].metadata return different shapes for the same field name. tool_search decodes the caller's opaque metadata via decode_drawer_metadata; tool_get_drawer returns the raw Chroma metadata dict (with the caller payload still a JSON string under an inner metadata key). Downstream consumer (audax-apex Track C T3) is normalizing hits from search; if any path also goes through get_drawer, they'll silently get a different shape. Fix in one place, document the intended response shape in either case. See inline comment on mempalace/mcp_server.py for the recommended shape.

Backwards-compat / silent data loss

  • F10mempalace_compressed → mempalace_closets rename (MemPalace#1244) has no migration. palace.get_closets_collection reads only from the new name; existing palaces with compressed-drawer data become silently unreadable. mempalace repair --mode does not have a closets-rename mode (verified — only legacy and max-seq-id). Either add a migration mode, fall back to the old name on empty-new, or document the breakage as "re-mine required" in the CHANGELOG entry.
  • F11 — Diary case-insensitivity fix (MemPalace#1243) CHANGELOG says "run mempalace repair if you need to migrate legacy diary metadata", but no diary-case mode exists in mempalace repair. The promised remediation does not ship with this PR. Either implement the mode, or amend the CHANGELOG.

Scope note: this PR is effectively develop → main + the T2 metadata passthrough on top — at least 11 distinct changes bundled together. If main is the release branch and develop is the integration branch, this ships a 3.4.0 release with T2 piggybacking, not a feature PR. Worth confirming framing before merging.


MED findings (action recommended, not blocking)

  • F02metadata={} and metadata=None are distinguishable on round-trip (encode({}) → "{}", decoded back to {}). Document this in the API ref; T3 sink should send None for "no provenance," never {}.
  • F03decode_drawer_metadata silently returns None on JSON parse failure. A corrupted-metadata drawer becomes indistinguishable from a legacy drawer with no metadata key. Under a downstream policy that treats None as "legacy, allow as fallback," a corrupted injectable=False drawer silently becomes injectable. At minimum logger.warning with drawer context; consider a sentinel like {"_corrupted": True} so downstream policy can fail closed.
  • F06 — First-write-wins for metadata works via the deterministic drawer-ID idempotency path (mcp_server.py:856-861) but is untested. Add a test: same wing+room+content twice with different metadata, second metadata must be silently dropped without an error.
  • F14cmd_init mutates os.environ["MEMPALACE_PALACE_PATH"] in production code (not just tests) and never restores. Benign for one-shot CLI; leaks across calls if invoked twice in one process.
  • F15_get_collection EF-binding (MemPalace#1299 fix) no-ops if ChromaBackend._resolve_embedding_function() returns None (kwargs dict ends up empty, falls back to the old buggy call without embedding_function=). When does the resolver return None? If never in production, surface the assumption inline; if sometimes (dev fallback), the fix doesn't fully cover.
  • F17 — No test asserts that tool_get_drawer and tool_search return the same metadata shape on the same drawer. Would catch F01.

LOW / cosmetic

  • F04decode_drawer_metadata accepts dict passthrough (drawer_metadata.py:22-23). Defensive branch for test fixtures only; asymmetric with encoder. Drop or document.
  • F05json.dumps on deeply-nested dict raises RecursionError, not caught by except (TypeError, ValueError) in tool_add_drawer. Caller sees an uncaught exception rather than {success: False, error: ...}.
  • F07 — Opaque metadata payload is a new leak surface (callers can stash absolute paths inside). MemPalace cannot scrub it; document that callers are responsible.
  • F08tool_kg_add.source_file accepted verbatim; later returned by tool_kg_query. Same threat class as the tool_get_drawer basename-redaction fix this PR ships. Mirror or document.
  • F09except Exception as e: return {"error": str(e)} patterns may leak palace paths / Chroma internals in error strings. Pre-existing.
  • F12tool_kg_invalidate WAL log format changed (resolved date now logged, not raw input). Note in CHANGELOG.
  • F16ChromaBackend._quarantined_paths class-level mutable set, no lock. _prepare_palace_for_open now called from both make_client and _client. Benign for single-process MCP server; defense-in-depth threading.Lock if multi-threaded callers ever appear.
  • F19–F24 — Test gaps on edge cases: metadata={} distinct-from-None, non-dict-error response, non-JSON-serializable values, size bounds, legacy mempalace_compressed readability, pre-fix diary case-mismatch repair.

Methodology: adversarial review across seven attack lanes (contract conformance, encoding correctness, info leakage, backwards compat, concurrency, test coverage, scope discipline). Hoare-style preconditions / postconditions / invariants per public API.

KG schema migration (valid_to, source_file, source_drawer_id columns) was verified clean — knowledge_graph.py:112-113 auto-applies ALTER TABLE. Candidate-union BM25 mode (searcher.py) was reviewed and the dedup / max_distance interaction looks correct.

Comment thread mempalace/mcp_server.py Outdated
"metadata": meta,
"wing": safe_meta.get("wing", ""),
"room": safe_meta.get("room", ""),
"metadata": safe_meta,

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F01 (HIGH) — metadata shape drift between get_drawer and search

This returns metadata = dict(meta), i.e., the full Chroma metadata dict (wing, room, source_file, chunk_index, added_by, filed_at, plus the caller's metadata JSON STRING under an inner key).

Compare to searcher.py:836:

"metadata": decode_drawer_metadata(meta.get(DRAWER_METADATA_FIELD)),

search returns the decoded caller dict under metadata. So:

  • tool_search()["results"][0]["metadata"]["source_class"] → works.
  • tool_get_drawer(id)["metadata"]["source_class"] → KeyError. Consumer must do json.loads(get_drawer(id)["metadata"]["metadata"])["source_class"].

The leak-probe test below (test_get_drawer_does_not_leak_absolute_source_file_path) asserts result["metadata"]["source_file"] == "notes.md" — confirming consumers currently use get_drawer.metadata as the system metadata dict. That convention conflicts directly with the new metadata field the Track C contract expects.

Recommended fix (pick one, document):

  1. Decode the inner metadata in tool_get_drawer and return as a sibling of wing / room. System fields become top-level keys; metadata is the caller dict (matching search). This is closer to the audax-apex contract.
  2. Keep the current get_drawer shape but rename the caller field everywhere (drawer_metadata? passthrough?).

Either way, add a test that asserts shape parity between get_drawer and search for the same drawer (F17).

Threat-model framing: same-user-agent. The concern is API consistency for the T3 consumer, not adversarial access. Don't add sanitization/validation; just align the shape.

Comment thread mempalace/cli.py
if not args.dry_run:
try:
comp_col = backend.get_or_create_collection(palace_path, "mempalace_compressed")
comp_col = backend.get_or_create_collection(palace_path, "mempalace_closets")

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F10 (HIGH) — mempalace_compressed → mempalace_closets rename has no migration

palace.get_closets_collection (palace.py:68) reads from mempalace_closets. Existing palaces written by 3.3.x have compressed-drawer data in mempalace_compressed — they become silently invisible to the reader after this rename.

Verified: mempalace repair --mode only accepts legacy and max-seq-id (cli.py:1197-1202). No closets-rename migration mode ships with this PR.

Fix options (pick one):

  1. Add --mode closets-rename to mempalace repair: rename the collection in place, or copy old→new and drop the old.
  2. Make get_closets_collection fall back to mempalace_compressed if mempalace_closets is absent or empty, with a one-time warning log.
  3. Document the breakage explicitly in the [3.3.5] CHANGELOG entry: "Existing compressed-drawer data is not auto-migrated; run a fresh mempalace compress to repopulate the new collection."

Currently neither the CHANGELOG nor the diff acknowledges the loss path. For a 3.4.0 minor bump that's a surprise.

Comment thread CHANGELOG.md

## [3.3.5] — unreleased

### Bug Fixes

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F11 (HIGH) — promised migration tool does not exist

This entry advertises mempalace repair as the migration path for legacy mixed-case diary entries. Verified against cli.py:1197-1202: the only --mode values accepted are legacy and max-seq-id. No diary-case mode exists.

Pre-fix diary entries under agent_name = "Claude" are unreachable under the new .lower() filter (mcp_server.py:1212, 1276), and the recommended remediation does not ship.

Fix options (pick one):

  1. Implement --mode diary-case: scan diary drawers, lowercase the stored agent metadata in place, optionally fix the default-wing slug derived from it.
  2. Amend this CHANGELOG entry: drop the mempalace repair claim, replace with "legacy entries written under mixed-case agent names are not migrated; re-write the affected entries or accept the loss."

Either resolution is fine; the current state is a CHANGELOG promise the codebase can't honor.

rglaubitz added 2 commits May 26, 2026 10:39
F03 now logs corrupted drawer metadata decode failures through the mempalace_mcp logger, truncating the stored value to value_head so operators get an identifier without leaking the full payload.

F15 is Case B: ChromaBackend._resolve_embedding_function() can return None because it catches any exception from get_embedding_function() and returns None after logging; ChromaBackend.get_collection() mirrors the same conditional kwargs behavior and does not repair that case.

For the MCP _get_collection call site, treat a None EF as a fail-closed condition and never reopen a collection without embedding_function=, so the MemPalace#1299 default-EF regression path cannot silently reappear there.

Validation: uv run pytest -q; uv run ruff check .
@rglaubitz rglaubitz merged commit bfd2fb2 into main May 26, 2026
@rglaubitz rglaubitz deleted the codex/metadata-passthrough branch May 26, 2026 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants