Skip to content

fix(searcher): guard against None metadata/doc in search result loops#1019

Merged
igorls merged 1 commit into
MemPalace:developfrom
cantenesse:swe/session-1-bug-mempalace-search-crashes-with-attrib
May 6, 2026
Merged

fix(searcher): guard against None metadata/doc in search result loops#1019
igorls merged 1 commit into
MemPalace:developfrom
cantenesse:swe/session-1-bug-mempalace-search-crashes-with-attrib

Conversation

@cantenesse

Copy link
Copy Markdown

Problem

mempalace search crashes with AttributeError: 'NoneType' object has no attribute 'get' when ChromaDB returns None for a metadata or document entry. This happens in several edge-case states:

  • Partial-flush — HNSW index has an entry but embedding_metadata hasn't materialized yet
  • Mid-delete — metadata rows purged before HNSW entry removed
  • Upgrade boundaries — palace written by one schema, read by another
  • Interrupted mines — process killed between segment writes

Fixes #1007, #1011

Fix

Add meta = meta or {} and doc = doc or "" defensive guards at the top of the three result-iteration loops in searcher.py:

  1. search() display loop (line ~284) — guards meta.get() and doc.strip() calls
  2. search_memories() closet hybrid loop (line ~368) — guards cmeta.get()
  3. search_memories() drawer scored loop (line ~388) — guards meta.get()

Results with None metadata degrade gracefully (showing ? placeholders) rather than crashing.

Testing

The fix is a two-line defensive guard per loop — verifiable by mocking a ChromaDB query response with {"documents": [[None]], "metadatas": [[None]], "distances": [[0.5]], "ids": [["x"]]} and confirming search() completes without raising.

@cantenesse cantenesse changed the base branch from main to develop April 18, 2026 20:10
eldar702 added a commit to eldar702/mempalace that referenced this pull request Apr 19, 2026
Chroma 1.5.x can return ``None`` inside the ``metadatas`` / ``documents``
lists of a query/get result for partially-flushed rows. The codebase
already has a systemic None-guard pattern (merged MemPalace#999, MemPalace#1013, MemPalace#1019)
but three call sites were still unguarded:

* ``mcp_server.tool_check_duplicate`` (``mcp_server.py:487-488``) —
  ``meta = results["metadatas"][0][i]`` followed by ``meta.get(...)``
  raises ``AttributeError: 'NoneType' object has no attribute 'get'``.
  The broad ``except Exception`` wrapper (line 504) swallows it and
  returns an uninformative ``"Duplicate check failed"``.

* ``layers.Layer1.generate`` (``layers.py:126``) — iterates
  ``zip(docs, metas)`` and calls ``meta.get(key)`` in the importance
  loop. A single None metadata blows up the entire wake-up render.

* ``layers.Layer2.retrieve`` (``layers.py:224``) — same pattern, same
  crash path for the on-demand render.

Apply the same ``meta = meta or {}`` / ``doc = doc or ""`` idiom used
by the merged guards in the search path. Three-line additions, no
behaviour change on well-formed results.

Tests added:

* ``test_check_duplicate_handles_none_metadata`` — mocks the collection
  query to return ``None`` for one metadata and document, asserts the
  call does not crash and the sentinel-rendered entry has wing/room "?"
  and empty content.
* ``test_layer1_handles_none_metadata`` / ``_handles_none_document``
* ``test_layer2_handles_none_metadata``

Relationship to other open PRs:

* **MemPalace#1019** guarded ``searcher.py`` loops. This PR extends the same
  guard to the three call sites MemPalace#1019 did not touch.
* **MemPalace#979** fixed ``tool_check_duplicate`` negative similarity but left
  the None-metadata path unguarded.
* Does not overlap **MemPalace#1013** (``Layer3.search_raw``) or **MemPalace#999**.
@igorls igorls added bug Something isn't working area/search Search and retrieval labels Apr 24, 2026
@igorls igorls added this to the v3.3.5 milestone May 2, 2026
ChromaDB can return None entries in metadatas/documents lists under
partial-flush, mid-delete, upgrade-boundary, and interrupted-mine
states. Add `meta = meta or {}` and `doc = doc or ""` guards in the
three result loops (search display, closet hybrid, drawer scored) so
.get() and .strip() calls never crash on None.

Fixes MemPalace#1007, MemPalace#1011
@igorls igorls force-pushed the swe/session-1-bug-mempalace-search-crashes-with-attrib branch from fd34a38 to 733e435 Compare May 6, 2026 04:59
@igorls

igorls commented May 6, 2026

Copy link
Copy Markdown
Member

Rebased onto develop (post-#1029 + #1030 hybrid-rank refactor and clamp). Conflict in searcher.py: the metadata-None guard from this PR landed upstream as part of the hits-builder list comp (meta or {} on the hits=[...] line). Final delta from this PR is now narrowly:

  1. Add doc or "" alongside meta or {} at the same hits-builder line
  2. Keep the meta = meta or {} / doc = doc or "" guards in search_memories (separate loop, not refactored)

pytest tests/test_searcher.py: 27 passed locally. Lint/format clean.

@igorls igorls merged commit 01880f6 into MemPalace:develop May 6, 2026
6 checks passed
xcarbo added a commit to xcarbo/mempalace that referenced this pull request May 7, 2026
Catches up on a heavy upstream day — 22 fixes merged in 24h plus prior
backlog. Highlights pulled in:

- MemPalace#1305 hooks: ~/.mempalace/ deletion is now a stable kill-switch (hooks
  no longer rebuild the dir hierarchy on Stop/PreCompact/SessionStart)
- MemPalace#1214 KG: reject inverted intervals (valid_to < valid_from) at write time —
  prevents silently invisible triples
- MemPalace#1067/MemPalace#1105 chroma: ChromaBackend.close_palace() now actually releases
  the SQLite file lock (PersistentClient.close() on evict + invalidation)
- MemPalace#1215 entity_registry: atomic save (tmp+fsync+rename) — no more
  corruption on crash mid-write
- MemPalace#1073/MemPalace#1107 mempalace compress: paginated drawer fetch — no longer
  trips SQLITE_MAX_VARIABLE_NUMBER on palaces >32k drawers
- MemPalace#1282 stdio: Windows console UTF-8 reconfig for cli/mcp_server/hooks_cli
- MemPalace#1164/MemPalace#1167 mcp KG: sanitize_iso_date() blocks malformed date strings
  silently producing empty result sets
- MemPalace#1136/MemPalace#1160 mcp: per-path KG cache for multi-tenant hosts that rotate
  MEMPALACE_PALACE_PATH between tool calls
- MemPalace#1286 mcp: retry _get_collection() once on transient failure
- MemPalace#1138 lint cleanup, MemPalace#1019 search-crash fix
- 4 new tools/ scripts (backup_claude_jsonls, find_orphan_claude_jsonls,
  render_jsonl, save.md)

Conflict resolution (CHANGELOG.md only — code files all auto-merged):

- 3.3.5 section: untouched (already merged in our prior commit; upstream
  added several new bug-fix entries which auto-merged cleanly)
- 3.3.4 Bug Fixes: kept upstream's new MemPalace#1305 entry; preserved our richer
  detail on topic-tunnels (MemPalace#1194/MemPalace#1195/MemPalace#1197), HNSW-bloat (MemPalace#1191),
  max_seq_id (MemPalace#1135), and auto-ingest (MemPalace#1230/MemPalace#1231) — upstream's shorter
  topic-tunnels entry was a strict subset of ours.

xdev patches preserved (still on this branch, untouched by merge):
- 6ef44cb fix(hooks): route CC transcripts via convo_miner with cwd-based wings
- 3fad61d fix(config): allow leading dash in wing names
- 3fc821a fix(config): tighten leading-char to allow dash but not underscore

Tests: 1557 passed, 1 skipped (full unit suite excluding benchmarks).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/search Search and retrieval bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: mempalace search crashes with AttributeError: 'NoneType' object has no attribute 'get' on partial-state results (searcher.py:286)

2 participants