Skip to content

fix(mcp): tolerate None metadata cells (#1426)#1445

Merged
igorls merged 2 commits into
MemPalace:developfrom
yonefive71:fix/1426-none-metadata-mcp-handler
May 17, 2026
Merged

fix(mcp): tolerate None metadata cells (#1426)#1445
igorls merged 2 commits into
MemPalace:developfrom
yonefive71:fix/1426-none-metadata-mcp-handler

Conversation

@yonefive71

Copy link
Copy Markdown
Contributor

Summary

Fix AttributeError: 'NoneType' object has no attribute 'get' raised by several MCP handlers when ChromaDB returns None in a metadata cell. The crash happens before the embeddings_queue cleanup step, so the queue keeps growing while writes appear successful — exactly the failure mode described in the issue.

Closes #1426.

What's changing

Add a small _safe_meta() boundary helper in mempalace/mcp_server.py that coerces None (or any other non-dict) to {}, and route every direct metadatas[i] / metadatas[0] indexed read through it:

  • tool_check_duplicate (was already using or {}; switched to the helper for consistency)
  • tool_delete_drawer — audit-log path (this is the one that bricked the cleanup step in the issue)
  • tool_get_drawer
  • tool_list_drawers
  • tool_update_drawer
  • tool_diary_read

The contract — metadata is always a dict by the time it leaves the boundary — is documented on the helper and self-documents at each call site.

Why a helper instead of inline or {}

x or {} looks the same but coerces False, 0, "", and [] to {} too, which would silently swallow truly malformed rows. isinstance(meta, dict) else {} is precise about what we tolerate (None / non-dict garbage) without masking other shape bugs that should surface as test failures. Centralizing the coercion also gives one obvious place to log/alert if we want to track how often it fires in the wild.

Test plan

New regression class tests/test_mcp_server.py::TestNoneMetadataSafety (5 tests):

  1. _safe_meta unit test — None, {}, dict, str, list all coerce correctly.
  2. tool_get_drawer with metadatas=[None] — returns empty wing/room, no crash.
  3. tool_list_drawers mixed [None, {...}] — first drawer empty, second intact.
  4. tool_update_drawer with None old metadata + new wing — succeeds, new wing slotted in cleanly.
  5. tool_delete_drawer audit-log path with None metadata — reaches the col.delete(...) call (the exact behavior bricked in the issue).

Each test uses a stub collection because Chroma's write path rejects None at insert time, so we can't reproduce the upstream state through the real backend — mocking _get_collection lets us assert handler tolerance for the failure mode that actually shows up in production.

$ uv run pytest tests/test_mcp_server.py -q
113 passed in 14.56s

ruff check + ruff format --check clean.

Out of scope (filed in the issue as "Proposed Fixing Paths")

  • Path 2 — write-side interception. Worth doing as a follow-up so corrupted rows can't enter in the first place, but Chroma already rejects None at insert time on current versions; the rows that trigger [BUG] MemPalace MCP Handler AttributeError when metadata is None (Queue Stuck) #1426 in the wild are pre-existing partially-flushed entries from older formats or interrupted writes. Read-side defense is what unblocks affected users today.
  • Manual DELETE FROM embeddings_queue cleanup script. The issue mentions running it as a mitigation; that's a separate operational tool (probably belongs near repair.py), not in scope here.

Happy to follow up with either if maintainers want them folded into the same PR or split out.

ChromaDB's col.get() and col.query() can return None for the metadata
cell of a partially-flushed row, or for any row written without metadata
in older formats. The MCP handler then crashes with:

    AttributeError: 'NoneType' object has no attribute 'get'

This crashed the path before the embeddings_queue cleanup step, so the
queue grew without bound while writes kept appearing successful.

Add a small _safe_meta() boundary helper that coerces None (or any other
non-dict) to {}, and route every direct metadatas[i]/metadatas[0] read in
mcp_server through it: tool_get_drawer, tool_list_drawers,
tool_update_drawer, tool_delete_drawer (audit-log path),
tool_check_duplicate, and tool_diary_read.

The contract — *metadata is always a dict by the time it leaves the
boundary* — is documented on the helper and self-documents at each call
site.

Regression coverage: tests/test_mcp_server.py::TestNoneMetadataSafety
covers the helper and each affected handler with a stub collection that
returns None metadatas (Chroma rejects None at write time, so we can't
reproduce upstream state through the real backend).

Closes MemPalace#1426.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a _safe_meta helper function to robustly handle None values in ChromaDB metadata, preventing AttributeError crashes in various tool functions. It also includes a new suite of regression tests to ensure metadata safety. Review feedback highlights that the documents field is similarly vulnerable to None values in several locations, which could cause TypeError during string operations like slicing for logs or previews. It is recommended to coerce these document values to empty strings to ensure consistency and prevent runtime errors.

Comment thread mempalace/mcp_server.py
Comment on lines +1056 to +1058
deleted_meta = _safe_meta(
existing.get("metadatas", [{}])[0] if existing.get("metadatas") else {}
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

While this correctly handles None metadata, the deleted_content variable on line 1055 is also vulnerable to being None in the same "partially-flushed row" scenario. If deleted_content is None, the _wal_log call at line 1064 will raise a TypeError when attempting to slice it (deleted_content[:200]).

Additionally, the deleted_meta assignment can be simplified using a more idiomatic Python pattern to handle None, empty lists, or missing keys.

    deleted_meta = _safe_meta((existing.get("metadatas") or [{}])[0])

Comment thread mempalace/mcp_server.py
drawers = []
for i, did in enumerate(result["ids"]):
meta = result["metadatas"][i]
meta = _safe_meta(result["metadatas"][i])

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The doc variable assigned on the next line (line 1181) can be None. If it is, the content_preview calculation at line 1187 will raise a TypeError when attempting to slice or check the length of None. Please ensure doc is coerced to an empty string (e.g., doc = result["documents"][i] or "").

Comment thread mempalace/mcp_server.py
return {"success": False, "error": f"Drawer not found: {drawer_id}"}

old_meta = existing["metadatas"][0]
old_meta = _safe_meta(existing["metadatas"][0])

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The old_doc variable on the next line can be None. This will cause a crash in the _wal_log call at line 1247 if content is not provided (as new_doc will inherit the None value and then be sliced). Please coerce old_doc to an empty string.

Comment thread mempalace/mcp_server.py
if not result["ids"]:
return {"error": f"Drawer not found: {drawer_id}"}
meta = result["metadatas"][0]
meta = _safe_meta(result["metadatas"][0])

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The doc variable assigned on the next line (line 1121) can also be None if the row is partially flushed. While it doesn't cause an immediate crash here, it results in a null value in the tool output, which may be unexpected for the client. For consistency with tool_check_duplicate and to ensure robustness, please coerce it to an empty string.

Comment thread mempalace/mcp_server.py
# Combine and sort by timestamp
entries = []
for doc, meta in zip(results["documents"], results["metadatas"]):
meta = _safe_meta(meta)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

While fixing metadata safety, we should also ensure doc is a string. If results['documents'] contains None, the result will contain a null content field. Coercing it here ensures the tool output remains consistent with the expected schema.

            doc = doc or ""
            meta = _safe_meta(meta)

@jphein jphein added area/mcp MCP server and tools bug Something isn't working labels May 11, 2026
@igorls igorls added this to the v3.3.6 milestone May 17, 2026
)

CI pins ruff>=0.4.0,<0.5 (resolves 0.4.10); the PR's two new assert
blocks were laid out in ruff-0.5+ style, so `ruff format --check .`
failed. Reformat with ruff 0.4.10 (the exact CI version) — assertion
layout only, no semantic change. `ruff check .` + `ruff format --check .`
both pass under 0.4.10.
@igorls igorls changed the base branch from main to develop May 17, 2026 19:19
@igorls igorls merged commit cabc3af into MemPalace:develop May 17, 2026
6 checks passed
pull Bot pushed a commit to FaZios/mempalace that referenced this pull request May 18, 2026
CI installed `ruff>=0.4.0,<0.5` (resolves 0.4.10) while contributors
run modern ruff (0.5+). ruff 0.5 changed assert-message formatting, so
valid code formatted by a current ruff fails CI's `ruff format --check .`
on layout alone. This blocked MemPalace#1438, MemPalace#1445, and MemPalace#1528 — each needing a
manual reformat-and-push cycle with no actual code defect.

Pin an exact, modern ruff in both pyproject dev lists and in ci.yml so
CI and `pip install -e ".[dev]"` format/lint identically. No new
`ruff check` violations under the existing E/F/W/C901 select.
arnoldwender pushed a commit to arnoldwender/mempalace that referenced this pull request May 24, 2026
Bumps version 3.3.5 → 3.3.6 across pyproject.toml, version.py, plugin
manifests (.claude-plugin/plugin.json, .claude-plugin/marketplace.json,
.codex-plugin/plugin.json), README badge, and uv.lock. Flips CHANGELOG.md
from ``[Unreleased]`` to ``[3.3.6] — 2026-05-24`` and backfills the
major user-facing entries that landed without changelog entries during
the cycle:

Features:
- MemPalace#1555 office-document mining via --mode extract + virtual line numbers
- MemPalace#1584 surgical closet pointers with date+line locators (Tier 6a)
- MemPalace#1558 + MemPalace#1560 within-wing hallways (entity co-occurrence graph)
- MemPalace#1565 cross-wing tunnels auto-promoted from hallways
- MemPalace#1578 Hebbian potentiation + Ebbinghaus decay on hallways/tunnels
- MemPalace#1236 API-tool transcripts auto-route to wing_api
- MemPalace#711 hooks.auto_save toggle for silent-mode sessions
- MemPalace#1605 COCA content-word filter for entity detection
- MemPalace#1557 case-insensitive entity matching at mine time
- MemPalace#1483 multilingual embeddings (embeddinggemma-300m) by default

Bug Fixes (selected, user-visible):
- MemPalace#1540 silent data loss in three unchunked upsert sites
- MemPalace#1538 paragraph chunker oversized chunks
- MemPalace#1554 per-file chunk cap too low for transcripts
- MemPalace#1562 Windows hook subprocess/ChromaDB deadlock
- MemPalace#1529 create_tunnel corrupted hyphenated wing names
- MemPalace#1424 save-hook truncated hyphenated project folders
- MemPalace#1383 KG cache duplicated graphs for symlinked/cased paths
- MemPalace#1466 silent symlink skip now logged
- MemPalace#1441 macOS stock-bash 3.2 hook compatibility
- MemPalace#1500 / MemPalace#1513 structured JSON-RPC errors on bad MCP input
- MemPalace#1523 VACUUM + FTS5 rebuild after repair
- MemPalace#1548 FTS5 validation at end of mine
- plus MemPalace#1216, MemPalace#1408, MemPalace#1438, MemPalace#1439, MemPalace#1445, MemPalace#1452, MemPalace#1459, MemPalace#1461, MemPalace#1466,
  MemPalace#1470, MemPalace#1477, MemPalace#1485, MemPalace#1500, MemPalace#1513, MemPalace#1528, MemPalace#1532, MemPalace#1543, MemPalace#1546, MemPalace#1585

Performance:
- MemPalace#1474 convo miner pre-fetches mined-set
- MemPalace#1487 rebuild_index progress callback
- MemPalace#1530 MCP cold-start diagnostics + opt-in warmup

Lint passes (ruff 0.15.14); mempalace-mcp entry point alignment
verified per RELEASING.md.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/mcp MCP server and tools bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] MemPalace MCP Handler AttributeError when metadata is None (Queue Stuck)

3 participants