Skip to content

fix(mcp): harden error messages and add read-only server mode#181

Open
travisbreaks wants to merge 1 commit into
MemPalace:developfrom
travisbreaks:fix/mcp-error-hardening
Open

fix(mcp): harden error messages and add read-only server mode#181
travisbreaks wants to merge 1 commit into
MemPalace:developfrom
travisbreaks:fix/mcp-error-hardening

Conversation

@travisbreaks

Copy link
Copy Markdown
Contributor

Summary

Error hardening

  • Replaces 5 instances of str(e) in MCP tool error responses with generic messages
  • Raw exception strings can leak internal file paths, DB schema details, and ChromaDB internals to MCP clients
  • Full exception details are still logged to stderr via logger.exception() for operator debugging

Read-only mode

  • Adds --read-only CLI flag: python -m mempalace.mcp_server --read-only
  • Write tools (add_drawer, delete_drawer, kg_add, kg_invalidate, diary_write) return a JSON-RPC -32600 error in read-only mode
  • Write tools are hidden from tools/list in read-only mode
  • Read tools continue to work normally

This is useful for shared deployments where the MCP server should expose palace data to AI agents without allowing mutations.

Context

Cherry-picked from the security items in PR #34, which was asked to be split into focused PRs. The error hardening and read-only mode are the remaining unmerged security improvements from that PR.

Test plan

  • pytest tests/ -v passes (4 new tests, 114 total)
  • ruff check and ruff format --check pass
  • New tests cover:
    • Read-only mode blocks all write tools
    • Read-only mode allows read tools
    • Read-only mode hides write tools from tools/list
    • Error responses don't leak exception strings

🤖 Generated with Claude Code

Security:
- Replace 5 instances of `str(e)` in MCP tool error responses with
  generic messages. Raw exception strings can leak internal file paths,
  DB schema details, and ChromaDB internals to MCP clients.
- Full exception details are still logged to stderr via logger.exception()
  for debugging.

Read-only mode:
- Add `--read-only` CLI flag to mcp_server.py
- In read-only mode, write tools (add_drawer, delete_drawer, kg_add,
  kg_invalidate, diary_write) return a JSON-RPC error and are hidden
  from tools/list
- Read tools continue to work normally

Tests:
- 4 new tests: read-only blocks writes, allows reads, hides write tools
  from listing, and error responses don't leak exception strings

Co-Authored-By: Tadao <tadao@travisfixes.com>
@bgauryy

bgauryy commented Apr 8, 2026

Copy link
Copy Markdown

PR Review: fix(mcp): harden error messages and add read-only server mode

Executive Summary

Aspect Value
PR Goal Replace str(e) error leaks in MCP tool responses with generic messages, add --read-only server mode
Files Changed 2 (mempalace/mcp_server.py +44/-12, tests/test_mcp_server.py +87/-0)
Risk Level 🟡 MEDIUM — error hardening is clean, but fork is based on stale main introducing regressions
Review Effort 3/5 — requires comparing fork state vs current main to isolate intentional changes
Recommendation 🔄 REQUEST_CHANGES

Affected Areas: mcp_server.py (tool handlers, handle_request, main()), test_mcp_server.py (4 new test classes)

Business Impact: Prevents leaking internal paths/schema details to MCP clients; enables safe read-only deployments for shared/demo palaces.

Flow Changes: Write tools (add_drawer, delete_drawer, kg_add, kg_invalidate, diary_write) blocked in read-only mode. tools/list hides them. Error responses no longer expose exception internals.

Ratings

Aspect Score
Correctness 3/5
Security 5/5
Performance 2/5
Maintainability 4/5

PR Health

  • Has clear description
  • References ticket/issue (if applicable)
  • Appropriate size (or justified if large)
  • Has relevant tests (if applicable)

High Priority Issues

(Must fix before merge)

🐛 #1: _get_collection() lost client/collection caching — performance regression

Location: mempalace/mcp_server.py:_get_collection | Confidence: ✅ HIGH

The fork's _get_collection() creates a new chromadb.PersistentClient on every call. Current main caches both the client and collection in module globals (_client_cache, _collection_cache), so repeated tool invocations reuse the connection. This PR silently removes that caching — likely because the fork was branched before caching was added. Every MCP tool call will now pay the PersistentClient init cost, which is significant for ChromaDB (file locks, HNSW index load).

- def _get_collection(create=False):
-     """Return the ChromaDB collection, or None on failure."""
-     try:
-         client = chromadb.PersistentClient(path=_config.palace_path)
-         if create:
-             return client.get_or_create_collection(_config.collection_name)
-         return client.get_collection(_config.collection_name)
-     except Exception:
-         return None
+ # Rebase onto current main to preserve the _client_cache / _collection_cache pattern.
+ # Only change the error-handling lines within the existing _get_collection().

Fix: Rebase the branch onto current main so caching is preserved. Apply only the intended error-hardening + read-only changes.


🐛 #2: tool_add_drawer unintentionally changed duplicate detection and ID generation

Location: mempalace/mcp_server.py:tool_add_drawer | Confidence: ✅ HIGH

On main, drawer IDs are deterministic (content hash only) and the insert uses upsert — making the operation idempotent. The fork switches to timestamp-based IDs (content[:100] + datetime.now().isoformat()) and add with a semantic similarity pre-check via tool_check_duplicate(threshold=0.9). This is a behavioral change not described in the PR:

  1. Same content added twice within the same second gets different IDs (timestamp resolution)
  2. Semantic dedup (0.9 threshold) is fuzzier than exact-match — could reject legitimately different content or miss exact duplicates with different embeddings
  3. add instead of upsert means a collision would raise instead of silently succeed
- # Current main (deterministic + idempotent)
- drawer_id = f"drawer_{wing}_{room}_{hashlib.md5(content.encode()).hexdigest()[:16]}"
- col.upsert(ids=[drawer_id], ...)
+ # PR (timestamp-based + semantic dedup)
+ drawer_id = f"drawer_{wing}_{room}_{hashlib.md5((content[:100] + datetime.now().isoformat()).encode()).hexdigest()[:16]}"
+ dup = tool_check_duplicate(content, threshold=0.9)
+ col.add(ids=[drawer_id], ...)

Fix: Rebase onto current main. Keep the original ID generation and upsert strategy. Only apply the str(e) → generic message change within the except block.


Medium Priority Issues

(Should fix, not blocking)

🚨 #3: TestErrorHardening doesn't exercise the actual exception path

Location: tests/test_mcp_server.py:TestErrorHardening.test_error_responses_do_not_leak_exceptions | Confidence: ✅ HIGH

The test calls delete_drawer with a non-existent ID ("nope"), but this hits the normal if not existing["ids"]: return {"success": False, "error": f"Drawer not found: {drawer_id}"} path — not the except Exception path that was hardened. The test passes because "Drawer not found: nope" trivially doesn't contain "Traceback" or "Exception".

To actually verify the hardening, the test should force an exception (e.g., monkeypatch col.delete to raise) and verify the generic message is returned instead of the exception string.

+ def test_exception_does_not_leak_to_client(self, monkeypatch, config, palace_path, kg):
+     _patch_mcp_server(monkeypatch, config, palace_path, kg)
+     col = _get_collection(palace_path, create=True)
+     col.add(ids=["test_id"], documents=["test"], metadatas=[{"wing": "w", "room": "r"}])
+     # Force col.delete to raise
+     monkeypatch.setattr(col, "delete", lambda **kw: (_ for _ in ()).throw(RuntimeError("secret/path/leak")))
+     from mempalace.mcp_server import tool_delete_drawer
+     result = tool_delete_drawer("test_id")
+     assert result["success"] is False
+     assert "secret" not in result["error"]

🎨 #4: Read-only error uses -32600 (Invalid Request) instead of a custom code

Location: mempalace/mcp_server.py:handle_request tools/call branch | Confidence: ⚠️ MED

JSON-RPC -32600 means "Invalid Request" (malformed JSON-RPC). A write tool call in read-only mode is a valid request that the server chooses to reject. A custom server error code (e.g., -32001) or returning a normal result with an error payload would be more semantically accurate. MCP clients that parse -32600 may retry or misattribute the failure.

-             "error": {
-                 "code": -32600,
-                 "message": f"Server is in read-only mode. Tool '{tool_name}' is not available.",
-             },
+             "error": {
+                 "code": -32001,
+                 "message": f"Server is in read-only mode. Tool '{tool_name}' is not available.",
+             },

Low Priority Issues

(Nice to have)

🎨 #5: _read_only cleanup in tests is manual

Location: tests/test_mcp_server.py:TestReadOnlyMode | Confidence: ⚠️ MED

Each read-only test manually resets _read_only = False at the end. If a test fails mid-way, the reset is skipped, potentially polluting subsequent tests. Since monkeypatch is already being used (which auto-reverts), the explicit reset is redundant — but only because monkeypatch.setattr is used to set it. The pattern is safe but slightly misleading.


Created by Octocode MCP https://octocode.ai 🔍🐙

@web3guru888 web3guru888 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 Review of #181fix(mcp): harden error messages and add read-only server mode

Scope: +131/−12 · 2 file(s) · touches core

  • ⚠️ mempalace/mcp_server.py (modified: +44/−12)
  • tests/test_mcp_server.py (modified: +87/−0)

Technical Analysis

  • 🔌 MCP server dispatch changes — verify JSON-RPC compliance and backward compatibility

Issues

  • ⚠️ Touches mempalace/mcp_server.py — Core MCP server — maintainer guards this closely

Suggestions

  • Magic number(s) 32600 — consider extracting to named constant(s)
  • 📋 PR checklist: 2/3 completed — 1 item(s) still unchecked

Strengths

  • ✅ Includes test coverage

🟡 Needs attention — touches guarded files and has items to address.


🏛️ Reviewed by MemPalace-AGI · Autonomous research system with perfect memory · Showcase: Truth Palace of Atlantis

@bensig bensig changed the base branch from main to develop April 11, 2026 22:23
@igorls igorls added area/mcp MCP server and tools bug Something isn't working security Security related labels Apr 14, 2026
@igorls

igorls commented May 8, 2026

Copy link
Copy Markdown
Member

Hi, thanks for the contribution.

This PR has merge conflicts with develop, and the branch has not been updated in over 7 days, which puts it before our most recent release. The conflicts are likely against work that landed in that release.

Could you rebase onto develop so we can take another look?

If this change is no longer relevant, feel free to close the PR.

(This message is part of a periodic backlog pass, sent to all open PRs that match this state.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/mcp MCP server and tools bug Something isn't working needs-rebase PR has merge conflicts with develop and needs rebase security Security related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants