Skip to content

fix(mcp): canonicalize KG cache key via realpath + normcase (#1372)#1383

Merged
igorls merged 4 commits into
MemPalace:developfrom
mvalentsev:fix/mcp-kg-cache-realpath
May 17, 2026
Merged

fix(mcp): canonicalize KG cache key via realpath + normcase (#1372)#1383
igorls merged 4 commits into
MemPalace:developfrom
mvalentsev:fix/mcp-kg-cache-realpath

Conversation

@mvalentsev

@mvalentsev mvalentsev commented May 6, 2026

Copy link
Copy Markdown
Contributor

What

_get_kg() and _call_kg() now key the per-path KnowledgeGraph cache through a
new _canonicalize_kg_path(path) helper that applies os.path.realpath
(resolves symlinks) followed by os.path.normcase (Windows drive-letter
casing + path-separator). On POSIX normcase is a no-op; the real change
is symlink resolution. On Windows both apply.

Why

#1372 (mcp: per-path KG cache should canonicalize via realpath, not abspath):

#1160 replaced the module-level KnowledgeGraph singleton with a lazy
per-path cache keyed by os.path.abspath(_resolve_kg_path()).
os.path.abspath() does not resolve symlinks or canonicalize Windows
drive-letter casing. Two tenants pointing at /srv/A and
/srv/link-to-A (or C:\palace\foo and c:\palace\foo) get separate
KnowledgeGraph instances over the same SQLite file.

The PR scope-note for #1160 explicitly accepted this trade-off as v3.4
follow-up territory.

How

def _canonicalize_kg_path(path: str) -> str:
    return os.path.normcase(os.path.realpath(path))

Applied at both _kg_by_path access sites:

  • _get_kg(canonical_path=None) -- insert + lookup. The optional arg
    lets a caller that has already captured a canonical key (see
    _call_kg below) pass it through, so the dict insert here uses the
    same key the caller will later evict under. With canonical_path=None
    the function resolves and canonicalizes from module state -- the
    default for direct callers.
  • _call_kg() -- captures the canonical key once at the top of the
    function, before the retry loop, and threads it through every
    _get_kg(path) call plus the eviction lookup. Capturing at the top
    surfaces an OSError from realpath cleanly via the dispatcher's
    exception envelope rather than letting it raise inside the
    except sqlite3.ProgrammingError branch and mask the original close
    signal. Threading the captured key through _get_kg locks the
    insert/evict keys together even if env (MEMPALACE_PALACE_PATH
    rotation) or FS state (symlink remap, mount remap) mutates between
    attempts.

Drain logic in tool_reconnect iterates _kg_by_path.values(), so it is not
path-sensitive and stays unchanged.

Tests

TestKGLazyCache gains:

  • test_canonicalize_kg_path_collapses_symlink_alias -- creates a real
    symlinked tmp_path layout; asserts both real/kg.sqlite3 and
    link/kg.sqlite3 produce the same canonical key. Skipped on Windows
    because symlink creation requires admin.
  • test_canonicalize_kg_path_routes_through_normcase -- patches both
    os.path.realpath and os.path.normcase with sentinel wrappers and
    asserts the helper composes them as normcase(realpath(p)). Swapping
    the order would leave Windows symlinks under the original case,
    defeating the dedup.
  • test_get_kg_dedupes_symlink_alias_end_to_end -- two _get_kg() calls
    via different symlink layers; asserts identical instance + a single
    KnowledgeGraph construction.
  • test_call_kg_passes_captured_path_through_resolve_drift -- patches
    _resolve_kg_path with a shifting iterator, pre-seeds the closed
    handle under the captured key, and asserts both _get_kg calls
    receive the captured path and the closed handle is evicted by retry.
    Without the pass-through, the second _get_kg would re-resolve to
    the drifted path and the closed handle would leak.
  • test_call_kg_oserror_at_top_propagates_unmasked -- patches
    _canonicalize_kg_path to raise OSError; asserts it propagates
    unchanged. Pins the rationale that capturing-at-the-top means FS
    errors surface cleanly instead of masking a ProgrammingError
    mid-retry.

The existing _call_kg retry tests
(test_call_kg_retries_after_concurrent_close,
test_call_kg_does_not_retry_on_other_errors) now use
_canonicalize_kg_path(path) for cache setup so lookup keys match the
production code path on every platform -- under os.path.abspath they
would diverge from production keys on Windows.

Local: full suite pytest tests/ --ignore=tests/benchmarks -- 1556 pass +
1 skip + 3 pre-existing chromadb/httpx env failures on Python 3.14
linuxbrew (same on every branch, unrelated). ruff check . && ruff format --check . clean. Symlink smoke test confirms /tmp/symtest_link/...
collapses to /tmp/symtest_real/....

Out of scope

  • Default-path resolution when --palace is not passed -- separate
    concern.
  • Path normalization elsewhere in the codebase (palace_path, WAL dir,
    etc.) -- separate concerns.

Closes #1372.

@mvalentsev mvalentsev marked this pull request as ready for review May 6, 2026 12:32
@mvalentsev mvalentsev force-pushed the fix/mcp-kg-cache-realpath branch 4 times, most recently from f7b59ef to 576c8aa Compare May 14, 2026 21:13
@igorls igorls added bug Something isn't working area/kg Knowledge graph labels May 15, 2026
@mvalentsev mvalentsev force-pushed the fix/mcp-kg-cache-realpath branch from 576c8aa to e958e34 Compare May 15, 2026 13:33
…e#1372)

Switch _kg_by_path lookups in _get_kg and _call_kg from os.path.abspath
to a new _canonicalize_kg_path helper applying realpath (collapses
symlink aliases) + normcase (Windows drive-letter casing). Two tenants
pointing at /srv/A and /srv/link-to-A no longer open duplicate
sqlite3.Connection handles over the same file.
Patch both realpath and normcase with sentinel wrappers and assert the
result composes them as normcase(realpath(p)). Swapping the order would
leave Windows symlinks under the original case, defeating the dedup.
The previous assertion (result == seen[-1].lower()) was tautological --
the fake normcase already returned p.lower(), so any implementation
that lowercased anything would pass.
Move _canonicalize_kg_path call from inside the except branch to the
top of _call_kg so OSError from realpath (e.g. transient Windows
junction loss) propagates cleanly instead of masking the original
sqlite3.ProgrammingError mid-handler. Also keeps the eviction key
identical to the insertion key if FS state mutates between attempts,
preventing a closed handle from leaking into the cache under a stale
key the lookup no longer matches.
…t keys

_get_kg now accepts an optional canonical_path argument and uses it
verbatim when given. _call_kg passes its top-of-function captured key
through, so the dict insertion that _get_kg performs and the eviction
lookup _call_kg performs always agree on the key -- even if env state
(MEMPALACE_PALACE_PATH rotation) or FS state (symlink remap) mutates
between attempts. Without the pass-through, _get_kg's internal call
to _canonicalize_kg_path(_resolve_kg_path()) could land on a different
key than the captured path, leaving a closed handle in the cache under
a stale key the eviction lookup no longer matches.

Tests:
  test_call_kg_passes_captured_path_through_resolve_drift -- patches
  _resolve_kg_path with a shifting iterator + identity canonicalize,
  pre-seeds the closed handle under the captured key; asserts both
  _get_kg calls receive the captured key and the closed handle is
  evicted by retry.

  test_call_kg_oserror_at_top_propagates_unmasked -- patches
  _canonicalize_kg_path to raise OSError; asserts it propagates without
  conversion (the rationale for capturing at the top before the retry
  loop -- prevents an OSError mid-handler from masking a
  sqlite3.ProgrammingError it can't retry through).

_patch_mcp_server test helper updated to a varargs lambda so existing
tests that patch _get_kg as a no-arg fixture stay compatible with the
new optional-arg signature.
@mvalentsev mvalentsev force-pushed the fix/mcp-kg-cache-realpath branch from e958e34 to 4dc3ccf Compare May 15, 2026 17:51
@igorls igorls added this to the v3.3.6 milestone May 15, 2026
@igorls igorls merged commit d114a59 into MemPalace:develop May 17, 2026
6 checks passed
arnoldwender pushed a commit to arnoldwender/mempalace that referenced this pull request May 24, 2026
Bumps version 3.3.5 → 3.3.6 across pyproject.toml, version.py, plugin
manifests (.claude-plugin/plugin.json, .claude-plugin/marketplace.json,
.codex-plugin/plugin.json), README badge, and uv.lock. Flips CHANGELOG.md
from ``[Unreleased]`` to ``[3.3.6] — 2026-05-24`` and backfills the
major user-facing entries that landed without changelog entries during
the cycle:

Features:
- MemPalace#1555 office-document mining via --mode extract + virtual line numbers
- MemPalace#1584 surgical closet pointers with date+line locators (Tier 6a)
- MemPalace#1558 + MemPalace#1560 within-wing hallways (entity co-occurrence graph)
- MemPalace#1565 cross-wing tunnels auto-promoted from hallways
- MemPalace#1578 Hebbian potentiation + Ebbinghaus decay on hallways/tunnels
- MemPalace#1236 API-tool transcripts auto-route to wing_api
- MemPalace#711 hooks.auto_save toggle for silent-mode sessions
- MemPalace#1605 COCA content-word filter for entity detection
- MemPalace#1557 case-insensitive entity matching at mine time
- MemPalace#1483 multilingual embeddings (embeddinggemma-300m) by default

Bug Fixes (selected, user-visible):
- MemPalace#1540 silent data loss in three unchunked upsert sites
- MemPalace#1538 paragraph chunker oversized chunks
- MemPalace#1554 per-file chunk cap too low for transcripts
- MemPalace#1562 Windows hook subprocess/ChromaDB deadlock
- MemPalace#1529 create_tunnel corrupted hyphenated wing names
- MemPalace#1424 save-hook truncated hyphenated project folders
- MemPalace#1383 KG cache duplicated graphs for symlinked/cased paths
- MemPalace#1466 silent symlink skip now logged
- MemPalace#1441 macOS stock-bash 3.2 hook compatibility
- MemPalace#1500 / MemPalace#1513 structured JSON-RPC errors on bad MCP input
- MemPalace#1523 VACUUM + FTS5 rebuild after repair
- MemPalace#1548 FTS5 validation at end of mine
- plus MemPalace#1216, MemPalace#1408, MemPalace#1438, MemPalace#1439, MemPalace#1445, MemPalace#1452, MemPalace#1459, MemPalace#1461, MemPalace#1466,
  MemPalace#1470, MemPalace#1477, MemPalace#1485, MemPalace#1500, MemPalace#1513, MemPalace#1528, MemPalace#1532, MemPalace#1543, MemPalace#1546, MemPalace#1585

Performance:
- MemPalace#1474 convo miner pre-fetches mined-set
- MemPalace#1487 rebuild_index progress callback
- MemPalace#1530 MCP cold-start diagnostics + opt-in warmup

Lint passes (ruff 0.15.14); mempalace-mcp entry point alignment
verified per RELEASING.md.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/kg Knowledge graph bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

mcp: per-path KG cache should canonicalize via realpath, not abspath

2 participants