feat: Hermes memory provider integration by ZK-Snarky · Pull Request #3 · MemPalace/mempalace

ZK-Snarky · 2026-04-07T00:16:48Z

Adds a native MemPalace memory provider for Hermes,
the open-source AI agent framework.

What this adds:

integrations/hermes/ — drop-in memory provider plugin implementing Hermes'
MemoryProvider ABC
mempalace hermes install — one-command installer that copies the plugin,
updates Hermes config, and optionally backfills existing session history
integrations/hermes/README.md — installation and usage guide

How it works:

Every conversation turn is filed into the palace automatically (wing-classified,
semantic-search ready). The AAAK wake-up injects your identity + critical facts
at session start. 8 palace tools are exposed to the agent.

Zero PII in defaults. Wing config and identity come from the user's
~/.mempalace/ files (set up via mempalace init). The plugin ships with no
hardcoded personal information.

Tested on: Hermes v0.7.0+, MemPalace 3.0.0, Python 3.9+

bensig · 2026-04-07T06:29:24Z

Thanks for this — the Hermes integration looks solid. CI is failing on lint though, 3 issues in integrations/hermes/__init__.py:

Unused import time (line 22)
Unused import typing.Any (line 25)
Ambiguous variable name l (line 414) — just rename to line or similar

A quick ruff check --fix . should handle the first two, and the third is a one-character rename. Appreciate it!

…read safety, diary safety)

ZK-Snarky · 2026-04-07T06:57:40Z

Corrected and updated 🫡

bensig · 2026-04-07T20:58:39Z

This has been open a while and conflicts with main. The Hermes integration space has also evolved. If you'd like to continue, a rebase + update to match the current codebase would be needed.

Addresses the actual spec defects flagged in #743 review, ignoring operator-UX asks that are not plugin-contract concerns. - Goal #3: 'without data loss' → mirrors §8.2's capability-conditional lossless-vs-reembed framing. No more overpromise. - §1.5: `server_embedder` is no longer an implicit escape hatch from identity/dimension rules. Such backends MUST expose an effective identity via `effective_embedder_identity()` and are bound by the same three-state check. - §7.3: adds `maintenance_kinds: ClassVar[frozenset[str]]` advertisement mechanism. `run_maintenance(kind)` must raise UnsupportedMaintenanceKindError for unadvertised kinds. Benchmark harness reads this set rather than guessing kind names. Reserves `analyze`/`compact`/`reindex` as well-defined names. - §1.2: adds `update()` as optional method with a default get+merge+ upsert implementation. §2.1: `supports_update` redefined to gate atomic single-round-trip semantics (not mere capability), since the default impl already supports partial updates. Operator asks explicitly NOT adopted (diplomatic shims, not contract defects): `.to_dict()` compat on typed results, migration progress reporting, `BaseBackend.repair()` separate from `run_maintenance`, per-palace capability variance, identity recording on read-only ops.

…ut, tests Addresses the six Copilot review comments on the initial commit. 1) MemPalace#6 (critical) — mcp_server.py `_get_collection` bypassed ChromaBackend The MCP server creates its palace collection directly via `chromadb.PersistentClient.get_or_create_collection` in `_get_collection`, not through `ChromaBackend.get_collection`. That path was missing the `hnsw:num_threads=1` metadata, so the primary crash surface for MemPalace#974 and MemPalace#965 was untouched by the original patch. Fixed by passing `hnsw:num_threads=1` at the mcp_server create site too. Documented in a code comment that the setting is only honored at creation time — existing palaces created before this fix still need a `mempalace nuke` + re-mine to gain the protection. 2) MemPalace#3 — mine_global_lock over-serialized mines across unrelated palaces Replaced the single global lock file `mine_global.lock` with a per-palace lock keyed by `sha256(os.path.abspath(palace_path))` (`mine_palace_<hash>.lock`). Mines against the same palace still collapse to a single runner (the correctness boundary), but mines against *different* palaces are now free to run in parallel. `mine_global_lock` is kept as a backward-compatible alias for `mine_palace_lock` so any external callers that imported the previous name keep working. 3) MemPalace#1 — hook_precompact swallowed OSError but not subprocess.TimeoutExpired `subprocess.run(..., timeout=60)` raises `TimeoutExpired` on slow palaces. The previous `except OSError` clause didn't catch it, so the hook could raise and fail to emit any JSON decision — leaving the harness without a block/passthrough signal. Fixed by catching `(OSError, subprocess.TimeoutExpired)` together and always falling through to the block decision so the hook reliably emits a response. 4) MemPalace#2 + MemPalace#4 — tests - tests/test_hooks_cli.py: added `test_precompact_first_two_attempts_block`, `test_precompact_passes_through_after_cap`, and `test_precompact_counter_is_per_session` to lock in the MemPalace#955 deadlock fix. - tests/test_palace_locks.py (new): covers `mine_palace_lock` single-acquire, reuse-after-release, cross-process serialization on the same palace, non-interference across different palaces, path normalization, and the `mine_global_lock` back-compat alias. 5) MemPalace#5 — known limitation, documented but not auto-fixed Copilot suggested detecting collections missing `hnsw:num_threads=1` and calling `collection.modify(metadata=...)` to retrofit existing palaces. Verified against chromadb 1.5.7: `modify(metadata=...)` replaces metadata rather than merging, and re-passing `hnsw:space="cosine"` then raises `ValueError: Changing the distance function of a collection once it is created is not supported currently.` The HNSW runtime configuration (`configuration_json`) also does not expose `num_threads` in chromadb 1.5.x, so the flag appears to be read only at creation time. Rather than paper over the limitation with a best-effort `modify` that silently drops `hnsw:space`, documented in the mcp_server comment that pre-existing palaces need a `mempalace nuke` + re-mine to gain the protection. Fresh palaces are always protected. Testing - pytest tests/test_palace_locks.py tests/test_hooks_cli.py tests/test_backends.py tests/test_cli.py → **98 passed, 0 failed**. - Runtime validation with two concurrent `mempalace mine` calls: - Different palaces → both complete in parallel ✓ - Same palace → one completes, the other exits with "another `mine` is already running against <palace> — exiting cleanly." ✓

…stant Addresses remaining PR MemPalace#976 review items after rebase on develop. ## Copilot Review MemPalace#3 (chroma.py:134) — retrofit legacy collections `get_collection(create=False)` previously returned existing collections without re-applying `hnsw:num_threads=1`, so palaces created before the fix kept the unsafe parallel-insert path. Add `_pin_hnsw_threads()` helper that calls `collection.modify(configuration=UpdateCollectionConfiguration( hnsw=UpdateHNSWConfiguration(num_threads=1)))` best-effort on every `get_collection` call (including the MCP server's `_get_collection`). In chromadb 1.5.x the runtime config does not persist to disk across `PersistentClient` reopens, so the retrofit is re-applied each process start rather than being a one-shot migration. Fresh palaces keep the metadata-based pin as primary defense; legacy palaces now also get per-session protection without requiring `mempalace nuke` + re-mine. ## mvalentsev feedback — drop dead `MAX_PRECOMPACT_BLOCK_ATTEMPTS` After the rebase on develop, `hook_precompact` delegates to `_mine_sync` and no longer emits `decision: block`, so the attempt-cap constant was orphaned. Grep confirms 0 usages in the repo — remove it. ## Tests - `_pin_hnsw_threads` retrofits legacy collection (num_threads None -> 1) - `_pin_hnsw_threads` swallows all errors (never raises) - `ChromaBackend.get_collection(create=False)` applies retrofit on legacy palace - 62 tests pass (10 backends + 6 palace locks + 46 hooks_cli)

## MemPalace#1 docstring / implementation mismatch in _get_save_interval Docstring said "invalid / non-positive values fall back to the default" but the code clamps to 1 via max(1, value). Rewrote the docstring to document the actual behavior: missing / non-integer values fall back to default; parsed values < 1 clamp to 1 (effectively blocks every turn). Added a pointer to MEMPAL_STOP_HOOK_DISABLE for users who wanted "no block" rather than "block every turn". ## MemPalace#2 MEMPAL_STOP_HOOK_DISABLE must not drift cadence state Disabled path used to return before updating *_last_save, so toggling the flag off mid-session would make since_last cover every message accumulated while disabled and trigger an immediate retroactive block. Fixed by advancing the last-save watermark to the current exchange count while disabled — state stays synced with activity, no block is ever emitted. Re-enabling resumes the normal cadence from the current position, not from whenever the hook was last active. ## MemPalace#3 integration tests for hook_stop env-var behavior Unit tests only exercised _get_save_interval / _stop_hook_disabled helpers. Added five integration tests in test_hooks_cli_tuning.py that run hook_stop end-to-end: - custom interval triggers block at override - custom interval passes through below override - MEMPAL_STOP_HOOK_DISABLE=1 passes through even far above interval - disabled path still advances last-save watermark (state tracking) - stop_hook_active short-circuit must not touch state Also added an autouse fixture in test_hooks_cli.py that monkeypatches MEMPAL_SAVE_INTERVAL and MEMPAL_STOP_HOOK_DISABLE off, so a developer with those vars set in their shell can't break the interval assertions. 79 tests pass (46 in test_hooks_cli.py + 33 in test_hooks_cli_tuning.py).

@felipetruman

Cherry-picks the critical chroma.py fix from @felipetruman's MemPalace#976 (without the bundled mine_global_lock and precompact-attempt-cap changes, which we don't need — mine_global_lock is covered by our MemPalace#1171 backend-seam flock, and our silent_save path bypasses the block-mode precompact deadlock MemPalace#976's fix #3 addresses). Root cause: ChromaDB's multi-threaded `ParallelFor` HNSW insert path races in `repairConnectionsForUpdate` / `addPoint`, corrupting the graph under any concurrent add (including a single mine's concurrent batches). Manifests as SIGSEGV (MemPalace#974) or runaway link_lists.bin writes (MemPalace#965 — 437 GB observed on JP's palace, 1.5 TB on the Nobara install in MemPalace#976's report). The num_threads=1 pin disables ParallelFor, serializing inserts within a single process. MemPalace ingests drawers one at a time anyway, so this was never a throughput win — only a latent correctness hazard. Three call sites patched: - `ChromaBackend.get_collection` (create path): add metadata pin + call `_pin_hnsw_threads()` on every return - `ChromaBackend.create_collection`: add metadata pin - `mcp_server._get_collection`: add metadata pin + call `_pin_hnsw_threads()` on every return (direct chromadb call, bypasses the backend adapter) - `cli.cmd_compact` new_col creation: add metadata pin `_pin_hnsw_threads()` exists because ChromaDB 1.5.x doesn't persist the modified HNSW config to disk, so it must re-apply on every open. Credits @felipetruman for the root-cause diagnosis and fix design in MemPalace#976. Posted corroborating reproduction data (437 GB bloat on 135K drawers, identical failure mode) on that PR earlier today at MemPalace#976 (comment). This fork-local cherry-pick gets JP protected now; expect this to become a no-op when MemPalace#976 lands upstream and we merge develop→main. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ut, tests Addresses the six Copilot review comments on the initial commit. 1) MemPalace#6 (critical) — mcp_server.py `_get_collection` bypassed ChromaBackend The MCP server creates its palace collection directly via `chromadb.PersistentClient.get_or_create_collection` in `_get_collection`, not through `ChromaBackend.get_collection`. That path was missing the `hnsw:num_threads=1` metadata, so the primary crash surface for MemPalace#974 and MemPalace#965 was untouched by the original patch. Fixed by passing `hnsw:num_threads=1` at the mcp_server create site too. Documented in a code comment that the setting is only honored at creation time — existing palaces created before this fix still need a `mempalace nuke` + re-mine to gain the protection. 2) MemPalace#3 — mine_global_lock over-serialized mines across unrelated palaces Replaced the single global lock file `mine_global.lock` with a per-palace lock keyed by `sha256(os.path.abspath(palace_path))` (`mine_palace_<hash>.lock`). Mines against the same palace still collapse to a single runner (the correctness boundary), but mines against *different* palaces are now free to run in parallel. `mine_global_lock` is kept as a backward-compatible alias for `mine_palace_lock` so any external callers that imported the previous name keep working. 3) MemPalace#1 — hook_precompact swallowed OSError but not subprocess.TimeoutExpired `subprocess.run(..., timeout=60)` raises `TimeoutExpired` on slow palaces. The previous `except OSError` clause didn't catch it, so the hook could raise and fail to emit any JSON decision — leaving the harness without a block/passthrough signal. Fixed by catching `(OSError, subprocess.TimeoutExpired)` together and always falling through to the block decision so the hook reliably emits a response. 4) MemPalace#2 + MemPalace#4 — tests - tests/test_hooks_cli.py: added `test_precompact_first_two_attempts_block`, `test_precompact_passes_through_after_cap`, and `test_precompact_counter_is_per_session` to lock in the MemPalace#955 deadlock fix. - tests/test_palace_locks.py (new): covers `mine_palace_lock` single-acquire, reuse-after-release, cross-process serialization on the same palace, non-interference across different palaces, path normalization, and the `mine_global_lock` back-compat alias. 5) MemPalace#5 — known limitation, documented but not auto-fixed Copilot suggested detecting collections missing `hnsw:num_threads=1` and calling `collection.modify(metadata=...)` to retrofit existing palaces. Verified against chromadb 1.5.7: `modify(metadata=...)` replaces metadata rather than merging, and re-passing `hnsw:space="cosine"` then raises `ValueError: Changing the distance function of a collection once it is created is not supported currently.` The HNSW runtime configuration (`configuration_json`) also does not expose `num_threads` in chromadb 1.5.x, so the flag appears to be read only at creation time. Rather than paper over the limitation with a best-effort `modify` that silently drops `hnsw:space`, documented in the mcp_server comment that pre-existing palaces need a `mempalace nuke` + re-mine to gain the protection. Fresh palaces are always protected. Testing - pytest tests/test_palace_locks.py tests/test_hooks_cli.py tests/test_backends.py tests/test_cli.py → **98 passed, 0 failed**. - Runtime validation with two concurrent `mempalace mine` calls: - Different palaces → both complete in parallel ✓ - Same palace → one completes, the other exits with "another `mine` is already running against <palace> — exiting cleanly." ✓

… 0-chunk files Three upstream fixes ported together because they're conceptually one "convo_miner polish" pass on the same exchange-chunking path. 1. Remove ai_lines[:8] truncation (upstream d52d6c9, PR MemPalace#695). The _chunk_by_exchange path was silently dropping every line past line 8 of the AI response, violating the verbatim-storage principle. 2. Split oversize exchanges across drawers (upstream 9b60c6e, PR MemPalace#708). Now that the full response is preserved, an exchange that exceeds CHUNK_SIZE (800 chars, aligned with miner.py) is split into consecutive drawers instead of a single oversized one. Adds CHUNK_SIZE module constant. 3. Register a no-embedding sentinel for files that produce zero chunks (upstream 87e8baf, PR MemPalace#732). mine_convos has three early-exit paths (OSError, content too short, zero chunks) that previously wrote nothing — file_already_mined() then returned False on the next run and the file was re-read every time. Adapted MemPalace#3 for the PG backend: the upstream sentinel uses collection.upsert() (ChromaDB API). This fork instead adds a PalaceDB.register_empty_file() method that inserts a row directly with embedding=NULL and metadata.ingest_mode='registry', so the sentinel is free of embedding cost and invisible to vector search. file_already_mined() already keys on source_file + source_mtime, so the existing path picks up the sentinel without further changes. Three behavioural tests added: full AI response preserved, oversize exchange split across drawers, and the sentinel + file_already_mined round trip. Upstream: MemPalace@d52d6c9 MemPalace@9b60c6e MemPalace@87e8baf Co-authored-by: shafdev <96260000+shafdev@users.noreply.github.com> Co-authored-by: Sanjay Ramadugu <sramadugu1@gmail.com> Co-authored-by: Mikhail Valentsev <michael@valentsev.ru> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ut, tests Addresses the six Copilot review comments on the initial commit. 1) MemPalace#6 (critical) — mcp_server.py `_get_collection` bypassed ChromaBackend The MCP server creates its palace collection directly via `chromadb.PersistentClient.get_or_create_collection` in `_get_collection`, not through `ChromaBackend.get_collection`. That path was missing the `hnsw:num_threads=1` metadata, so the primary crash surface for MemPalace#974 and MemPalace#965 was untouched by the original patch. Fixed by passing `hnsw:num_threads=1` at the mcp_server create site too. Documented in a code comment that the setting is only honored at creation time — existing palaces created before this fix still need a `mempalace nuke` + re-mine to gain the protection. 2) MemPalace#3 — mine_global_lock over-serialized mines across unrelated palaces Replaced the single global lock file `mine_global.lock` with a per-palace lock keyed by `sha256(os.path.abspath(palace_path))` (`mine_palace_<hash>.lock`). Mines against the same palace still collapse to a single runner (the correctness boundary), but mines against *different* palaces are now free to run in parallel. `mine_global_lock` is kept as a backward-compatible alias for `mine_palace_lock` so any external callers that imported the previous name keep working. 3) MemPalace#1 — hook_precompact swallowed OSError but not subprocess.TimeoutExpired `subprocess.run(..., timeout=60)` raises `TimeoutExpired` on slow palaces. The previous `except OSError` clause didn't catch it, so the hook could raise and fail to emit any JSON decision — leaving the harness without a block/passthrough signal. Fixed by catching `(OSError, subprocess.TimeoutExpired)` together and always falling through to the block decision so the hook reliably emits a response. 4) MemPalace#2 + MemPalace#4 — tests - tests/test_hooks_cli.py: added `test_precompact_first_two_attempts_block`, `test_precompact_passes_through_after_cap`, and `test_precompact_counter_is_per_session` to lock in the MemPalace#955 deadlock fix. - tests/test_palace_locks.py (new): covers `mine_palace_lock` single-acquire, reuse-after-release, cross-process serialization on the same palace, non-interference across different palaces, path normalization, and the `mine_global_lock` back-compat alias. 5) MemPalace#5 — known limitation, documented but not auto-fixed Copilot suggested detecting collections missing `hnsw:num_threads=1` and calling `collection.modify(metadata=...)` to retrofit existing palaces. Verified against chromadb 1.5.7: `modify(metadata=...)` replaces metadata rather than merging, and re-passing `hnsw:space="cosine"` then raises `ValueError: Changing the distance function of a collection once it is created is not supported currently.` The HNSW runtime configuration (`configuration_json`) also does not expose `num_threads` in chromadb 1.5.x, so the flag appears to be read only at creation time. Rather than paper over the limitation with a best-effort `modify` that silently drops `hnsw:space`, documented in the mcp_server comment that pre-existing palaces need a `mempalace nuke` + re-mine to gain the protection. Fresh palaces are always protected. Testing - pytest tests/test_palace_locks.py tests/test_hooks_cli.py tests/test_backends.py tests/test_cli.py → **98 passed, 0 failed**. - Runtime validation with two concurrent `mempalace mine` calls: - Different palaces → both complete in parallel ✓ - Same palace → one completes, the other exits with "another `mine` is already running against <palace> — exiting cleanly." ✓

Adds _try_gemini_json parser to normalize.py for three layouts: 1. Gemini API contents format (~/.gemini/sessions/*.json): {"contents": [{"role": "user", "parts": [{"text": "..."}]}, ...]} 2. Messages-wrapper variant: {"messages": [{"role": "user", ...}, {"role": "model", ...}]} 3. Flat top-level list with role="model". This complements the existing _try_gemini_jsonl parser (which handles ~/.gemini/tmp/<hash>/chats/session-*.jsonl with session_metadata sentinel) — JSONL covers Gemini CLI runtime sessions, JSON covers exported / Studio-saved transcripts. ## Review feedback addressed (PR MemPalace#204) bgauryy review: - MemPalace#1 Parser-precedence bug: _try_gemini_json runs *before* _try_claude_ai_json so the {"messages":[..., role=model, ...]} layout is no longer silently claimed by the Claude parser. The Gemini parser's has_model_role guard prevents false-positives against Claude / ChatGPT data. - MemPalace#2 Layout 2a coverage: TestGeminiJson.test_messages_wrapper_format + test_messages_wrapper_does_not_get_claimed_by_claude pin the fix in place. - MemPalace#3 Test conflicts with current main: rebased onto develop; tests restructured into TestGeminiJson class. - MemPalace#4 tempfile/os.unlink → pytest tmp_path everywhere. - MemPalace#5 elif not text → else (the elif branch was dead). - MemPalace#6 Module docstring updated to mention Google AI Studio. Tests: 9 new cases in TestGeminiJson covering all three layouts, multi-part text joining, non-text part skipping, has_model_role disambiguation, dispatch-chain regression for review MemPalace#1.

…tion Addresses PR MemPalace#548 review feedback about scan amplification on large palaces. The previous implementation made up to six ChromaDB scans per clean operation: 1. count_drawers(drawers_col) — scan MemPalace#1 2. count_drawers(compressed_col) — scan MemPalace#2 3. delete_drawers(drawers_col) — internal get(where=...) scan MemPalace#3 4. delete_drawers(drawers_col) — delete(where=...) scan MemPalace#4 5. delete_drawers(compressed_col) — scan MemPalace#5 6. delete_drawers(compressed_col) — scan MemPalace#6 Each call was doing its own metadata filter over the collection, meaning a 100K-drawer palace paid the filter cost six times for a single cleanup. This refactor drops it to exactly two scans — one per collection — regardless of palace size. Changes: * palace.py: introduce `find_drawer_ids(col, wing, room)` which returns the matching ID list in a single `get(where=..., include=[])` call. ChromaDB fetches only IDs — no documents, embeddings, or metadatas — so the scan is as cheap as ChromaDB can make it. * palace.py: `count_drawers` is now a thin wrapper around `find_drawer_ids`. The old standalone `delete_drawers` helper is removed because its count-then-delete pattern is exactly what we are trying to avoid. * cli.py::cmd_clean: call `find_drawer_ids` once per collection, reuse the returned ID lists for both the preview counts and the subsequent delete. Deletes go through `col.delete(ids=[...])` which is an O(n) primary-key delete, not another metadata scan. Empty lists are guarded to stay compatible with ChromaDB versions that reject `delete()` with no ids or where filter. Why two scans and not one: ChromaDB collections are independent — there is no cross-collection query. `mempalace_drawers` and `mempalace_compressed` must each be filtered separately. A single-scan variant would have to assume that compressed IDs are a strict subset of drawer IDs and delete compressed by the drawer IDs we already found, but `tool_delete_drawer` in mcp_server.py does not cascade to compressed, so real palaces can contain orphaned compressed rows whose drawer is already gone. Going to one scan would silently leak those orphans. Two scans is the minimum that preserves correctness. Tests: * New `test_find_drawer_ids_*` unit tests cover wing-only, wing+room, and no-match cases. * New `test_find_drawer_ids_single_scan` monkey-patches `collection.get` to assert exactly one call. * New `test_clean_scans_each_collection_exactly_once` is a regression test that wraps `chromadb.PersistentClient` and counts `where`-filtered `get()` calls per collection during a full `cmd_clean` invocation, failing if either collection is scanned more than once. * The existing 15 CLI black-box tests stay identical — the behavior is unchanged, only the scan count dropped. Full suite: 551 passed (was 549, +2 new perf regression tests). Also in this commit: `-V` / `--version` flags, because every good app needs a version flag and we somehow shipped three minor releases without one. The installed version is now embedded in the `-h` description line too, so `mempalace -h` answers "what am I running?" without a separate invocation.

…#1555 PR MemPalace#1555 (format coverage + virtual line numbering) merged with twelve inline polish comments from Copilot + gemini-code-assist that weren't load-bearing enough to block the original ship but are real cleanups. This PR addresses them. Twelve items in scope; one item (drawer ID delimiter — Copilot MemPalace#13) is deferred to its own dedicated PR because it's a breaking schema change that requires migration design beyond the scope of a polish PR. ## Behavioral fixes (5 items, RED-tested first) 1. **FileNotFoundError vs broken symlink (Copilot MemPalace#8).** ``extract_text`` previously mapped every ``FileNotFoundError`` from ``stat()`` to ``SKIP_BROKEN_SYMLINK``. That's misleading for the common case of a regular file deleted between scan and extract. Now distinguishes: ``SKIP_BROKEN_SYMLINK`` only when ``p.is_symlink()`` is true; ``SKIP_UNREADABLE`` otherwise. 2. **``file_already_mined`` extract_mode scoping (Copilot MemPalace#11, MemPalace#12).** Both call sites in ``mine_formats`` and ``_file_chunks_locked`` now pass ``extract_mode="format"``. Previously the format miner could falsely treat drawers from project / convo miner on the same source file as "already mined" (and vice versa). Scopes idempotency to the correct drawer subset. 3. **Sentinel skip for transient missing-dep statuses (Copilot MemPalace#14).** New ``_TRANSIENT_MISSING_DEP_STATUSES`` set + ``_register_skip_sentinel_if_appropriate`` helper. Skip variants like ``SKIP_NO_MARKITDOWN`` / ``SKIP_NO_STRIPRTF`` / ``SKIP_MISSING_FORMAT_DEPS`` / ``SKIP_NETWORK_TIMEOUT`` no longer write the "already-mined" sentinel. Otherwise installing the missing extra later wouldn't trigger a re-mine. 4. **Outer ``except Exception`` in ``mine_formats`` (Gemini MemPalace#5).** The outer try around the loop previously caught only ``KeyboardInterrupt``, leaving any setup-time error (e.g., ``scan_formats`` raising) to propagate as a bare traceback. Now catches ``Exception`` defensively, logs it, prints a partial-progress summary, and lets the ``finally`` PID-cleanup run. Mirrors miner.py's belt-and-suspenders pattern. 5. **Thread user's ``chunk_size`` / ``chunk_overlap`` / ``min_chunk_size`` through to ``chunk_text`` (Gemini MemPalace#3).** ``MempalaceConfig`` was loaded only to validate readability; users who tuned their config saw no effect in format-mode mining. Now properly threaded. ## Trivial cleanups (5 items) 6. **Path expanduser in ``extract_text`` (Copilot MemPalace#7).** ``Path(path)`` → ``Path(path).expanduser()`` so CLI inputs like ``~/docs/file.pdf`` resolve correctly. 7. **Path expanduser+resolve in ``scan_formats`` (Copilot MemPalace#9).** Same fix; ``~/docs`` and relative paths now work consistently. 8. **Use resolved ``format_path`` in ``mine_formats`` (Copilot MemPalace#10).** ``scan_formats(format_dir)`` → ``scan_formats(format_path)`` so the already-resolved path is used. 9. **``render_with_line_numbers`` type annotation (Copilot MemPalace#15).** ``text: "str | None"`` reflects the documented + tested ``None`` handling. 10. **Test + docs claims (Copilot MemPalace#16, MemPalace#17, MemPalace#18).** Stale framings removed: - ``docs/format-coverage.md`` — 14 fringe cases + "see the file for the current test inventory" (no more frozen test count). - ``tests/test_line_numbers.py`` — drops "proposed for mempalace 3.3.6" + "run from the proposal directory" references. - ``tests/test_format_miner.py`` — drops "MarkItDown is mocked throughout" (live integration tests exist) + proposal-directory framing. ## Module-level hoists (enables clean test patching) - ``MempalaceConfig`` (from ``.config``) hoisted from lazy local import to module-level so tests can patch ``mempalace.format_miner.MempalaceConfig``. - ``chunk_text`` (from ``.miner``) hoisted similarly. Both follow the pattern PR MemPalace#1565 used for ``compute_hallways_for_wing``. ## Complexity refactor Extracted ``_print_mine_summary`` from ``mine_formats`` so the orchestrator stays under the project's ``max-complexity = 25`` ceiling (per ``pyproject.toml [tool.ruff.lint.mccabe]``). Behavior unchanged; pure extraction. ## Out of scope (intentionally deferred) - **Drawer ID delimiter collision (Copilot MemPalace#13)** — ``f"{source_file}{chunk_index}"`` can theoretically collide (``"/path/a1" + "23"`` == ``"/path/a" + "123"``). Fixing this is a breaking schema change to drawer IDs and requires a migration plan; will land as its own PR after design. - The four bot comments that were ALREADY addressed by amendment MemPalace#3 before the PR MemPalace#1555 merge (``_SKIP_DIRS`` dedup, ``scan_formats`` symlink skip, ``source_mtime`` tracking, hall+entities metadata) — no action needed; verified during audit. ## Tests (RED-first) Six new RED-first tests in ``tests/test_format_miner.py``: test_extract_text_nonexistent_regular_file_returns_unreadable_not_broken_symlink test_mine_formats_passes_extract_mode_format_to_file_already_mined test_mine_formats_does_not_write_sentinel_for_skip_no_markitdown test_mine_formats_does_not_write_sentinel_for_skip_missing_format_deps test_mine_formats_catches_unexpected_exception_and_prints_summary test_mine_formats_threads_chunk_size_from_user_config All six RED before this commit (failures correctly identified the bugs they're targeting), all six GREEN after. One existing test (``test_mine_formats_continues_after_per_file_error``) updated to patch the new module-level binding ``mempalace.format_miner.chunk_text`` instead of the old ``mempalace.miner.chunk_text`` source location, and to accept the ``**kwargs`` the call now passes through. Behavior unchanged. ## Verification pytest -q (full mempalace suite) → 2065 passed, 3 skipped, 0 regressions ruff check mempalace/format_miner.py mempalace/searcher.py tests/ → All checks passed! ruff format --check ... → 4 files already formatted (pinned 0.15.9) mine_formats complexity → ≤ 25 (under the project ceiling)

Five fixes from the Gemini Code Assist review on MemPalace#1632 — three real bugs, two cleanups, all consistent with the bash-3.2-compatibility contract documented in the original commit. Bug fixes (high) ---------------- 1. hooks/cursor/lib/common.sh — config.json kill-switch check used a `python3 - <<'PYEOF' ... PYEOF` heredoc inside a `$(...)` command substitution. The heredoc body contains parens which trips the macOS bash 3.2.57 parser bug. Replaced with a `python -c '...'` call passing the config path as argv[1]. Matches the pattern already used in mempal_parse_stdin in the same file. 2. hooks/cursor/install.sh — a relative `--install-dir` was written verbatim into hooks.json. Cursor invokes hook commands from its own working directory (typically the project root), so a relative command path would silently fail to launch the hook. Now resolved to an absolute path against `$PWD` before being baked in. 3. hooks/cursor/mempal_save_hook_cursor.sh — `MEMPAL_SAVE_INTERVAL=0` would crash bash on `$((NEXT % 0))` (division by zero). Extended the existing sanitiser case to coerce 0 to the default interval alongside empty / non-numeric values. Cleanups (medium) ----------------- 4. hooks/cursor/install.sh — the EMPTY_CHECK_PY temp file is now inlined as `python -c '...'`. Removes a small leak window (tmpfile would linger if the script were interrupted between mktemp and rm -f) and shortens the script. 5. hooks/cursor/install.sh — `mktemp -t prefix` has subtly different semantics on BSD (macOS) vs GNU mktemp. Switched to the portable absolute-template form `mktemp "${TMPDIR:-/tmp}/...XXXXXX"` which behaves identically on both. Regression tests ---------------- - tests/test_cursor_hooks_shell.py test_save_interval_zero_is_coerced_to_default — guards fix MemPalace#3. - tests/test_cursor_hooks_install.py — new TestInstallDirAbsolutePath class: test_relative_install_dir_is_absolutized_in_hooks_json — guards fix MemPalace#2 against regression. test_absolute_install_dir_is_preserved_verbatim — guards that the relative-to-absolute resolution does not mangle paths that were already absolute. Verification ------------ - bash -n on all three edited scripts: clean. - uv run pytest tests/test_cursor_hooks_*.py tests/test_cursor_plugin_manifest.py: 132 passed (was 129; +3 regression tests). - uv run pytest tests/ --ignore=tests/benchmarks: 2399 passed, 3 skipped (pre-existing). - uv run ruff check . / ruff format --check .: clean. Co-authored-by: Cursor <cursoragent@cursor.com>

… env var (MemPalace#1489) - (CRITICAL MemPalace#1) get_collection now delegates to create_collection for creation path, respecting env var and sync_threshold parameter instead of bypassing with _HNSW_BLOAT_GUARD - (HIGH MemPalace#2) rebuild_index and legacy cmd_repair paths now accept and pass sync_threshold to _rebuild_collection_via_temp, threading it through all call sites - (MEDIUM MemPalace#3) unparseable MEMPALACE_HNSW_SYNC_THRESHOLD now logs warning, defaults to None

ZK-Snarky added 2 commits April 6, 2026 17:16

feat: Hermes memory provider integration

11b1996

fix: cache ChromaDB client to prevent SQLite lock races

b4cd763

ZK-Snarky mentioned this pull request Apr 7, 2026

feat: MemPalace memory provider (96.6% LongMemEval, local, free) NousResearch/hermes-agent#5671

Closed

Seb Taillieu added 2 commits April 6, 2026 23:43

fix: remove unused imports and rename ambiguous variable (ruff lint)

68286c7

fix: correct 12 bugs found in Codex audit (KG API, prefetch shape, th…

b3e1204

…read safety, diary safety)

bgauryy mentioned this pull request Apr 7, 2026

feat: improve architecture, remove redundacies and duplicaions, centralize ChromaDB into palace_db singleton, consolidate MCP tools 19→14, add test suite #25

Closed

2 tasks

igorls mentioned this pull request Apr 7, 2026

fix: add limit=10000 safety cap to all unbounded ChromaDB .get() calls #137

Merged

bensig mentioned this pull request Apr 7, 2026

Integrate Burgess Principle and Hermes Agent memory provider #93

Closed

3 tasks

bensig closed this Apr 7, 2026

felipetruman mentioned this pull request Apr 17, 2026

fix: HNSW graph corruption, PreCompact deadlock, mine fan-out (closes #974, #965, #955) #976

Merged

5 tasks

felipetruman mentioned this pull request Apr 17, 2026

feat(hooks): tune stop-hook cadence via env vars (MEMPAL_SAVE_INTERVAL / MEMPAL_STOP_HOOK_DISABLE) #985

Open

rergards pushed a commit to rergards/mempalace that referenced this pull request Apr 20, 2026

Add Kotlin support to entity detection fallback (MemPalace#3)

c66e405

jphein mentioned this pull request Apr 24, 2026

followup(3.3.4): three small bugs found during v3.3.3 smoke test #1145

Closed

igorls mentioned this pull request Apr 27, 2026

fix(hooks): always mine the active transcript as convos, additive to MEMPAL_DIR #1231

Merged

6 tasks

jphein mentioned this pull request Apr 30, 2026

feat: Multipass -- multi-hop paths through the Mem Palace #101

Open

igorls mentioned this pull request May 11, 2026

fix(benchmarks): auto-pull embedding model when missing instead of silently scoring 0 #1448

Closed

4 tasks

milla-jovovich mentioned this pull request May 21, 2026

fix(extract): polish PR — address bot review feedback on PR #1555 #1572

Merged

xg-gh-25 mentioned this pull request May 22, 2026

HNSW corruption from concurrent ChromaDB clients (MCP server + hook subprocesses + auto-ingest mine) within single Claude Code session #1581

Open

igorls mentioned this pull request May 22, 2026

feat(cli): add mempalace read — consume closet pointer, return surgical slice #1588

Open

5 tasks

undeadindustries mentioned this pull request May 27, 2026

feat: add Cursor IDE support (hooks, plugin, skill, docs, tests) #1632

Open

5 tasks

trek-e mentioned this pull request May 29, 2026

Mines smaller than hnsw:sync_threshold never persist index_metadata.pickle #1579

Closed

raman325 mentioned this pull request Jun 3, 2026

feat(integrations): Hermes memory provider (revives #3, ABC-current + dim-fix) #1684

Open

jphein mentioned this pull request Jun 7, 2026

docs: RFC 001 — storage backend plugin specification #743

Open

5 tasks

This was referenced Jun 8, 2026

Bug: Benchmark/test data loaded into user wing pollutes semantic search #1722

Open

feat(mcp): add mempalace_delete_by_source bulk-cleanup tool (#1722) #1729

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Hermes memory provider integration#3

feat: Hermes memory provider integration#3
ZK-Snarky wants to merge 4 commits into
MemPalace:mainfrom
ZK-Snarky:feat/hermes-integration

ZK-Snarky commented Apr 7, 2026

Uh oh!

bensig commented Apr 7, 2026

Uh oh!

ZK-Snarky commented Apr 7, 2026

Uh oh!

bensig commented Apr 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ZK-Snarky commented Apr 7, 2026

Uh oh!

bensig commented Apr 7, 2026

Uh oh!

ZK-Snarky commented Apr 7, 2026

Uh oh!

bensig commented Apr 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants