fix(cli): paginate miner.status() to remove 10K drawer truncation #1
Merged
jphein merged 1 commit on Apr 10, 2026
Conversation
The CLI `mempalace status` command still capped at 10,000 drawers after this PR's MCP server fix, because miner.status() retained the original `col.get(limit=10000, include=["metadatas"])` call. On palaces larger than 10K, it printed a wrong total and silently dropped every wing past the cutoff (tested on a 14,902-drawer / 17-wing palace: CLI reported "10000 drawers" and listed only 11 wings). Replace the single bounded call with the same paginated offset loop the new `_fetch_all_metadata()` helper uses in mcp_server.py (1,000-item batches), and report the true total via `col.count()`. Closes the CLI half of MemPalace#478.
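The paginated read described above can be sketched roughly as follows. This is a minimal illustration, not the code in `miner.py`: `FakeCollection` is a hypothetical stand-in so the example runs without ChromaDB, and `fetch_all_metadata` is an illustrative name. It assumes the collection exposes `count()` and `get(limit=..., offset=..., include=[...])`, in line with the `col.get(...)` and `col.count()` calls named in the description.

```python
from collections import Counter


class FakeCollection:
    """Hypothetical stand-in for a ChromaDB collection (illustration only)."""

    def __init__(self, metadatas):
        self._metadatas = list(metadatas)

    def count(self):
        return len(self._metadatas)

    def get(self, limit=None, offset=0, include=None):
        end = None if limit is None else offset + limit
        return {"metadatas": self._metadatas[offset:end]}


def fetch_all_metadata(col, batch_size=1000):
    """Read every metadata record in fixed-size batches using an offset loop."""
    total = col.count()
    metas = []
    offset = 0
    while offset < total:
        batch = col.get(limit=batch_size, offset=offset, include=["metadatas"])
        page = batch["metadatas"]
        if not page:  # stop if the collection shrank while we were reading
            break
        metas.extend(page)
        offset += len(page)
    return total, metas


# 14,902 drawers spread over 17 wings, mirroring the palace size in the report.
col = FakeCollection({"wing": f"wing_{i % 17}"} for i in range(14902))
total, metas = fetch_all_metadata(col)
wings = Counter(m["wing"] for m in metas)
print(total, len(metas), len(wings))
```

Taking the total from `count()` rather than from the length of a bounded result is what keeps the reported figure correct even if a read stops early.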
4 tasks
jphein pushed a commit that referenced this pull request on Apr 10, 2026
…silon, add schema bounds
- Paginate miner.status() past 10K drawer limit (fixes PR #1 from psaghelyi)
- Use math.isclose(abs_tol=0.001) instead of abs() < 0.01 for mtime dedup
- Add minimum/maximum bounds to min_similarity in MCP search tool schema
- Add debug logging to bare except in file_already_mined()
https://claude.ai/code/session_01LAi5NQmr4KKyx6QNwZrRXq
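For reference, the two mtime comparisons named in this commit behave as sketched below. The timestamps are made up, and the `rel_tol=0.0` variant is an assumption added for illustration; the commit message only mentions `abs_tol=0.001`. One detail worth keeping in mind: `math.isclose` also applies a relative tolerance (default `1e-09`), which for epoch-second values near 1.7e9 works out to roughly 1.7 seconds and can outweigh `abs_tol`.

```python
import math

stored = 1_700_000_000.0   # made-up mtime recorded at mining time
current = stored + 0.5     # same file, touched half a second later

# Original style: plain absolute difference against a fixed epsilon.
old_style = abs(stored - current) < 0.01

# As named in the commit: abs_tol is set, the default rel_tol=1e-09 still applies.
default_rel = math.isclose(stored, current, abs_tol=0.001)

# Assumption for illustration: disable the relative term so abs_tol governs.
abs_only = math.isclose(stored, current, rel_tol=0.0, abs_tol=0.001)

print(old_style, default_rel, abs_only)
```

With these values the first and third checks treat the file as changed, while the second treats it as unchanged, because 1e-09 × 1.7e9 ≈ 1.7 exceeds the 0.5-second gap.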
jphein added a commit that referenced this pull request on Apr 11, 2026
Hybrid search (TODO #1): when vector results are poor (best distance > 1.0), automatically falls back to keyword text-match via ChromaDB where_document.$contains. Extracts most distinctive non-stopword token from query, or accepts explicit keyword param. Results merged, deduped, sorted by distance. MCP server exposes new keyword parameter. Wing fix (TODO #0): _ingest_transcript() now derives project wing from Claude Code transcript path (-Projects-<name>/) instead of hardcoding "sessions". Per-project search now finds auto-mined content. 692 tests pass (22 new). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
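The fallback flow in this commit message could look roughly like the sketch below. Everything here is illustrative: `vector_search` and `keyword_search` are hypothetical callables standing in for the real ChromaDB queries, hits are plain dicts with `id` and `distance`, and "longest non-stopword token" is an assumed stand-in for whatever "most distinctive" means in the actual code.

```python
import re

STOPWORDS = {"the", "a", "an", "of", "in", "to", "for", "and", "is", "how", "what"}


def pick_keyword(query):
    """Pick one non-stopword token; 'longest token' is an assumed heuristic."""
    tokens = [t for t in re.findall(r"[a-z0-9_]+", query.lower()) if t not in STOPWORDS]
    return max(tokens, key=len) if tokens else None


def merge_results(vector_hits, keyword_hits):
    """Merge two hit lists, dedupe by id keeping the best distance, sort ascending."""
    best = {}
    for hit in list(vector_hits) + list(keyword_hits):
        seen = best.get(hit["id"])
        if seen is None or hit["distance"] < seen["distance"]:
            best[hit["id"]] = hit
    return sorted(best.values(), key=lambda h: h["distance"])


def hybrid_search(query, vector_search, keyword_search, keyword=None, threshold=1.0):
    """Fall back to keyword matching when the best vector distance is poor."""
    vector_hits = vector_search(query)
    best = min((h["distance"] for h in vector_hits), default=float("inf"))
    if best <= threshold:
        return vector_hits
    keyword = keyword or pick_keyword(query)
    if keyword is None:
        return vector_hits
    return merge_results(vector_hits, keyword_search(keyword))


# Stub searchers with made-up hits, just to exercise the flow.
def fake_vector(query):
    return [{"id": "d1", "distance": 1.4}, {"id": "d2", "distance": 1.2}]


def fake_keyword(keyword):
    return [{"id": "d2", "distance": 1.3}, {"id": "d3", "distance": 1.1}]


hits = hybrid_search("how to configure pgvector", fake_vector, fake_keyword)
print([h["id"] for h in hits])
```

Because the best vector distance (1.2) exceeds the 1.0 threshold, the keyword path runs and the duplicate `d2` is kept once with its better distance.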
jphein added a commit that referenced this pull request on Apr 11, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jphein added a commit that referenced this pull request on Apr 19, 2026
Two items in "Fork Changes (still ahead of upstream after v3.3.1 merge)" were never — or are no longer — fork-only. Demote both: 1. Epsilon mtime comparison (palace.py) Upstream merged Arnold Wender's equivalent fix as PR MemPalace#610 on 2026-04-12 (commit bb7ed80). Their threshold is 0.001 vs our fork's 0.01, but abs(stored - current) < epsilon is semantically identical. Moved to "Merged into upstream (post-v3.3.1)". 2. ".jsonl exempt from JUNK_FILE_SIZE cap" The description was wrong. The actual change (commit 560fdbd) adds ".jsonl" to READABLE_EXTENSIONS in miner.py — a whitelist addition, not a size-cap exemption. And it was authored by MSL (upstream maintainer) at the same SHA on upstream/develop. Never was a fork contribution. Moved to "Pulled in from upstream/develop". Related: upstream also raised MAX_FILE_SIZE 10MB → 500MB in d137d12 (the actual size-cap fix, separate concern). Clarified that item now at #1 (bulk_check_mined) is fork-only and independent of the mtime comparison fix. Renumbered remaining "still ahead" items 1-18. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
jphein added a commit that referenced this pull request on Apr 21, 2026
Resolutions: - `.claude-plugin/.mcp.json`, `plugin.json` — adopt upstream's `mempalace-mcp` console-script command (added via upstream #340 for pipx/uv). Run `pip install -e .` in plugin venv after merge to install the entry point. - `.claude-plugin/hooks/*.sh`, `hooks/*.sh` — adopt upstream's console-command resolution order (`mempalace` script → `python3 -m mempalace` → `python`). `MEMPAL_PYTHON` override still works inside `hooks_cli.py`. - `mempalace/hooks_cli.py`, `tests/test_hooks_cli.py` — keep fork's `_mempalace_python()` helper (fork-ahead #4); upstream only had `sys.executable`, which loses MEMPAL_PYTHON override. - `mempalace/miner.py` — keep fork's concurrent mining path (fork-ahead #1), apply upstream's unicode-`✓` → ASCII-`+` fix (MemPalace#681) to both paths. - `mempalace/backends/chroma.py` — take upstream's refined `quarantine_stale_hnsw` docstring (it's the version merged via our own MemPalace#1000). Brought in: 33 upstream commits including Belarusian/Chinese/German/Spanish/French entity detection, console-script entry points, hook plugin-root space quoting, and v3.3.2 tag (which contains our MemPalace#681/MemPalace#1000/MemPalace#1023). Tests: 1096 passed, 106 deselected (benchmarks). Ruff clean.
jphein added a commit that referenced this pull request on Apr 24, 2026
…ostgres Three things merged into one README pass: 1. Badge: link version-3.3.4 to jphein/mempalace/releases (the v3.3.4 tag we just pushed) and add an upstream-3.3.3 secondary badge so readers can tell fork vs upstream version at a glance. Was sitting uncommitted from earlier today. 2. Multi-client coordination section: replaced the three-fix v3.3.4 summary with a four-fix one. Added @felipetruman's MemPalace#976 num_threads pin (cherry-picked at 552a0d7) as fix #1 — the actual root-cause fix. Reframed our MemPalace#1171/MemPalace#1173/MemPalace#1177 as defense-in-depth around symptoms. Walked back palace-daemon from "primary concurrency story in progress" to "deferred pending observation" — with MemPalace#976's fix in place, the daemon's same-machine value drops; multi-machine and Windows remain its differentiators but neither is current pain. 3. Postgres + pgvector: walked back from "parallel track" framing to "long-term option, no immediate move" for the same reason. Migration cost stays real, current pain is mitigated, decision deferred until v3.3.4 stack is observed in production or TS rewrite ships. Removed two stale paragraphs that were left over from the previous "daemon as primary" framing.
jphein pushed a commit that referenced this pull request on May 3, 2026
The MCP `mempalace_get_drawer` tool returned the entire raw drawer metadata blob to any connected client, and the `source_file` field in that blob is the absolute filesystem path written by the miners (`miner.py`, `convo_miner.py` — `source_file = str(filepath)`). On a single-user local deployment this is self-disclosure, but in nested-agent or multi-server MCP topologies the client is a separate trust domain and the host's directory layout has no documented client-side use. Mirror the mitigation that `searcher.search_memories()` already applies on its own return path: reduce `source_file` to its basename via `Path(source_file).name` before handing the metadata to the client. Citations still work — the directory layout does not leak. Companion to #1 (omit palace_path from tool_status). Same threat class, different surface: - mempalace_status — palace dir path → fixed in #1 - mempalace_get_drawer — per-drawer source_file path → this PR Other read tools were audited and do not leak host paths: - mempalace_search — already basenames source_file - mempalace_list_drawers — returns wing/room/preview only - mempalace_diary_read — date/timestamp/topic/content only - mempalace_reconnect — success/message/drawers only - mempalace_kg_* — entity/predicate strings, counts - mempalace_check_duplicate — wing/room/preview only Changes: - mempalace/mcp_server.py: tool_get_drawer() now basenames metadata.source_file - tests/test_mcp_server.py: regression test asserting the absolute path and its parent directory do not appear anywhere in the response - website/reference/mcp-tools.md: clarify the documented return shape
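The basename mitigation described above can be illustrated with a short sketch. `sanitize_drawer_metadata` is a hypothetical helper name and the sample path is made up; only the `Path(source_file).name` step comes from the commit message.

```python
from pathlib import Path


def sanitize_drawer_metadata(metadata):
    """Return a copy of the metadata with source_file reduced to its basename."""
    cleaned = dict(metadata)
    source_file = cleaned.get("source_file")
    if source_file:
        cleaned["source_file"] = Path(source_file).name
    return cleaned


raw = {"wing": "projects", "source_file": "/home/alice/notes/design/plan.md"}
safe = sanitize_drawer_metadata(raw)
print(safe["source_file"])
```

Working on a copy leaves the stored metadata untouched, so only the response handed to the client loses the directory layout.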
jphein added a commit that referenced this pull request on May 13, 2026
…write tests (closes #68) Two real bugs in one fix: 1. The .sh wrapper rewrite (eaf0e2c, 2026-05-13) hardcoded `HOOK_PY=/home/jp/Projects/palace-daemon/clients/hook.py`. That path only exists on JP's laptop — wrapper silently exits 0 on every other host. Wrappers couldn't work on disks (where palace-daemon lives at /home/jp/.local/share/palace-daemon/), on CI runners (no checkout at all), or for anyone else cloning the fork. 2. The corresponding test_claude_plugin_hook_wrappers.py asserts the OLD wrapper contract (mempalace CLI fallbacks, python -m mempalace fallback chain). With the new pass-through wrapper it asserts behavior that no longer exists → 10 tests fail on every main push since eaf0e2c. CI red since 2026-05-13T13:20 UTC. Fix for #1: HOOK_PY="${PALACE_DAEMON_HOOK_PY:-/home/jp/Projects/palace-daemon/clients/hook.py}" Env-var override; default unchanged for JP's machine. disks deployment can set PALACE_DAEMON_HOOK_PY=/home/jp/.local/share/palace-daemon/... CI / test fixtures can point at any path. Fix for #2: Replaced the 193-line OLD-contract test file with a 130-line file asserting the NEW contract (3 tests × 2 scripts): - execs python3 with --hook <name> --harness claude-code when HOOK_PY exists - exit 0 silent no-op when HOOK_PY missing - trailing args pass through via "$@" 6/6 pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
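The wrapper contract listed above (env-var override, silent no-op when the hook script is missing, trailing args passed through) can be mirrored in Python for illustration. This is a sketch only: the real wrappers are shell scripts, the function names are hypothetical, and the exact argument order is an assumption.

```python
import os
from pathlib import Path

DEFAULT_HOOK_PY = "/home/jp/Projects/palace-daemon/clients/hook.py"


def resolve_hook_py(env=None):
    """Mirror ${PALACE_DAEMON_HOOK_PY:-default}: unset or empty falls back."""
    env = os.environ if env is None else env
    return env.get("PALACE_DAEMON_HOOK_PY") or DEFAULT_HOOK_PY


def build_command(hook_name, extra_args, env=None):
    """Return the argv to exec, or None for the silent exit-0 no-op case."""
    hook_py = resolve_hook_py(env)
    if not Path(hook_py).is_file():
        return None
    # Argument order is assumed; the commit only names the flags.
    return ["python3", hook_py, "--hook", hook_name, "--harness", "claude-code", *extra_args]


print(resolve_hook_py({}))
print(build_command("stop", [], {"PALACE_DAEMON_HOOK_PY": "/nonexistent/hook.py"}))
```

Note that the shell form `${VAR:-default}` falls back on an empty value as well as an unset one, which is why the sketch uses `or` rather than a plain `get` default.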
6 tasks
jphein added a commit that referenced this pull request on May 22, 2026
…daemon-routed integration recipe (#106) * feat(sources): OpenCode adapter on RFC 002 contract Adds mempalace/sources/opencode.py — an OpenCodeSourceAdapter subclass of BaseSourceAdapter that ingests OpenCode AI-coding-CLI session transcripts from OpenCode's local SQLite store (~/.local/share/opencode/opencode.db) into the palace as DrawerRecords formatted to match convo_miner's exchange-pair shape. The adapter: * Yields SourceItemMetadata then DrawerRecords per session. * Each session becomes one source_file shaped as opencode://<absolute-db-path>#session=<sid>; chunks are chunked_content exchange-pair drawers. * Declares 8 transformations (6 opencode-namespaced + 2 reserved); every name resolves to a reference implementation on mempalace.sources.transforms per RFC 002 §7.3. * Implements is_current honoring opencode_session_version when present, falling back to "metadata exists → assume current" for append-only safety on older drawers. * Routes wing from session.directory basename (or explicit options['wing'] override); room from detect_convo_room on the rendered transcript; hall from convo_miner._detect_hall_cached. * Stamps universal §5.1 metadata (wing, room, hall, filed_at, added_by, ingest_mode, extract_mode, privacy_class) plus the declared per-adapter schema (session_id, session_title, project_dir, session_created_at, message_count, opencode_db_path). * default_privacy_class = "pii_potential" — AI sessions leak everything; users opt in explicitly to laxer floors. mempalace/sources/transforms.py: adds 6 opencode-namespaced transformations (extract_text_parts, skip_tool_echo, skip_file_injection, role_coerce, same_role_merge, format_exchange). Each operates on the role-tab-prefixed line stream the adapter's canonical_source_bytes produces; declared in declaration order so the conformance round-trip test reproduces drawer content exactly. 
pyproject.toml: registers the adapter under the [project.entry-points."mempalace.sources"] group as opencode = "mempalace.sources.opencode:OpenCodeSourceAdapter". tests/test_sources_opencode.py: 28 tests covering * class identity, capabilities, schema shape * SourceNotFoundError on missing DB / missing tables * AdapterClosedError after close() * source_summary item count + missing-DB path * ingest yields metadata then drawers per session * cancelled / single-turn sessions skipped * universal + schema metadata fields on every drawer (flat-scalar) * RouteHint carries wing + room * wing routing groups by session.directory * explicit options['wing'] wins over directory derivation * skip_current_item short-circuits drawer emit per RFC 002 §1.2 * is_current with/without opencode_session_version * tool-input / tool-output / tool-echo / file-injection parts are stripped from drawer content * declared-transformation round-trip reproduces chunk content (RFC 002 §7.3) * empty DB, single-message session edge cases * Unicode (BMP + non-BMP) preserved through transcript * registry resolves the adapter when registered explicitly * byte_preserving capability is NOT advertised (declared-lossy) tests/fixtures/opencode/sample_session_2026_05_12/: builder script and README documenting the live opencode-ai 1.14.39 schema captured verbatim from JP's local install on 2026-05-12. No recorded .db ships (real-session content is unsanitizable user-private data); build_fixture.py reproduces the schema and populates it with synthetic-but-realistic exchanges the tests consume. tests/test_corpus_origin_integration.py: extends the §-section allowlist to include the new test file (existing allowlist already covers mempalace/sources/). Reverse-engineering credit: the OpenCode SQLite schema, json_extract paths, tool-echo / file-injection skip filters, and same-role merge originated in @JakobSachs's PR #23 (feat: add OpenCode SQLite session database support, base=develop). 
This adapter rebuilds those primitives on the RFC 002 contract so OpenCode support can ship as a registered adapter rather than as a normalize.py branch — see #23 coordination thread. Test suite: 1876 passed, 7 skipped, 106 deselected (28 new opencode tests, no regressions). Co-authored-by: Jakob Sachs <28728963+JakobSachs@users.noreply.github.com> Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(sources/opencode): address Gemini Code Assist review on MemPalace#1484 Four issues raised in the automated review (2026-05-13T01:40Z): 1. **opencode_session_version missing from metadata** (high) `is_current()` at opencode.py:391 compares `existing_metadata.get( "opencode_session_version")` against the new `SourceItemMetadata.version`. Without the metadata key being written on first ingest, the comparison always falls back to "exists → current" and incremental ingest can never detect updates to existing sessions. Now populated as `str(time_updated or time_created or 0)` — same value as the version yielded in SourceItemMetadata above. 2. **PalaceContext._skip_requested encapsulation violation** (medium) The adapter was reading and writing the private flag directly. Added `PalaceContext.is_skip_requested()` public method (read-only) so adapters can short-circuit expensive work (SQL query, transcript build, chunking) when core has signaled skip. Core still owns the reset — adapters MUST NOT clear it, per the new docstring. This is a small companion change to the upstream RFC 002 scaffolding (MemPalace#1014); justified because the spec's "core checks between yields" pattern doesn't hold for Python generators (the adapter's code runs between yields, not core's). The check needs to be available to the adapter. 3. **filed_at generated inside chunk loop** (medium) For consistency across chunks of the same session, `filed_at` is now computed once per session and reused for every chunk's metadata. Also pre-computes `session_version` for the same reason. 4. 
**PEP 8 import placement** (medium) `import json as _json` was mid-file in transforms.py; hoisted to the top with the other imports. Also removed an unused `import json` from opencode.py that ruff caught. Tests: 57 pass (28 opencode + 29 base sources); ruff clean on all three modified files. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(sources/opencode): address @igorls review on MemPalace#1484 Three blockers + one minor cleanup from the maintainer review at 2026-05-13T02:52Z: 1. **ruff F401 — unused `os` import** in tests/test_sources_opencode.py:17 Dropped. No call sites used it. 2. **ruff E402 — module-level import not at top** in tests The `sys.path.insert(0, FIXTURE_DIR); import build_fixture` pattern tripped E402 (the `# noqa: E402` was suppressing a legitimate complaint). Refactored to `importlib.util.spec_from_file_location` + `module_from_spec` per @igorls's suggestion — keeps the fixture loader at top of file with the other imports, no sys.path mutation at module scope. Also registers the loaded module in `sys.modules` so `dataclasses` and typing introspection inside the fixture builder can resolve `cls.__module__` correctly. 3. **Route-hint wing mismatch** (RFC 002 §2.5 violation) `_route_hint_for()` (lazy-fetch SourceItemMetadata stage) computed wing from `directory` only; `_wing_for()` (eager DrawerRecord stage) honored `source.options["wing"]` first. When a user passed `options={"wing": "Custom Wing"}`, the metadata hint said `"<dirname>"` while the actual drawers said `"custom_wing"` — core could make wrong skip/routing decisions on the gap. Fix: `_route_hint_for(source, directory)` now delegates to `_wing_for` so both stages apply identical precedence. 4. **Unjustified `# noqa: F401` on `AuthRequiredError`** (minor) The import claimed re-export "used in docstrings" but `__all__` only exposes `OpenCodeSourceAdapter` + `session_source_file`. Dropped the import + the noqa. Tests: 57 pass (28 opencode + 29 base sources); ruff clean. 
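The fixture-loading refactor in item 2 above follows a standard `importlib` pattern, sketched here under assumptions: `load_module_from_path` is a hypothetical helper name, and the throwaway file stands in for the real `build_fixture.py`.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_path(name, path):
    """Load a module from an explicit file path without touching sys.path."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so dataclasses/typing can resolve cls.__module__.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway file standing in for the fixture builder.
with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "build_fixture.py"
    fixture.write_text(
        "from dataclasses import dataclass\n"
        "\n"
        "@dataclass\n"
        "class Exchange:\n"
        "    role: str\n"
        "    text: str\n"
    )
    mod = load_module_from_path("build_fixture_demo", fixture)
    print(mod.Exchange("user", "hello"))
```

Loading by path keeps the import at the top of the test file and avoids mutating `sys.path` at module scope.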
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * style(sources/opencode): ruff format with CI's ruff 0.4.x CI's lint job ran on commit 13353d9 and failed `ruff format --check .` even though local `ruff format --check` was clean. Cause: ruff version mismatch — CI installs `>=0.4.0,<0.5` (per ci.yml lint job), local env has ruff 0.15.12. Different major versions format differently; 0.15-formatted source isn't 0.4.x-format-clean. Reformatted `mempalace/sources/opencode.py` and `tests/test_sources_opencode.py` with `uvx --from "ruff>=0.4.0,<0.5" ruff format` so CI's check passes. Changes are whitespace-only — no semantic diff. Tests still pass 28/28. Lint clean under 0.4.x. The 29 other files that local ruff 0.15.12 wants to reformat are upstream's own files and pass upstream's CI as-is; left untouched. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * style(tests/fixtures/opencode): ruff format build_fixture.py with 0.4.x Missed in the previous format pass (f94e3fe) — only touched the two top-level files. CI's `ruff format --check .` scans the whole tree and caught it. Whitespace-only changes. * feat: add OpenCode MCP integration for MemPalace * fix: use python -m mempalace.mcp_server for robustness * docs(integrations): OpenCode integration recipe + cherry-pick fork-changes entries Adds the three-direction OpenCode + MemPalace integration recipe: - ``docs/integrations/opencode.md`` — full setup guide covering the read (MCP), push (live-capture plugin), and pull (retrospective backfill) paths for daemon-routed deployments. - ``examples/opencode/opencode.jsonc.example`` — copy-paste user config pointing at the palace-daemon wrapper. - ``examples/opencode/option-k-plugin-daemon-routing.patch`` — a re-applicable diff for option-K's ``opencode-plugin-mempalace`` v1.2.1 issue #1 (isInitialized passes ``--palace`` which bypasses ``PALACE_DAEMON_URL`` routing). 
Also adds two fork-changes.yaml entries for the cherry-picked upstream PRs already in this branch: - ``opencode-mcp-config-cherry-pick-1567`` (commit ba16b82) - ``opencode-source-adapter-cherry-pick-1484`` (commit 2ffe652) The recipe's own fork-changes.yaml entry is added in the next commit once this commit's SHA is known (avoids the self-referencing-commit anti-pattern flagged in the worktree handoff). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(changelog): add opencode-integration-recipe entry pointing at 60dc9e6 Companion to 60dc9e6 (the OpenCode integration recipe commit). Split out per the worktree handoff to avoid the self-referencing-commit-SHA anti-pattern: the YAML entry now points at the prior docs commit, not at itself. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(opencode-integration): bundled live-capture plugin + split option-K patches The previously combined option-K patch (`option-k-plugin-daemon-routing.patch`) mixed two unrelated fixes against two different files and was failing `patch --dry-run` once Fix 1 was applied. Split into: - `option-k-plugin-daemon-routing.patch` — Fix 1 only (mempalace-cli.js, isInitialized daemon detection, option-K#1). - `option-k-plugin-message-updated.patch` — Fix 2 (index.js, subscribe to `message.updated` instead of the non-existent `chat.message`, filed upstream as option-K#4). End-to-end testing with both patches applied surfaced a third bug (option-K#5): the plugin's `mempalace mine <dir>` call hits the daemon, which evaluates `<dir>` against ITS OWN filesystem. For remote-daemon setups (palace-daemon on a different host from OpenCode) the path doesn't exist on the daemon's filesystem and the call returns 400. The option-K plugin is architecturally incompatible with multi-host deployments. 
Ships a self-contained replacement at `examples/opencode/live-capture/`: - `mempalace-live-capture.js` — minimal OpenCode plugin that subscribes to session.idle / session.deleted / session.status[idle] and spawns the Python helper. Detached subprocess, debounced per session, logs to ~/.local/share/opencode/mempalace-live-capture.log. - `capture-session.py` — Python helper that reads OpenCode's local SQLite session DB, extracts the role-pair transcript via the in-tree `OpenCodeSourceAdapter` helpers, and POSTs to the daemon's `/silent-save` endpoint. Stdlib-only, no extra pip deps. Verified end-to-end against the canonical daemon at disks.jphe.in:8085: a fresh opencode session ends with the transcript landing in wing_opencode_<basename>/room=diary, retrievable via mempalace_search. `docs/integrations/opencode.md` now documents both deployment paths (bundled plugin for remote-daemon, option-K + patches for local palaces) and explicitly notes that `experimental.chat.system.transform` does not exist in the OpenCode plugin API (so per-turn system-prompt injection is not available; agents recall memories via explicit MCP tool calls). Filed: - option-K/opencode-plugin-mempalace#4 - option-K/opencode-plugin-mempalace#5 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(changelog): add commit ref for opencode-live-capture-plugin entry Closes the YAML→render loop: scripts/check-docs.sh now verifies the commit hash resolves and FORK_CHANGELOG.md matches the manifest. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Jakob Sachs <28728963+JakobSachs@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Dxrk System <dxrk@local>
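The incremental-ingest check discussed in the review fixes above (compare the stored `opencode_session_version` with the new version, and fall back to "metadata exists, assume current" for older drawers) can be sketched as below. The function signatures are hypothetical simplifications; only the metadata key and the `str(time_updated or time_created or 0)` expression come from the commit message.

```python
def session_version(time_updated, time_created):
    """Version string as described: str(time_updated or time_created or 0)."""
    return str(time_updated or time_created or 0)


def is_current(existing_metadata, new_version):
    """Decide whether an already-ingested session can be skipped."""
    if existing_metadata is None:
        return False  # nothing stored yet, so the session must be ingested
    stored = existing_metadata.get("opencode_session_version")
    if stored is None:
        return True  # older drawer without a version: assume current
    return stored == new_version


new_version = session_version(time_updated=1715600000, time_created=1715500000)
print(is_current({"opencode_session_version": "1715500000"}, new_version))
print(is_current({"session_id": "abc"}, new_version))
```

The first call reports a stale session because the stored version predates the update; the second takes the append-only fallback for a drawer that never recorded a version.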
This was referenced May 22, 2026
Follow-up to the MemPalace#493 MCP fix — closes the CLI half of milla-jovovich/mempalace#478.
Context
@web3guru888 asked me to roll the CLI fix into this branch before `feat/mcp-hooks-export` lands, after I reported that MemPalace#493 left `miner.status()` still capped at 10,000 drawers. Detailed test report from a real 14,902-drawer palace: MemPalace#493 (comment)
Change
`miner.py:850-876`: replace the single `col.get(limit=10000, include=["metadatas"])` call with the same paginated offset loop (1,000-item batches) that the new `_fetch_all_metadata()` helper uses in `mcp_server.py`. The total is now reported via `col.count()` rather than `len(metas)`, which topped out at 10,000.
Same fix pattern as `palace_graph.py:49-51` and the server-side helper in this PR, so the whole codebase now has consistent pagination semantics on ChromaDB reads.
Test plan
Verified against a real 14,902-drawer / 17-wing palace on mempalace 3.1.0 + chromadb 0.6.3:
Before (this PR's parent `feat/mcp-hooks-export` @ 548abd6):
After (this PR):
No new tests added: the change is mechanical and matches the existing `_fetch_all_metadata()` pattern already covered by MemPalace#493's MCP tests.