Skip to content

feat(searcher): candidate_strategy="union" — BM25 candidates joined with vector pool before hybrid rerank#1306

Merged
igorls merged 2 commits into
developfrom
feat/hybrid-candidate-union
May 3, 2026
Merged

feat(searcher): candidate_strategy="union" — BM25 candidates joined with vector pool before hybrid rerank#1306
igorls merged 2 commits into
developfrom
feat/hybrid-candidate-union

Conversation

@igorls

@igorls igorls commented May 2, 2026

Copy link
Copy Markdown
Member

Summary

Adds an opt-in candidate_strategy="union" to search_memories that pulls top-K BM25-only candidates from sqlite FTS5 and merges them into the hybrid rerank pool. Default behavior ("vector") is unchanged.

The current hybrid path gathers candidates from the vector index only (n_results * 3 over-fetch), then BM25-reranks within them. When the query embeds close to the wrong content semantically, the right doc never enters the rerank pool — no matter how wide the over-fetch. Validated against a ~6K-document mixed corpus (knowledge prose + short structured records): at 30x over-fetch (~5% of the corpus), the target doc still didn't surface for narrative-shaped queries targeting terminology guides. Wider over-fetch isn't the answer; widening the pool's source is.

The concrete failure mode: a narrative-shaped query embeds close to records sharing the same operational vocabulary. A terminology / style guide is BM25-strong for the query but vector-distant. Vector-only candidates don't include it; BM25 never gets to rerank it. The hybrid path produces 0.00 recall on a probe that pure BM25 alone scores 1.00 — the hybrid is worse than its component on the same input.

What changed

  • New parameter candidate_strategy: str = "vector" on search_memories.
    • "vector" (default): historical behavior, no change.
    • "union": also fetch top n_results * 3 candidates via the existing _bm25_only_via_sqlite helper, dedupe by source_file, merge into the rerank pool. BM25-only candidates carry distance=None so they're scored on BM25 contribution alone.
  • _hybrid_rank now handles distance=None explicitly (vec_sim=0) instead of relying on a sentinel value.
  • New strategies register via _CANDIDATE_MERGERS; dispatch is in _apply_candidate_strategy so search_memories stays under the project's C901 complexity ceiling.

Bench numbers

~6K-doc internal mixed corpus, recall@10, 5 probes spanning policy-exception lookup, temporal-decay, style retrieval, set-difference, and pattern-recognition:

probe baseline ("vector") "union" delta
policy-exception 0.00 0.50 +0.50
temporal-decay 0.17 0.50 +0.33
style-retrieval 0.00 1.00 +1.00 (PASSES)
set-difference 0.00–0.06 0.06–0.09 ~
pattern-recog 0.64 (stable) 0.50–0.71 (typ. 0.71) typ. +0.07, occ. −0.14
macro recall 0.16–0.17 0.51–0.56 +0.34 to +0.40

The pattern-recog variance points at a related issue worth its own PR: _hybrid_rank computes BM25 IDF over the candidate set, so adding new candidates re-normalizes BM25 for existing candidates non-monotonically. Stable corpus-wide BM25 would remove this. Out of scope here.

Tests

tests/test_hybrid_candidate_union.py — 6 new tests, all passing:

  • default behavior unchanged (explicit "vector" matches default)
  • "union" surfaces a BM25-strong vector-distant doc
  • "union" doesn't drop docs "vector" would have found
  • empty-palace handling
  • invalid candidate_strategy raises ValueError
  • _hybrid_rank tolerates distance=None

Existing test_hybrid_search.py (5) and test_searcher.py (27) pass — no regressions.

Performance note

Each "union" query adds one sqlite open + FTS5 MATCH + metadata fetch (via the existing _bm25_only_via_sqlite helper, which already runs as the vector_disabled fallback path so the code is well-trodden). Per-query overhead is small but unmeasured at corpus scale. Default stays "vector" until a maintainer characterizes the cost.

Caveats / followups not in this PR

  1. Performance impact on large palaces — needs measurement before flipping the default.
  2. The local-BM25 IDF instability called out in the bench numbers above (_hybrid_rank re-normalizes existing candidates' BM25 when new ones are added) is its own architectural concern. Worth fixing in a separate PR.
  3. MCP surface — exposing candidate_strategy to MCP callers is straightforward (one extra arg) but I haven't done it here. Easy to add once the Python API lands.

Test plan

  • pytest tests/test_hybrid_candidate_union.py — 6/6
  • pytest tests/test_hybrid_search.py — 5/5 no regression
  • pytest tests/test_searcher.py — 27/27 no regression
  • ruff check clean
  • ruff format clean

…ing pool

Default search behavior is unchanged. Opt-in candidate_strategy="union"
also pulls top-K BM25-only candidates from sqlite FTS5 and merges them
into the rerank pool, catching docs with strong BM25 signal that the
vector index didn't surface in the over-fetch window.

Motivation
----------
The current hybrid path gathers candidates from the vector index only
(n_results * 3 over-fetch), then BM25-reranks within them. When the
query embeds close to the wrong content semantically, the right doc
never enters the rerank pool — *no matter how wide the over-fetch*.
Tested on a ~6K-document mixed corpus (knowledge prose + short structured
records): at *30x* over-fetch (~5% of the corpus) the target doc still
didn't surface for narrative-shaped queries targeting terminology guides.
Wider over-fetch isn't the answer; widening the pool's *source* is.

Concrete failure mode: a narrative-shaped query embeds close to records
sharing the same operational vocabulary (other narrative entries in the
corpus). A terminology / style guide is BM25-strong for the query
(rare keywords the guide repeats) but vector-distant. Vector-only
candidates don't include it; BM25 never gets to rerank it. The hybrid
path produces 0.00 recall on a probe that pure BM25 alone scores 1.00 —
the hybrid is worse than its component on the same input.

Behavior change
---------------
* New parameter ``candidate_strategy: str = "vector"`` on ``search_memories``.
  - ``"vector"`` (default): historical behavior, no change.
  - ``"union"``: also fetch top ``n_results * 3`` candidates via the
    existing ``_bm25_only_via_sqlite`` helper, dedupe by source_file,
    merge into the rerank pool. BM25-only candidates carry
    ``distance=None`` so they're scored on BM25 contribution alone
    (vec_sim coerces to 0).
* ``_hybrid_rank`` now handles ``distance=None`` explicitly, scoring
  such candidates as vector-unknown (vec_sim=0) rather than treating
  it as max-distance via shim.
* New strategies register via ``_CANDIDATE_MERGERS``; dispatch is in
  ``_apply_candidate_strategy`` so ``search_memories`` stays under the
  C901 complexity ceiling.

Bench numbers (~6K-doc internal mixed corpus, recall@10, 5 probes spanning
policy-exception lookup, temporal-decay, style retrieval, set-difference,
and pattern-recognition):

                              baseline ("vector")   "union"
  policy-exception probe        0.00                  0.50    +0.50
  temporal-decay probe          0.17                  0.50    +0.33
  style-retrieval probe         0.00                  1.00    +1.00 (PASSES)
  set-difference probe          0.00–0.06             0.06–0.09  ~
  pattern-recog probe           0.64 (stable)         0.50–0.71  variance, typ. +0.07
  macro recall                  0.16–0.17             0.51–0.56  +0.34 to +0.40

The pattern-recog variance points at a related issue worth a separate PR:
``_hybrid_rank`` computes BM25 IDF over the candidate set. Adding new
candidates re-normalizes BM25 for *existing* candidates non-monotonically.
Stable corpus-wide BM25 would remove this. Out of scope here.

Tests
-----
``tests/test_hybrid_candidate_union.py`` (6 tests, all pass):
- default behavior unchanged (explicit ``"vector"`` matches default)
- ``"union"`` surfaces a BM25-strong vector-distant doc
- ``"union"`` doesn't drop docs ``"vector"`` would have found
- empty-palace handling
- invalid ``candidate_strategy`` raises
- ``_hybrid_rank`` tolerates ``distance=None``

Existing ``test_hybrid_search.py`` (5) and ``test_searcher.py`` (27) pass.

Performance note
----------------
Each ``"union"`` query adds one sqlite open + FTS5 MATCH + metadata
fetch (via the existing ``_bm25_only_via_sqlite`` helper, which already
runs as the ``vector_disabled`` fallback path so the code is
well-trodden). Per-query overhead is small but unmeasured at corpus
scale. Default stays ``"vector"`` until a maintainer characterizes the
cost.
Copilot AI review requested due to automatic review settings May 2, 2026 03:50

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds an opt-in candidate_strategy="union" mode to search_memories, aiming to improve hybrid retrieval by merging BM25/FTS5 candidates into the rerank pool instead of relying on vector candidates alone. It extends the searcher’s hybrid-ranking path and adds targeted tests for the new strategy and distance=None handling.

Changes:

  • Add candidate_strategy to search_memories with "vector" and "union" dispatch paths.
  • Allow _hybrid_rank to handle BM25-only candidates by treating distance=None as no vector signal.
  • Add a new test module covering default behavior, union retrieval, invalid strategies, empty palaces, and missing-distance reranking.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File Description
mempalace/searcher.py Adds union-candidate merging, strategy dispatch, API docs, and distance=None handling in hybrid reranking.
tests/test_hybrid_candidate_union.py Adds regression/integration tests for the new union candidate strategy and BM25-only hybrid reranking behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mempalace/searcher.py Outdated
Comment on lines +587 to +591
seen_sources = {h.get("source_file") for h in hits}
for bh in bm25_extra:
key = bh.get("source_file")
if not key or key == "?" or key in seen_sources:
continue
Comment thread mempalace/searcher.py Outdated
Comment on lines +587 to +591
seen_sources = {h.get("source_file") for h in hits}
for bh in bm25_extra:
key = bh.get("source_file")
if not key or key == "?" or key in seen_sources:
continue
Comment thread mempalace/searcher.py
n_results: int = 5,
max_distance: float = 0.0,
vector_disabled: bool = False,
candidate_strategy: str = "vector",
# The invariant is: union should not lose anything vector found.
missing = vec_ids - union_ids
assert not missing, f"union dropped docs that vector found: {missing}"

get_collection(palace, create=True) # create empty collection
result = search_memories("anything", palace, n_results=5, candidate_strategy="union")
assert result.get("results", []) == []

Comment thread mempalace/searcher.py Outdated
Comment on lines +848 to +851
# Candidate strategy hook: optionally widen the rerank pool's *source*
# before ranking. Default ("vector") is a no-op; "union" merges top-K
# BM25 candidates from sqlite. See `_apply_candidate_strategy`.
_apply_candidate_strategy(candidate_strategy, hits, query, palace_path, wing, room, n_results)
Comment thread mempalace/searcher.py
# before ranking. Default ("vector") is a no-op; "union" merges top-K
# BM25 candidates from sqlite. See `_apply_candidate_strategy`.
_apply_candidate_strategy(candidate_strategy, hits, query, palace_path, wing, room, n_results)

@igorls igorls added area/search Search and retrieval enhancement New feature or request labels May 2, 2026
@igorls igorls merged commit 2ad379b into develop May 3, 2026
6 checks passed
@igorls igorls deleted the feat/hybrid-candidate-union branch May 3, 2026 06:40
- Dedup union candidates by (full_path, chunk_index), not basename —
  two files sharing a basename in different dirs no longer collide,
  and a vector hit on chunk N of a file no longer blocks BM25 from
  contributing chunk M of the same file.
- Validate candidate_strategy at the top of search_memories so invalid
  values fail consistently, not only when the call routes through the
  vector path.
- Trim hits back to n_results after the union+rerank pool grows;
  preserves the existing search_memories size contract that the MCP
  limit parameter is built on.
- Skip BM25-only injection when max_distance > 0.0; BM25-only
  candidates carry distance=None and would silently bypass the
  caller's strict vector-distance threshold.

Adds 4 tests covering: validation under vector_disabled, n_results
trim, max_distance honoring, and basename-collision dedup.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 3, 2026
MemPalace#1306 (hybrid candidate union, merged 2026-05-02) added a second BM25
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
xcarbo added a commit to xcarbo/mempalace that referenced this pull request May 4, 2026
Catches up on a month of upstream work. Highlights pulled in:

- MemPalace#1306 searcher: hybrid candidate union (vector ∪ BM25 reranking pool)
- MemPalace#1325 mcp security: omit absolute paths from tool_get_drawer / tool_status
- MemPalace#1322 chroma: wire quarantine_stale_hnsw to prevent SIGSEGV on stale HNSW
- MemPalace#1320 mcp: forward valid_to / source params in kg_add / kg_invalidate
- MemPalace#1321 cli: honor --palace flag in cmd_init
- MemPalace#1314 kg temporal params fix
- MemPalace#1244 cli: cmd_compress writes to mempalace_closets so palace can read
- MemPalace#1243 mcp: case-insensitive agent name in diary_write/diary_read
- MemPalace#1303 mcp_server: pass embedding_function= on collection reopen
- MemPalace#1076/MemPalace#1077 hooks: quote CLAUDE_PLUGIN_ROOT / CODEX_PLUGIN_ROOT in hooks.json
- Various ruff format passes on touched files

Conflict resolution (CHANGELOG.md only — code files all auto-merged):

- 3.3.5 unreleased section header from upstream kept above 3.3.4
- 3.3.4 section: kept our 2026-04-30 release date; merged upstream's new
  MemPalace#1299 SIGSEGV-on-default-EF entry in alongside our existing topic-tunnels
  (MemPalace#1194/MemPalace#1195/MemPalace#1197), HNSW-bloat (MemPalace#1191), max_seq_id (MemPalace#1135), and
  auto-ingest (MemPalace#1230/MemPalace#1231) entries. Kept our richer topic-tunnels detail
  (upstream's version was a strict subset).

xdev patches preserved (still on this branch, untouched by merge):
- 6ef44cb fix(hooks): route CC transcripts via convo_miner with cwd-based wings
- 3fad61d fix(config): allow leading dash in wing names

Not pushed to origin — run tests locally and decide when to push.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 6, 2026
MemPalace#1306 (hybrid candidate union, merged 2026-05-02) added a second BM25
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 7, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 7, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 10, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 11, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 13, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
jphein added a commit to techempower-org/mempalace that referenced this pull request May 14, 2026
…(Phases 3+4)

Adds the 'hybrid' candidate_strategy that fuses three retrieval modes
in a single search_memories call:
  1. vector — existing HNSW candidates (already happens)
  2. BM25 — postgres tsvector candidates via _bm25_only_via_postgres
  3. graph — AGE knowledge-graph expansion through two channels:
     a. vector-seeded: top-5 vector drawers → entities they reference →
        other drawers about the same entities
     b. NER: cheap capitalized-multi-word regex on query → entity match
        in AGE → connected drawers

All three sources feed the existing hybrid reranker (MemPalace#1306 architecture).
BM25-only and graph-only candidates carry distance=None so the reranker
scores them on their other signals; graph candidates get a small
closet_boost (0.05) to reflect their structural-relevance signal.

New helpers in searcher.py:
- _bm25_only_via_postgres (Phase 2 — already shipped)
- _graph_expand_from_seeds (Phase 3) — vector-seeded entity expansion
- _graph_expand_from_entities (Phase 3) — NER-driven entity expansion
- _ner_from_query (Phase 3) — capitalized-multi-word + known-entity regex
- _merge_hybrid_candidates (Phase 4) — orchestrates all three

mempalace_search MCP tool gains candidate_strategy and include_trace
parameters. Schema updated so unknown kwargs aren't stripped by the
whitelist filter. include_trace surfaces per-source matched_via counts
for debugging.

Verified end-to-end against live postgres palace: hybrid /search ran
in 809ms returning 15 candidates reranked to 5; all top hits relevant
to the test query.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 15, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 17, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 18, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 18, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 20, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 21, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 22, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 23, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 24, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 24, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 24, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
rglaubitz pushed a commit to rglaubitz/mempalace that referenced this pull request May 26, 2026
- Dedup union candidates by (full_path, chunk_index), not basename —
  two files sharing a basename in different dirs no longer collide,
  and a vector hit on chunk N of a file no longer blocks BM25 from
  contributing chunk M of the same file.
- Validate candidate_strategy at the top of search_memories so invalid
  values fail consistently, not only when the call routes through the
  vector path.
- Trim hits back to n_results after the union+rerank pool grows;
  preserves the existing search_memories size contract that the MCP
  limit parameter is built on.
- Skip BM25-only injection when max_distance > 0.0; BM25-only
  candidates carry distance=None and would silently bypass the
  caller's strict vector-distance threshold.

Adds 4 tests covering: validation under vector_disabled, n_results
trim, max_distance honoring, and basename-collision dedup.
rglaubitz added a commit to rglaubitz/mempalace that referenced this pull request May 26, 2026
* fix(mcp_server): pass embedding_function= on collection reopen (MemPalace#1299)

`mcp_server._get_collection` bypassed `ChromaBackend.get_collection`
and called `client.get_collection` / `client.create_collection` without
`embedding_function=`. ChromaDB 1.x does not persist the EF identity
with the collection, so the MCP server's reopen silently bound
chromadb's built-in `DefaultEmbeddingFunction` while the miner / Stop
hook ingest path bound `mempalace.embedding.get_embedding_function()`.

On bleeding-edge interpreters (python 3.14 + chromadb 1.5.x on Apple
Silicon, per MemPalace#1299) the default EF's lazy ONNX provider selection could
SIGSEGV the host process on first `col.add()`, killing the MCP stdio
server and leaving every subsequent tool call returning
`Connection closed` until Claude Code was relaunched. Reads worked
because `col.get(ids=...)` and metadata fetches don't invoke the EF;
the auto-ingest path worked because mining routes through the backend
abstraction. Diary writes were the consistent failure surface.

Resolve the EF up front (matching `ChromaBackend._resolve_embedding_function`)
and pass it into both reopen branches. Falls back to the chromadb default
only if `mempalace.embedding.get_embedding_function` itself raises.

Regression test patches the chromadb client class to capture
`embedding_function=` on every `get_collection` / `create_collection`
call from `_get_collection(create=True)` and `_get_collection()`, and
fails if any call omits it.

Follow-up to MemPalace#1262 / MemPalace#1289 (which fixed the metadata-mismatch SIGSEGV
path); this addresses the EF-mismatch SIGSEGV path on the same surface.

* fix(mcp_server): address copilot review on MemPalace#1303

- Resolve the EF inside the two reopen branches that actually call
  `client.get_collection` / `client.create_collection`, so warm-cache
  reads stay zero-cost (no `MempalaceConfig()` / `_resolve_providers`
  on every tool call).
- Reuse `ChromaBackend._resolve_embedding_function()` instead of
  duplicating its try/except + log message + None-fallback.
- Reword the inline + CHANGELOG explanation to clarify that ChromaDB 1.x
  persists the EF *identity* (its `name()`) but not the *instance/
  configuration* — `mempalace.embedding` documents this and spoofs
  `name()` to `"default"` precisely so the identity check passes; the
  bug was the *provider list* (lazy ONNX selection) silently differing.

* fix(hooks): quote CLAUDE_PLUGIN_ROOT / CODEX_PLUGIN_ROOT in hooks.json (MemPalace#1076) (MemPalace#1077)

Shell splits hook command on whitespace after variable expansion, breaking
paths with spaces (e.g. C:\Users\Richard M on Windows). Wrapping the path
in double quotes preserves the token boundary.

Fixes the reported Stop/PreCompact pair in .claude-plugin/hooks/hooks.json
and applies the same fix to .codex-plugin/hooks.json (SessionStart/Stop/
PreCompact), which carries the identical bug.

* fix(cli): write compress output to mempalace_closets so palace can read them (MemPalace#1244)

`cmd_compress` was writing AAAK-compressed drawers to a `mempalace_compressed`
collection, but every read path (`palace.get_closets_collection`,
`searcher.py`, `repair.py`) reads from `mempalace_closets`. Result: for
non-mined palaces (or any palace where the user ran `mempalace compress`
expecting to backfill the closet/index layer), the compressed output was
silently invisible — written to a collection nothing else opens.

Fix the writer rather than renaming the readers: "closets" is the
user-visible feature name baked into the public API
(`get_closets_collection`), the searcher hybrid path, repair/HNSW
diagnostics, and docs. Renaming the readers would churn 15+ call sites
and the README for no benefit. The compressed AAAK strings are exactly
what closets are conceptually — compact pointers scanned by an LLM to
locate the right drawer — so they belong in `mempalace_closets`.

Tests:
- Update `test_cmd_compress_stores_results` to assert the collection
  name passed to `get_or_create_collection` is `mempalace_closets`.
- Add `test_cmd_compress_output_readable_via_get_closets_collection`:
  end-to-end with a real ChromaBackend, seed a drawer, run cmd_compress,
  then read back via the same `get_closets_collection` helper that
  palace.py / searcher use. Regression test for the wrong-collection
  bug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style: ruff format cli.py (MemPalace#1244)

CI requires ruff format --check on the whole touched file. Pre-existing drift, no logic change.

* style: ruff format tests/test_cli.py (PR MemPalace#1319)

* fix(mcp): case-insensitive agent name in diary_write/diary_read (MemPalace#1243)

`tool_diary_write` stored the `agent` metadata verbatim after `sanitize_name`
(which preserves case), while `tool_diary_read` filtered by exact match —
so writing as "Claude" and reading as "claude" silently returned zero rows.

Both endpoints now lowercase `agent_name` immediately after sanitization.
The default per-agent wing slug is also stable across casings since it's
derived from the same normalized form.

Behavior change: entries written prior to this fix under mixed-case agent
names will not match the new lowercase filter; documented under v3.3.5
in CHANGELOG with a `mempalace repair` pointer.

Adds a regression test (`test_diary_read_case_insensitive_agent`) and
updates the existing `test_diary_write_and_read` to assert the new
lowercase agent identity.

Closes MemPalace#1243

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style: ruff format tests/test_mcp_server.py (PR MemPalace#1323)

* fix(backends/chroma): wire quarantine_stale_hnsw into _client() to prevent SIGSEGV on stale HNSW (MemPalace#1121, MemPalace#1132, MemPalace#1263)

PR MemPalace#1173 wired quarantine_stale_hnsw into the static make_client() helper
but not into the instance _client() method. As a result every non-MCP
entry point (CLI mining, search, repair, status) — which all use
get_collection / _get_or_create_collection / _client() — skipped the
cold-start quarantine pass and could SIGSEGV on a stale HNSW segment
left over from a partial flush, replicated palace, or crashed-mid-write.

Refactor: extract the (_fix_blob_seq_ids + gated quarantine_stale_hnsw)
pre-open pass into a single private static helper
ChromaBackend._prepare_palace_for_open(). Both make_client() and
_client() now route through it, so the _quarantined_paths once-per-
palace-per-process gate is preserved (no runtime thrash on hot paths)
and behaviour stays identical — the fix is purely about extending the
existing protection to the path that was missing it.

Tests:

- test_client_quarantines_corrupt_segment_on_first_open mirrors the
  existing make_client test and verifies _client() actually renames a
  corrupt segment on first open.
- test_client_quarantines_only_on_first_call_per_palace verifies the
  cache gate prevents re-running quarantine across repeated _client()
  calls — important because _client() is hit on every backend op.

Closes MemPalace#1121. Closes MemPalace#1132. Closes MemPalace#1263.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style: ruff format touched files (PR MemPalace#1322)

CI requires whole-file format on touched files; pre-existing drift only.

* style: ruff format tests/test_backends.py (PR MemPalace#1322)

* fix(mcp): omit palace_path from tool_status responses (+ docs)

The MCP `mempalace_status` tool was returning the server's absolute
`_config.palace_path` to any connected client on both the main
(ChromaDB-backed) path and the sqlite fallback path that runs when
HNSW divergence is detected (MemPalace#1222). On a single-user local deployment
this is self-disclosure, but in nested-agent or multi-server MCP
topologies the client is a separate trust domain and the absolute
path has no documented client-side use.

Clients that legitimately need the palace path continue to have three
documented channels: the `MEMPALACE_PALACE_PATH` env var (primary) or
its legacy `MEMPAL_PALACE_PATH` alias, the `~/.mempalace/config.json`
file, and the `--palace` CLI flag on most subcommands.

Also corrects stale docs that claimed `mempalace_reconnect` returned a
`palace_path` field; the code returns `{success, message, drawers,
vector_disabled[, vector_disabled_reason]}` on success, plus a no-palace
shape and an exception shape.

- mempalace/mcp_server.py: drop palace_path from tool_status() and
  _tool_status_via_sqlite() result dicts
- website/reference/mcp-tools.md: update documented return shapes for
  mempalace_status (fix) and mempalace_reconnect (stale-docs correction)

Authored-by: Aaron Salsitz (ICCI LLC, @icciaaron). Claude Code was used
as an authoring and review-orchestration tool, with human-in-the-loop
oversight at every step: Aaron wrote the prompts, reviewed each draft,
called for three independent review passes (drafting / post-rebase
technical / CISA-aligned disclosure-leak), and verified the final patch
behavior before commit.

* fix(mcp): basename source_file in tool_get_drawer responses

The MCP `mempalace_get_drawer` tool returned the entire raw drawer
metadata blob to any connected client, and the `source_file` field
in that blob is the absolute filesystem path written by the miners
(`miner.py`, `convo_miner.py` — `source_file = str(filepath)`). On
a single-user local deployment this is self-disclosure, but in
nested-agent or multi-server MCP topologies the client is a separate
trust domain and the host's directory layout has no documented
client-side use.

Mirror the mitigation that `searcher.search_memories()` already applies
on its own return path: reduce `source_file` to its basename via
`Path(source_file).name` before handing the metadata to the client.
Citations still work — the directory layout does not leak.

Companion to #1 (omit palace_path from tool_status). Same threat class,
different surface:

- mempalace_status — palace dir path     → fixed in #1
- mempalace_get_drawer — per-drawer source_file path → this PR

Other read tools were audited and do not leak host paths:
- mempalace_search    — already basenames source_file
- mempalace_list_drawers — returns wing/room/preview only
- mempalace_diary_read   — date/timestamp/topic/content only
- mempalace_reconnect    — success/message/drawers only
- mempalace_kg_*         — entity/predicate strings, counts
- mempalace_check_duplicate — wing/room/preview only

Changes:
- mempalace/mcp_server.py: tool_get_drawer() now basenames metadata.source_file
- tests/test_mcp_server.py: regression test asserting the absolute path
  and its parent directory do not appear anywhere in the response
- website/reference/mcp-tools.md: clarify the documented return shape

* feat(searcher): add candidate_strategy="union" for vector∪BM25 reranking pool

Default search behavior is unchanged. Opt-in candidate_strategy="union"
also pulls top-K BM25-only candidates from sqlite FTS5 and merges them
into the rerank pool, catching docs with strong BM25 signal that the
vector index didn't surface in the over-fetch window.

Motivation
----------
The current hybrid path gathers candidates from the vector index only
(n_results * 3 over-fetch), then BM25-reranks within them. When the
query embeds close to the wrong content semantically, the right doc
never enters the rerank pool — *no matter how wide the over-fetch*.
Tested on a ~6K-document mixed corpus (knowledge prose + short structured
records): at *30x* over-fetch (~5% of the corpus) the target doc still
didn't surface for narrative-shaped queries targeting terminology guides.
Wider over-fetch isn't the answer; widening the pool's *source* is.

Concrete failure mode: a narrative-shaped query embeds close to records
sharing the same operational vocabulary (other narrative entries in the
corpus). A terminology / style guide is BM25-strong for the query
(rare keywords the guide repeats) but vector-distant. Vector-only
candidates don't include it; BM25 never gets to rerank it. The hybrid
path produces 0.00 recall on a probe that pure BM25 alone scores 1.00 —
the hybrid is worse than its component on the same input.

Behavior change
---------------
* New parameter ``candidate_strategy: str = "vector"`` on ``search_memories``.
  - ``"vector"`` (default): historical behavior, no change.
  - ``"union"``: also fetch top ``n_results * 3`` candidates via the
    existing ``_bm25_only_via_sqlite`` helper, dedupe by source_file,
    merge into the rerank pool. BM25-only candidates carry
    ``distance=None`` so they're scored on BM25 contribution alone
    (vec_sim coerces to 0).
* ``_hybrid_rank`` now handles ``distance=None`` explicitly, scoring
  such candidates as vector-unknown (vec_sim=0) rather than treating
  it as max-distance via shim.
* New strategies register via ``_CANDIDATE_MERGERS``; dispatch is in
  ``_apply_candidate_strategy`` so ``search_memories`` stays under the
  C901 complexity ceiling.

Bench numbers (~6K-doc internal mixed corpus, recall@10, 5 probes spanning
policy-exception lookup, temporal-decay, style retrieval, set-difference,
and pattern-recognition):

                              baseline ("vector")   "union"
  policy-exception probe        0.00                  0.50    +0.50
  temporal-decay probe          0.17                  0.50    +0.33
  style-retrieval probe         0.00                  1.00    +1.00 (PASSES)
  set-difference probe          0.00–0.06             0.06–0.09  ~
  pattern-recog probe           0.64 (stable)         0.50–0.71  variance, typ. +0.07
  macro recall                  0.16–0.17             0.51–0.56  +0.34 to +0.40

The pattern-recog variance points at a related issue worth a separate PR:
``_hybrid_rank`` computes BM25 IDF over the candidate set. Adding new
candidates re-normalizes BM25 for *existing* candidates non-monotonically.
Stable corpus-wide BM25 would remove this. Out of scope here.

Tests
-----
``tests/test_hybrid_candidate_union.py`` (6 tests, all pass):
- default behavior unchanged (explicit ``"vector"`` matches default)
- ``"union"`` surfaces a BM25-strong vector-distant doc
- ``"union"`` doesn't drop docs ``"vector"`` would have found
- empty-palace handling
- invalid ``candidate_strategy`` raises
- ``_hybrid_rank`` tolerates ``distance=None``

Existing ``test_hybrid_search.py`` (5) and ``test_searcher.py`` (27) pass.

Performance note
----------------
Each ``"union"`` query adds one sqlite open + FTS5 MATCH + metadata
fetch (via the existing ``_bm25_only_via_sqlite`` helper, which already
runs as the ``vector_disabled`` fallback path so the code is
well-trodden). Per-query overhead is small but unmeasured at corpus
scale. Default stays ``"vector"`` until a maintainer characterizes the
cost.

* fix(searcher): address Copilot review on MemPalace#1306

- Dedup union candidates by (full_path, chunk_index), not basename —
  two files sharing a basename in different dirs no longer collide,
  and a vector hit on chunk N of a file no longer blocks BM25 from
  contributing chunk M of the same file.
- Validate candidate_strategy at the top of search_memories so invalid
  values fail consistently, not only when the call routes through the
  vector path.
- Trim hits back to n_results after the union+rerank pool grows;
  preserves the existing search_memories size contract that the MCP
  limit parameter is built on.
- Skip BM25-only injection when max_distance > 0.0; BM25-only
  candidates carry distance=None and would silently bypass the
  caller's strict vector-distance threshold.

Adds 4 tests covering: validation under vector_disabled, n_results
trim, max_distance honoring, and basename-collision dedup.

* fix(mcp): forward valid_to and source params in kg_add/kg_invalidate (MemPalace#1314)

`tool_kg_add` previously accepted only `valid_from` and `source_closet`,
silently dropping `valid_to`, `source_file`, and `source_drawer_id` at
the MCP boundary. Backfilling already-ended historical facts therefore
collapsed to "still current," and adapter provenance never reached
the SQLite layer even though `KnowledgeGraph.add_triple` already
supported every column.

`tool_kg_invalidate` returned the literal string `"today"` whenever the
caller omitted `ended`, hiding the actual stamped date from anyone trying
to verify what got persisted.

Changes:
- Extend `tool_kg_add` signature + MCP input_schema with `valid_to`,
  `source_file`, `source_drawer_id`; forward all of them to
  `_kg.add_triple` and to the WAL log.
- Resolve `ended` to `date.today().isoformat()` in `tool_kg_invalidate`
  before logging / returning, so the response always reports the actual
  date stored in `valid_to`.
- Add regression tests for valid_to round-trip, source_file /
  source_drawer_id provenance, and the resolved-ended-date contract.
- Leave TODO(MemPalace#1283) markers so the open ISO-8601 validation PR can drop
  `validate_iso_date` over `valid_from` / `valid_to` / `ended` cleanly.

The underlying `KnowledgeGraph.add_triple` already accepted these
kwargs (RFC 002 §5.5) — only the MCP edge needed wiring up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style: ruff format test_mcp_server.py (PR MemPalace#1320)

* docs(mcp): drop §5.5 from kg_add docstring/schema

The repo's anti-jargon meta-test bans §N markers outside the
sources/backends allowlist. mcp_server.py isn't allowlisted, so the
"RFC 002 §5.5" references added in this PR turned the test red.
Trim to "RFC 002" — section number isn't load-bearing for the description.

* fix(cli): honor --palace flag in cmd_init (MemPalace#1313)

cmd_init was instantiating MempalaceConfig() unconditionally, ignoring
args.palace and always writing the palace under ~/.mempalace. Mirror
the env-var pattern used by mcp_server.py (and consistent with how
cmd_mine / cmd_status / cmd_search resolve --palace) so every
downstream read of cfg.palace_path inside cmd_init — Pass 0,
cfg.init(), and the post-init mine — routes to the user-specified
location.

Adds tests/test_cli.py::test_cmd_init_honors_palace_flag covering the
regression: asserts Pass 0 receives the --palace value (not
~/.mempalace) and that MEMPALACE_PALACE_PATH is set in os.environ.

Closes MemPalace#1313.

* test(cli): prime monkeypatch undo so palace env doesn't leak

monkeypatch.delenv(name, raising=False) on a missing key registers no
undo entry, so the env var cmd_init writes leaked into test_config_from_file
on Python 3.13 / Windows / macOS.

Prime the slot with setenv before delenv so teardown rolls back the write.

* feat(mcp): persist drawer metadata

* fix: address metadata passthrough review blockers

* Close metadata passthrough review findings

F03 now logs corrupted drawer metadata decode failures through the mempalace_mcp logger, truncating the stored value to value_head so operators get an identifier without leaking the full payload.

F15 is Case B: ChromaBackend._resolve_embedding_function() can return None because it catches any exception from get_embedding_function() and returns None after logging; ChromaBackend.get_collection() mirrors the same conditional kwargs behavior and does not repair that case.

For the MCP _get_collection call site, treat a None EF as a fail-closed condition and never reopen a collection without embedding_function=, so the MemPalace#1299 default-EF regression path cannot silently reappear there.

Validation: uv run pytest -q; uv run ruff check .

---------

Co-authored-by: Igor Lins e Silva <4753812+igorls@users.noreply.github.com>
Co-authored-by: Mikhail Valentsev <michael@valentsev.ru>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: icciAaron <asalsitz@icci.com>
mvalentsev added a commit to mvalentsev/mempalace that referenced this pull request May 30, 2026
scoring site inside _bm25_only_via_sqlite that this PR did not cover --
the original wiring landed before MemPalace#1306 existed. Without the plumbing,
two paths silently bypass locale stop_words even when the palace has
MEMPALACE_LANG set:

- vector_disabled=True (MemPalace#1222 fallback): BM25-only scoring runs without
  filtering.
- candidate_strategy="union": BM25 candidates merged into the rerank
  pool come from a tokenizer that ignored the configured locale, so
  the merged-in entries fight the lang-aware _hybrid_rank rerank.

Resolution moves once: stop_words = _resolve_stop_words(lang) is
hoisted before the vector_disabled branch and threaded through
_apply_candidate_strategy, _merge_bm25_union_candidates, and
_bm25_only_via_sqlite into _bm25_scores.

The FTS5 candidate-selection _tokenize at the top of
_bm25_only_via_sqlite is left untouched -- chromadb's FTS5 index
is built with a trigram tokenizer that already mismatches our
\\w{2,} regex, so dropping further tokens there changes selection
semantics in ways outside this PR's scope.

Three tests pin the propagation: a sqlite-backed test for
_bm25_only_via_sqlite -> _bm25_scores, a unit spy on
_apply_candidate_strategy -> merger, and an end-to-end on
search_memories(vector_disabled=True) -> _bm25_only_via_sqlite.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/search Search and retrieval enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants