fix: degrade gracefully when filtered Chroma search fails #951
Open
Dialectician wants to merge 4 commits into
…arch and add_drawer
Add three capabilities that enable programmatic memory consumers (like MCP-based agent tools) to store and retrieve structured metadata alongside drawer content:
1. `mempalace_add_drawer` accepts optional `metadata` dict — custom key/value pairs stored in ChromaDB metadata alongside built-in fields (wing, room, source_file, etc.). Values must be str/int/float/bool per ChromaDB requirements. Built-in fields cannot be overridden.
2. `mempalace_search` accepts optional `where` dict — passed through to ChromaDB's where parameter for metadata filtering. Supports all ChromaDB operators ($eq, $gt, $gte, $lt, $lte, $in, $nin, $and, $or). Combined with wing/room filters via $and.
3. `mempalace_search` accepts optional `sort_by` parameter — "relevance" (default, current behavior) or "recency" (sort by filed_at descending after ChromaDB returns similarity results).
CLI also updated: `mempalace search --where '{"category":"session-handoff"}' --sort recency`
All 89 existing tests pass. Test fixtures updated for new parameter signatures.
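
The filter composition and recency sort described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the commit: the helper names `build_where` and `sort_hits` are hypothetical, and it assumes `filed_at` is stored as an ISO-8601 string so that lexical order matches chronological order.

```python
from typing import Optional


def build_where(wing: Optional[str] = None,
                room: Optional[str] = None,
                where: Optional[dict] = None) -> Optional[dict]:
    """Combine wing/room scoping with a caller-supplied filter via $and."""
    clauses = []
    if wing:
        clauses.append({"wing": wing})
    if room:
        clauses.append({"room": room})
    if where:
        clauses.append(where)
    if not clauses:
        return None
    # ChromaDB's filter validation expects $and to hold at least two
    # clauses, so a lone clause is passed through unwrapped.
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}


def sort_hits(hits: list, sort_by: str = "relevance") -> list:
    """Keep similarity order, or re-sort by filed_at descending."""
    if sort_by == "recency":
        # Assumption: filed_at is an ISO-8601 string; missing values sort last.
        return sorted(hits, key=lambda h: h.get("filed_at") or "", reverse=True)
    return list(hits)
```

With `wing="work"` and `where={"category": "session-handoff"}`, `build_where` yields `{"$and": [{"wing": "work"}, {"category": "session-handoff"}]}`, which is the shape ChromaDB's `where` parameter accepts.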
…data-filtering
# Conflicts:
#   mempalace/mcp_server.py
#   mempalace/searcher.py
This was referenced Apr 18, 2026
jphein
added a commit
to techempower-org/mempalace
that referenced
this pull request
Apr 18, 2026
…erdelivers

When ChromaDB's HNSW index is sparse, drifted, or rejects a filter (MemPalace#951's "Error finding id"), the current search path returns fewer hits than the palace actually contains -- silently. A user querying for drawers that exist in sqlite sees an empty or partial result and concludes the palace has forgotten, when the metadata segment still has the data and only the vector ranking path is degraded. Silent hit-miss is worse than a crash because callers can't detect it.

This is especially painful right after `mempalace repair`, mid-HNSW rebuild, or on palaces where MemPalace#823's default `sync_threshold` has left the on-disk HNSW lagging behind sqlite.

## Contract

`search_memories()` no longer hard-fails on vector errors; it always returns a result (except "no palace found"). The return dict gains:

    "warnings": [str, ...]       # why we couldn't get more
    "available_in_scope": int    # sqlite-authoritative scope count

Behavior:

1. Run the vector query. On exception, add a warning ("vector search unavailable: <err>") and continue.
2. After hybrid ranking, count sqlite drawers matching the scope -- the ceiling of what should have been returned.
3. If `len(hits) < n_results` and the caller did NOT set `max_distance` (strict-similarity mode), top up from the sqlite pool via BM25 keyword ranking. Fallback hits are tagged `matched_via="sqlite_bm25_fallback"`; distance/similarity are `None` since there's no vector score. Drawers with zero query-term overlap are skipped -- no padding with arbitrary content.
4. If scope > returned, add a warning pointing at `mempalace repair`.

CLI `search()` now delegates to `search_memories` so both paths share the same fallback and warning surface. Warnings print with a "!" prefix above the results.

Extracted the fallback + scope-count into `_sqlite_fallback_and_scope()` to keep `search_memories` under ruff's C901 complexity threshold.

## Closes/relates

Sibling to MemPalace#951 (filter-planner fallback). MemPalace#951 catches the specific `Error finding id` and falls back to `col.get` at the query layer; this PR widens the pattern to any vector underdelivery and makes the degradation visible. They compose cleanly.

Addresses half of MemPalace#823 (the read-side silent-staleness symptom). The other half -- lowering default `sync_threshold` so drift is less likely -- is orthogonal and can land separately.

## Tests

- `test_search_memories_fills_from_sqlite_when_vector_underdelivers` -- mock vector returns 1 hit, sqlite has 4; fallback promotes 2 by BM25; the 1 with no query-term overlap is skipped.
- `test_search_memories_query_error_degrades_to_warning` -- vector raises, warning surfaced, no hard failure.
- `test_search_query_error_degrades_to_warning` -- CLI no longer raises on vector failure, prints the warning.

4 prior tests updated for the new contract (scope count, warnings list, no `SearchError` on query failure). 974 tests pass, ruff clean.
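
The BM25 top-up in step 3 of the contract could look something like the sketch below. It is a minimal illustration under stated assumptions, not the fork's `_sqlite_fallback_and_scope()`: the names `bm25_scores` and `top_up`, the `{"id", "text"}` drawer shape, the `k1`/`b` defaults, and the warning wording are all hypothetical. It uses plain Okapi BM25 over the candidate pool and skips any drawer whose score is zero.

```python
import math
import re
from collections import Counter


def _tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each document in `docs` against `query`."""
    tokenized = [_tokens(d) for d in docs]
    n = len(tokenized)
    avg_len = (sum(len(t) for t in tokenized) / n) if n else 0.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    q_terms = set(_tokens(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in q_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_up(query, vector_hits, pool, n_results, max_distance=None):
    """Fill vector_hits from pool (dicts with "id" and "text") up to n_results."""
    hits = list(vector_hits)
    warnings = []
    # Strict-similarity callers (max_distance set) get no keyword padding.
    if len(hits) >= n_results or max_distance is not None:
        return hits, warnings
    seen = {h["id"] for h in hits}
    candidates = [d for d in pool if d["id"] not in seen]
    scores = bm25_scores(query, [d["text"] for d in candidates])
    ranked = sorted(zip(scores, candidates), key=lambda p: p[0], reverse=True)
    for score, doc in ranked:
        # A zero score means no query-term overlap: never pad with it.
        if score <= 0 or len(hits) >= n_results:
            break
        hits.append({"id": doc["id"], "text": doc["text"],
                     "distance": None, "similarity": None,
                     "matched_via": "sqlite_bm25_fallback"})
    if len(hits) > len(vector_hits):
        warnings.append("vector search under-delivered; topped up via BM25")
    return hits, warnings
```

Leaving `distance` and `similarity` as `None` on fallback hits keeps callers from mistaking a keyword score for a vector distance.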
jphein
added a commit
to techempower-org/mempalace
that referenced
this pull request
Apr 18, 2026
README:
- Fork-changes table: expand the None-metadata row to cover all 8 sites (searcher.py CLI + API + closet-boost, miner.status, 4 mcp_server handlers). Previous row only called out the CLI print path.
- Add a new Search row: warnings + sqlite BM25 top-up contract (the "never silent miss" feature) with pointer to MemPalace#951 + MemPalace#823.
- Open-PR table: expand MemPalace#999 scope line to mention 8 sites + architectural note, update MemPalace#1000 to reflect post-MemPalace#995 rebase, add MemPalace#1005 with Copilot fixes noted.

CLAUDE.md:
- PR status header: 7 open -> 8 open (adds MemPalace#1005).
- Same PR row updates as README for MemPalace#999/MemPalace#1000/MemPalace#1005.
- Fork Changes list: expand entry 11 (None guards) to 8 sites + adapter consolidation proposal on MemPalace#999; add entry 14 for the warnings+BM25 feature; keep 12 and 13 as-is.

42 README-claim tests still pass.
jphein
added a commit
to techempower-org/mempalace
that referenced
this pull request
May 6, 2026
When the vector index returns fewer than n_results (sparse HNSW post-repair, MemPalace#951 filter-planner failure, drift), search_memories now:

1. Computes an authoritative scope count via paginated col.get(), surfaced as `available_in_scope` in the response. Caps each query below MemPalace#950's SQL-variable limit.
2. Tops up the hits list with BM25-ranked sqlite candidates tagged `matched_via: "sqlite_bm25_fallback"` when the vector path is under-delivering. Skips candidates with BM25 score 0 so the fallback never pads with unrelated content.
3. Returns `warnings: [...]` describing when fallback fired and when the scope contains more drawers than the vector path can rank (gated on a `vector_underdelivered` flag captured before fallback runs, so the warning surfaces even when BM25 papered over the gap).

CLI search() delegates to search_memories() so terminal output and MCP responses share the same retrieval, fallback, and warning semantics. Preserves the palace path in printed errors.

Closes the silent 0-hit failure mode where data was in sqlite but the vector path returned nothing -- visible to the user via warnings and `available_in_scope`, fixable via `mempalace repair`.

Tests: 29/29 pass on rebased branch (Python 3.9 floor honored via Optional[int]). Mock setup updated to set count.return_value so the new "more in scope" warning path doesn't fail on MagicMock comparison.

Squashed rebase against current upstream/develop (post-MemPalace#1377). Was filed as 5-commit history; squashed for cleaner review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
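
The paginated scope count from item 1 and the `vector_underdelivered` gating from item 3 might be sketched as below. This is a hedged illustration rather than the commit's code: `count_in_scope`, `scope_warnings`, the `page_size` of 500, and the warning text are assumptions, and `FakeCollection` is a stand-in that mimics only the `get(where=, limit=, offset=, include=)` call shape of a ChromaDB collection with equality filters.

```python
from typing import Optional


class FakeCollection:
    """Stand-in for a ChromaDB collection; supports equality filters only."""

    def __init__(self, metadatas):
        self._rows = [(f"id{i}", m) for i, m in enumerate(metadatas)]

    def get(self, where=None, limit=None, offset=None, include=None):
        rows = [r for r in self._rows
                if not where or all(r[1].get(k) == v for k, v in where.items())]
        start = offset or 0
        end = None if limit is None else start + limit
        return {"ids": [r[0] for r in rows[start:end]]}


def count_in_scope(col, where: Optional[dict] = None, page_size: int = 500) -> int:
    """Count drawers matching `where` by paging through col.get()."""
    total, offset = 0, 0
    while True:
        # include=[] asks for ids only, keeping each page small.
        page = col.get(where=where, limit=page_size, offset=offset, include=[])
        ids = page["ids"]
        total += len(ids)
        if len(ids) < page_size:  # a short (or empty) page is the last one
            return total
        offset += page_size


def scope_warnings(vector_hit_count: int, n_results: int, available_in_scope: int) -> list:
    """Build the 'more in scope' warning from the pre-fallback vector count."""
    # Captured from the vector path alone, so a later BM25 top-up
    # cannot hide the fact that the index under-delivered.
    vector_underdelivered = vector_hit_count < n_results
    if vector_underdelivered and available_in_scope > vector_hit_count:
        return [f"{available_in_scope} drawers in scope but vector search ranked "
                f"{vector_hit_count}; consider running `mempalace repair`"]
    return []
```

Paging keeps each `get()` call bounded, which is the property the commit relies on to stay under the SQL-variable limit it mentions.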
Summary
This fixes a filtered search failure mode on migrated ChromaDB palaces and aligns the repo's dev lockfile with the ChromaDB major version MemPalace already supports in practice.
What changes
- `search_memories()` for filtered drawer queries:
  - try `Collection.query(where=...)` first
  - fall back to `get()` plus local lexical ranking instead of returning an error
- `tests/test_searcher.py`
- `chromadb>=0.5.0` to `chromadb>=1.0.0`
- `uv.lock` so the repo/dev environment matches the modern ChromaDB line

Why
On a migrated palace, filtered Chroma queries can fail with:
`Error executing plan: Internal error: Error finding id`

Important details from validation:
- `chromadb 1.5.7` databases do not reproduce this
- `query(where=...)` fails
- `get(where=...)` on the same filter can still succeed

That means MemPalace should degrade gracefully instead of surfacing a hard search failure when the filtered query planner hits this state.
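
A rough sketch of that degradation path, assuming the failure can be recognised by its message text and that a simple term-overlap count is an acceptable stand-in for the "local lexical ranking" mentioned above. The function name `filtered_search` and the `lexical_fallback` tag are hypothetical; only `Collection.query(query_texts=, n_results=, where=)` and `Collection.get(where=, include=)` are real ChromaDB calls.

```python
import re


def _terms(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def filtered_search(col, query, where, n_results=5):
    """Try the vector query first; on the planner error, rank get() rows locally."""
    try:
        res = col.query(query_texts=[query], n_results=n_results, where=where)
        return [{"id": i, "text": d, "matched_via": "vector"}
                for i, d in zip(res["ids"][0], res["documents"][0])]
    except Exception as exc:
        # Assumption: the planner failure is identified by its message.
        # Any other error is re-raised untouched.
        if "Error finding id" not in str(exc):
            raise
    # get() with the same filter can still succeed, so rank its rows
    # by how many distinct query terms each document shares.
    rows = col.get(where=where, include=["documents"])
    q = _terms(query)
    scored = [(len(q & _terms(doc or "")), doc_id, doc)
              for doc_id, doc in zip(rows["ids"], rows["documents"])]
    scored.sort(key=lambda s: s[0], reverse=True)  # stable: ties keep get() order
    return [{"id": i, "text": d, "matched_via": "lexical_fallback"}
            for _, i, d in scored[:n_results]]
```

Matching on the message rather than swallowing every exception keeps unrelated failures (permissions, a missing collection) visible to the caller.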
Verification
- `./.venv/bin/python -m pytest tests/test_searcher.py` passes

Notes on ChromaDB versions
The repo lockfile had drifted behind the installed/runtime environment:
- `.venv`: `chromadb 0.6.3`
- `chromadb 1.5.7`

This PR updates the repo floor/lockfile to the 1.x line so local development and verification reflect the actual supported/runtime environment.