perf: L1 importance pre-filter — skip full scan when enough high-importance drawers exist#660
perf: L1 importance pre-filter — skip full scan when enough high-importance drawers exist#660jphein wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Improves Layer1 wake-up performance by introducing a two-phase drawer fetch strategy (importance pre-filter fast-path, with full-scan fallback) and adding a hard cap on scanned drawers.
Changes:
- Add
Layer1._fetch_drawers()to tryimportance >= 3first, falling back to an unfiltered scan when needed - Introduce
MAX_SCAN = 2000to cap how many drawers are iterated during L1 generation - Update Layer1 tests/mocks to accommodate fast-path + fallback behavior and updated
whereshapes
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| mempalace/layers.py | Adds two-phase fetch (_fetch_drawers) and scan cap (MAX_SCAN), and refactors generate() to use it |
| tests/test_layers.py | Updates Chroma mocks and assertions for the new fast-path/fallback behavior and where composition |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Fast path: only fetch drawers with importance >= 3 | ||
| importance_filter = {"importance": {"$gte": 3}} | ||
| if self.wing: | ||
| where = {"$and": [{"wing": self.wing}, importance_filter]} | ||
| else: | ||
| where = importance_filter |
There was a problem hiding this comment.
The fast-path filter only considers metadata key importance ({"importance": {"$gte": 3}}), but generate() scores drawers using importance, emotional_weight, or weight. If drawers commonly use emotional_weight/weight (and not importance), the fast path can exclude high-scoring drawers and change L1 output whenever len(docs) >= MAX_DRAWERS. Consider aligning the filter with the scoring keys (e.g., normalize metadata to a single key at write-time, or use a $or over the supported keys) so the optimization doesn’t change semantics.
| # Fast path: only fetch drawers with importance >= 3 | |
| importance_filter = {"importance": {"$gte": 3}} | |
| if self.wing: | |
| where = {"$and": [{"wing": self.wing}, importance_filter]} | |
| else: | |
| where = importance_filter | |
| # Fast path: fetch drawers that meet any supported scoring weight key. | |
| # This must stay aligned with generate(), which scores drawers using | |
| # importance, emotional_weight, or weight. | |
| score_filter = { | |
| "$or": [ | |
| {"importance": {"$gte": 3}}, | |
| {"emotional_weight": {"$gte": 3}}, | |
| {"weight": {"$gte": 3}}, | |
| ] | |
| } | |
| if self.wing: | |
| where = {"$and": [{"wing": self.wing}, score_filter]} | |
| else: | |
| where = score_filter |
There was a problem hiding this comment.
Documented — added comment explaining the design: importance is the primary signal (rarely set without emotional_weight/weight), and the fallback full-scan ensures nothing is missed.
| if len(batch_docs) < _BATCH or len(docs) >= self.MAX_SCAN: | ||
| break | ||
| except Exception: | ||
| return "## L1 — No palace found. Run: mempalace mine <dir>" | ||
| docs, metas = [], [] | ||
|
|
||
| # Fetch all drawers in batches to avoid SQLite variable limit (~999) | ||
| _BATCH = 500 | ||
| # If enough high-importance drawers, use them | ||
| if len(docs) >= self.MAX_DRAWERS: | ||
| return docs, metas |
There was a problem hiding this comment.
Capping the fast-path results (len(docs) >= MAX_SCAN) and then returning them when len(docs) >= MAX_DRAWERS means generate() is no longer guaranteed to select the true top-15 drawers by importance across the whole palace: col.get() order isn’t tied to importance, and the highest-importance drawers may exist beyond the first MAX_SCAN matches. This also contradicts the PR description’s claim that output is unchanged when enough high-importance drawers exist. If exactness matters, avoid the cap on the candidate set (or use a deterministic multi-threshold strategy like querying for importance>=5, then >=4, etc. until 15 are collected).
There was a problem hiding this comment.
Documented — added multi-line comment explaining the MAX_SCAN tradeoff: prevents O(n) scans on 100K+ palaces, accepts approximate top-15. L3 deep search covers full corpus when precision matters.
| assert call_kwargs.get("where") == {"wing": "project_x"} | ||
| where = call_kwargs.get("where", {}) | ||
| assert "$and" in where | ||
| assert {"wing": "project_x"} in where["$and"] |
There was a problem hiding this comment.
This test asserts that the fast-path where contains the wing filter, but it no longer asserts that the importance pre-filter is also present. Adding an assertion for { "importance": {"$gte": 3} } inside the $and list would prevent regressions where the optimization silently stops applying the importance constraint.
| assert {"wing": "project_x"} in where["$and"] | |
| assert {"wing": "project_x"} in where["$and"] | |
| assert {"importance": {"$gte": 3}} in where["$and"] |
There was a problem hiding this comment.
Fixed — added assert {"importance": {"$gte": 3}} in where["$and"] to verify the importance pre-filter is present.
| """Return a mock collection whose get() returns docs/metas. | ||
|
|
||
| Layer1._fetch_drawers() has two phases: a fast-path (importance pre-filter) | ||
| and a fallback full-scan. For small test datasets (< 500 items), each phase | ||
| makes exactly one col.get() call before breaking (len < _BATCH). We | ||
| provide two identical responses: one consumed by the fast path and one by | ||
| the fallback. | ||
| """ | ||
| mock_col = MagicMock() | ||
| # First batch returns data, second batch returns empty (end of pagination) | ||
| mock_col.get.side_effect = [ | ||
| {"documents": docs, "metadatas": metas}, | ||
| {"documents": [], "metadatas": []}, | ||
| {"documents": docs, "metadatas": metas}, # fast-path batch | ||
| {"documents": docs, "metadatas": metas}, # fallback batch (< MAX_DRAWERS → fallback) | ||
| ] |
There was a problem hiding this comment.
The helper docstring says it provides “two identical responses: one consumed by the fast path and one by the fallback”, but several tests pass >= MAX_DRAWERS docs so the fallback isn’t executed and the second response is unused. Consider rewording the docstring/comments to clarify that the second response is only consumed when the fallback path is taken.
There was a problem hiding this comment.
Fixed — updated docstring to accurately state the second response is only consumed when the fast path returns fewer than MAX_DRAWERS results.
| MAX_SCAN = 2000 # don't scan more than this for L1 generation | ||
|
|
There was a problem hiding this comment.
New behavior around pagination/capping isn’t exercised by tests: _fetch_drawers() now has a hard stop at MAX_SCAN and can issue multiple col.get() calls when a batch returns exactly _BATCH items (e.g., 500). Consider adding a unit test that simulates >500 results and verifies the method stops at MAX_SCAN (and/or verifies the number of get() calls) so this performance guardrail doesn’t regress.
There was a problem hiding this comment.
Fixed — added two pagination tests: test_layer1_pagination_stops_at_max_scan (500+100 across 2 pages) and test_layer1_pagination_caps_at_max_scan (full 500-item batches with MAX_SCAN=1200, verifies loop stops after 3 pages).
Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Closing duplicate PR created by network timeout. |
3 similar comments
|
Closing duplicate PR created by network timeout. |
|
Closing duplicate PR created by network timeout. |
|
Closing duplicate PR created by network timeout. |
Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
cc9eec4 to
e89c0b6
Compare
Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
70a58d9 to
4604a48
Compare
fork-direction.md and TODO-fork-improvements.md were scattered strategic thinking that belongs in the front-door README: competitive landscape, roadmap (P0-P6), and open problems. Merges them in and deletes the standalone files. PR-status tables (README + CLAUDE.md) were stale after today's rebases of MemPalace#659, MemPalace#660, MemPalace#661, MemPalace#673, MemPalace#681 — updated to reflect current mergeable state. MemPalace#673 specifically noted as cleanly rebased against MemPalace#863. test_readme_claims.py skips MCP-tool-table and dialect-reference checks when absent — our slimmed fork README doesn't reproduce upstream's tool table structure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Friendly ping on review — small perf optimization: skip the full metadata scan in |
Every remaining row in "Still ahead of upstream" now carries a status so the reader can tell at a glance whether each change is being upstreamed, pending a PR, or deliberately fork-local. Dropped: - "Guard ChromaDB 1.5.x metadata-mismatch segfault" — this row was overstated. The memory file for today's debugging notes that the try-get/except-create pattern is defensive code that never reproduced a specific crash (the actual crashes traced to HNSW drift). Leaving it in "Still ahead" implied an upstream-candidate fix, which it isn't. Code stays in place as defensive, but the README no longer claims it as a fork-ahead feature. Moved to Superseded: - "Stale HNSW mtime detection + mempalace_reconnect" — upstream took a different approach in MemPalace#757. Our broader inode+mtime detection and the mempalace_reconnect MCP tool remain as fork-local convenience; they're just not "ahead of upstream" anymore. Statuses now populated: - Linked PR number for the 7 changes with active upstream PRs (MemPalace#659, MemPalace#660, MemPalace#661, MemPalace#673 with APPROVED note, MemPalace#999, MemPalace#1000, MemPalace#1005). - "PR pending" for 3 items that are good candidates but unfiled: epsilon mtime comparison, max_distance parameter, tool output mining. - "fork-only" for 2 items we keep intentionally without pitching upstream: .blob_seq_ids_migrated marker (narrow), bulk_check_mined (complementary to upstream's MemPalace#784 file-locking). Legend sentence added above the table explains the three status values. 42 README-claim tests pass.
Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
4604a48 to
8f0b04a
Compare
Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
8f0b04a to
9352c70
Compare
…rtance drawers exist
At 100K+ drawers, Layer1.generate() was scanning everything to find the
top 15 by importance. Add a fast path that queries ChromaDB with
`where={"importance": {"$gte": 3}}` first. Only falls back to the
full scan when fewer than MAX_DRAWERS (15) results come back, ensuring
correctness on small or sparse palaces.
- Add `Layer1._fetch_drawers(col)` implementing the two-phase strategy
- Add `Layer1.MAX_SCAN = 2000` to cap fallback scan depth
- Update `test_layer1_*` mocks to cover both fast-path and fallback calls
- Update `test_layer1_with_wing_filter` assertion to match new `$and` filter shape
- Reframe `test_layer1_batch_exception_breaks` to verify fallback recovery
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
9352c70 to
0752573
Compare
Problem
Layer1.generate()(used inMemoryStack.wake_up()) scans all drawers in the palace to find the top 15 by importance. On a palace with 100K+ drawers this is slow — it paginates through everything in 500-document batches before picking the top handful.Solution
Add a two-phase fetch strategy in a new
Layer1._fetch_drawers(col)method:where={"importance": {"$gte": 3}}(combined with wing filter if set). If >=MAX_DRAWERS(15) results come back, use them directly — no full scan needed.MAX_SCAN = 2000drawers.This is a pure performance improvement — the output of
generate()is unchanged for any palace with at least 15 drawers at importance >= 3.Changes
mempalace/layers.py—Layer1class:MAX_SCAN = 2000class constant_fetch_drawers(col) -> tuplemethod with fast-path + fallback logicgenerate()delegates to_fetch_drawers()instead of inlining the batch looptests/test_layers.py:_mock_chromadb_for_layer()to provide responses for both fast-path and fallback callstest_layer1_with_wing_filterassertion — fast-pathwhereis now{"$and": [{"wing": ...}, {"importance": {"$gte": 3}}]}test_layer1_batch_exception_breaks— exception on fast-path triggers fallback recovery (tests that path)Testing
All 598 tests pass (
python -m pytest tests/ -x -q). The two-phase logic is exercised by both the existing tests (updated mocks) and by thetest_layer1_batch_exception_breakstest which verifies graceful fallback when the fast-path raises.🤖 Generated with Claude Code