perf: L1 importance pre-filter — skip full scan when enough high-importance drawers exist by jphein · Pull Request #660 · MemPalace/mempalace

jphein · 2026-04-12T02:43:49Z

Problem

Layer1.generate() (used in MemoryStack.wake_up()) scans all drawers in the palace to find the top 15 by importance. On a palace with 100K+ drawers this is slow — it paginates through everything in 500-document batches before picking the top handful.

Solution

Add a two-phase fetch strategy in a new Layer1._fetch_drawers(col) method:

Fast path: query ChromaDB with where={"importance": {"$gte": 3}} (combined with wing filter if set). If >= MAX_DRAWERS (15) results come back, use them directly — no full scan needed.
Fallback: if fewer than 15 high-importance drawers exist (small or sparse palace), fall back to the existing full scan, capped at MAX_SCAN = 2000 drawers.

This is a pure performance improvement — the output of generate() is unchanged for any palace with at least 15 drawers at importance >= 3.

Changes

mempalace/layers.py — Layer1 class:
- Add MAX_SCAN = 2000 class constant
- Add _fetch_drawers(col) -> tuple method with fast-path + fallback logic
- generate() delegates to _fetch_drawers() instead of inlining the batch loop
tests/test_layers.py:
- Update _mock_chromadb_for_layer() to provide responses for both fast-path and fallback calls
- Update test_layer1_with_wing_filter assertion — fast-path where is now {"$and": [{"wing": ...}, {"importance": {"$gte": 3}}]}
- Update test_layer1_batch_exception_breaks — exception on fast-path triggers fallback recovery (tests that path)

Testing

All 598 tests pass (python -m pytest tests/ -x -q). The two-phase logic is exercised by both the existing tests (updated mocks) and by the test_layer1_batch_exception_breaks test which verifies graceful fallback when the fast-path raises.

🤖 Generated with Claude Code

Copilot

Pull request overview

Improves Layer1 wake-up performance by introducing a two-phase drawer fetch strategy (importance pre-filter fast-path, with full-scan fallback) and adding a hard cap on scanned drawers.

Changes:

Add Layer1._fetch_drawers() to try importance >= 3 first, falling back to an unfiltered scan when needed
Introduce MAX_SCAN = 2000 to cap how many drawers are iterated during L1 generation
Update Layer1 tests/mocks to accommodate fast-path + fallback behavior and updated where shapes

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File	Description
mempalace/layers.py	Adds two-phase fetch (`_fetch_drawers`) and scan cap (`MAX_SCAN`), and refactors `generate()` to use it
tests/test_layers.py	Updates Chroma mocks and assertions for the new fast-path/fallback behavior and `where` composition

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-12T02:47:36Z

+        # Fast path: only fetch drawers with importance >= 3
+        importance_filter = {"importance": {"$gte": 3}}
+        if self.wing:
+            where = {"$and": [{"wing": self.wing}, importance_filter]}
+        else:
+            where = importance_filter


The fast-path filter only considers metadata key importance ({"importance": {"$gte": 3}}), but generate() scores drawers using importance, emotional_weight, or weight. If drawers commonly use emotional_weight/weight (and not importance), the fast path can exclude high-scoring drawers and change L1 output whenever len(docs) >= MAX_DRAWERS. Consider aligning the filter with the scoring keys (e.g., normalize metadata to a single key at write-time, or use a $or over the supported keys) so the optimization doesn’t change semantics.

Suggested change

# Fast path: only fetch drawers with importance >= 3

importance_filter = {"importance": {"$gte": 3}}

if self.wing:

where = {"$and": [{"wing": self.wing}, importance_filter]}

else:

where = importance_filter

# Fast path: fetch drawers that meet any supported scoring weight key.

# This must stay aligned with generate(), which scores drawers using

# importance, emotional_weight, or weight.

score_filter = {

"$or": [

{"importance": {"$gte": 3}},

{"emotional_weight": {"$gte": 3}},

{"weight": {"$gte": 3}},

]

}

if self.wing:

where = {"$and": [{"wing": self.wing}, score_filter]}

else:

where = score_filter

Documented — added comment explaining the design: importance is the primary signal (rarely set without emotional_weight/weight), and the fallback full-scan ensures nothing is missed.

Copilot · 2026-04-12T02:47:37Z

+                if len(batch_docs) < _BATCH or len(docs) >= self.MAX_SCAN:
+                    break
        except Exception:
-            return "## L1 — No palace found. Run: mempalace mine <dir>"
+            docs, metas = [], []

-        # Fetch all drawers in batches to avoid SQLite variable limit (~999)
-        _BATCH = 500
+        # If enough high-importance drawers, use them
+        if len(docs) >= self.MAX_DRAWERS:
+            return docs, metas


Capping the fast-path results (len(docs) >= MAX_SCAN) and then returning them when len(docs) >= MAX_DRAWERS means generate() is no longer guaranteed to select the true top-15 drawers by importance across the whole palace: col.get() order isn’t tied to importance, and the highest-importance drawers may exist beyond the first MAX_SCAN matches. This also contradicts the PR description’s claim that output is unchanged when enough high-importance drawers exist. If exactness matters, avoid the cap on the candidate set (or use a deterministic multi-threshold strategy like querying for importance>=5, then >=4, etc. until 15 are collected).

Documented — added multi-line comment explaining the MAX_SCAN tradeoff: prevents O(n) scans on 100K+ palaces, accepts approximate top-15. L3 deep search covers full corpus when precision matters.

Copilot · 2026-04-12T02:47:37Z

-    assert call_kwargs.get("where") == {"wing": "project_x"}
+    where = call_kwargs.get("where", {})
+    assert "$and" in where
+    assert {"wing": "project_x"} in where["$and"]


This test asserts that the fast-path where contains the wing filter, but it no longer asserts that the importance pre-filter is also present. Adding an assertion for { "importance": {"$gte": 3} } inside the $and list would prevent regressions where the optimization silently stops applying the importance constraint.

Suggested change

assert {"wing": "project_x"} in where["$and"]

assert {"wing": "project_x"} in where["$and"]

assert {"importance": {"$gte": 3}} in where["$and"]

Fixed — added assert {"importance": {"$gte": 3}} in where["$and"] to verify the importance pre-filter is present.

Copilot · 2026-04-12T02:47:37Z

+    """Return a mock collection whose get() returns docs/metas.
+
+    Layer1._fetch_drawers() has two phases: a fast-path (importance pre-filter)
+    and a fallback full-scan.  For small test datasets (< 500 items), each phase
+    makes exactly one col.get() call before breaking (len < _BATCH).  We
+    provide two identical responses: one consumed by the fast path and one by
+    the fallback.
+    """
    mock_col = MagicMock()
-    # First batch returns data, second batch returns empty (end of pagination)
    mock_col.get.side_effect = [
-        {"documents": docs, "metadatas": metas},
-        {"documents": [], "metadatas": []},
+        {"documents": docs, "metadatas": metas},  # fast-path batch
+        {"documents": docs, "metadatas": metas},  # fallback batch (< MAX_DRAWERS → fallback)
    ]


The helper docstring says it provides “two identical responses: one consumed by the fast path and one by the fallback”, but several tests pass >= MAX_DRAWERS docs so the fallback isn’t executed and the second response is unused. Consider rewording the docstring/comments to clarify that the second response is only consumed when the fallback path is taken.

Fixed — updated docstring to accurately state the second response is only consumed when the fast path returns fewer than MAX_DRAWERS results.

Copilot · 2026-04-12T02:47:37Z

+    MAX_SCAN = 2000  # don't scan more than this for L1 generation



New behavior around pagination/capping isn’t exercised by tests: _fetch_drawers() now has a hard stop at MAX_SCAN and can issue multiple col.get() calls when a batch returns exactly _BATCH items (e.g., 500). Consider adding a unit test that simulates >500 results and verifies the method stops at MAX_SCAN (and/or verifies the number of get() calls) so this performance guardrail doesn’t regress.

Fixed — added two pagination tests: test_layer1_pagination_stops_at_max_scan (500+100 across 2 pages) and test_layer1_pagination_caps_at_max_scan (full 500-item batches with MAX_SCAN=1200, verifies loop stops after 3 pages).

Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

JoyciAkira · 2026-04-12T17:45:51Z

Closing duplicate PR created by network timeout.

JoyciAkira · 2026-04-12T17:46:31Z

Closing duplicate PR created by network timeout.

JoyciAkira · 2026-04-12T17:47:36Z

Closing duplicate PR created by network timeout.

JoyciAkira · 2026-04-12T17:48:45Z

Closing duplicate PR created by network timeout.

Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fork-direction.md and TODO-fork-improvements.md were scattered strategic thinking that belongs in the front-door README: competitive landscape, roadmap (P0-P6), and open problems. Merges them in and deletes the standalone files. PR-status tables (README + CLAUDE.md) were stale after today's rebases of MemPalace#659, MemPalace#660, MemPalace#661, MemPalace#673, MemPalace#681 — updated to reflect current mergeable state. MemPalace#673 specifically noted as cleanly rebased against MemPalace#863. test_readme_claims.py skips MCP-tool-table and dialect-reference checks when absent — our slimmed fork README doesn't reproduce upstream's tool table structure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jphein · 2026-04-18T16:56:41Z

Friendly ping on review — small perf optimization: skip the full metadata scan in _fetch_drawers when importance >= 3 already returns ≥15 results. No conflict with current develop; happy to rebase if the recent backend refactor work (#995) lands first.

Every remaining row in "Still ahead of upstream" now carries a status so the reader can tell at a glance whether each change is being upstreamed, pending a PR, or deliberately fork-local. Dropped: - "Guard ChromaDB 1.5.x metadata-mismatch segfault" — this row was overstated. The memory file for today's debugging notes that the try-get/except-create pattern is defensive code that never reproduced a specific crash (the actual crashes traced to HNSW drift). Leaving it in "Still ahead" implied an upstream-candidate fix, which it isn't. Code stays in place as defensive, but the README no longer claims it as a fork-ahead feature. Moved to Superseded: - "Stale HNSW mtime detection + mempalace_reconnect" — upstream took a different approach in MemPalace#757. Our broader inode+mtime detection and the mempalace_reconnect MCP tool remain as fork-local convenience; they're just not "ahead of upstream" anymore. Statuses now populated: - Linked PR number for the 7 changes with active upstream PRs (MemPalace#659, MemPalace#660, MemPalace#661, MemPalace#673 with APPROVED note, MemPalace#999, MemPalace#1000, MemPalace#1005). - "PR pending" for 3 items that are good candidates but unfiled: epsilon mtime comparison, max_distance parameter, tool output mining. - "fork-only" for 2 items we keep intentionally without pitching upstream: .blob_seq_ids_migrated marker (narrow), bulk_check_mined (complementary to upstream's MemPalace#784 file-locking). Legend sentence added above the table explains the three status values. 42 README-claim tests pass.

Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…rtance drawers exist At 100K+ drawers, Layer1.generate() was scanning everything to find the top 15 by importance. Add a fast path that queries ChromaDB with `where={"importance": {"$gte": 3}}` first. Only falls back to the full scan when fewer than MAX_DRAWERS (15) results come back, ensuring correctness on small or sparse palaces. - Add `Layer1._fetch_drawers(col)` implementing the two-phase strategy - Add `Layer1.MAX_SCAN = 2000` to cap fallback scan depth - Update `test_layer1_*` mocks to cover both fast-path and fallback calls - Update `test_layer1_with_wing_filter` assertion to match new `$and` filter shape - Reframe `test_layer1_batch_exception_breaks` to verify fallback recovery Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jphein requested review from bensig, igorls and milla-jovovich as code owners April 12, 2026 02:43

Copilot AI review requested due to automatic review settings April 12, 2026 02:43

Copilot started reviewing on behalf of jphein April 12, 2026 02:44 View session

Copilot AI reviewed Apr 12, 2026

View reviewed changes

jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 12, 2026

fix: document pre-filter tradeoffs, improve test assertions and coverage

4975343

Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 12, 2026

fix: document pre-filter tradeoffs, improve test assertions and coverage

cc9eec4

Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jphein force-pushed the perf/l1-importance-prefilter branch 2 times, most recently from cc9eec4 to e89c0b6 Compare April 13, 2026 14:22

jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 13, 2026

fix: document pre-filter tradeoffs, improve test assertions and coverage

e89c0b6

Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

igorls added area/search Search and retrieval performance Performance improvements labels Apr 14, 2026

jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 16, 2026

fix: document pre-filter tradeoffs, improve test assertions and coverage

4604a48

Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jphein force-pushed the perf/l1-importance-prefilter branch from 70a58d9 to 4604a48 Compare April 16, 2026 16:01

jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 24, 2026

fix: document pre-filter tradeoffs, improve test assertions and coverage

8f0b04a

Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jphein force-pushed the perf/l1-importance-prefilter branch from 4604a48 to 8f0b04a Compare April 24, 2026 21:25

jphein added a commit to techempower-org/mempalace that referenced this pull request May 8, 2026

fix: document pre-filter tradeoffs, improve test assertions and coverage

9352c70

Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jphein force-pushed the perf/l1-importance-prefilter branch from 8f0b04a to 9352c70 Compare May 8, 2026 21:38

jphein and others added 2 commits May 10, 2026 18:11

fix: document pre-filter tradeoffs, improve test assertions and coverage

0752573

Addresses Copilot review feedback on MemPalace#660. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jphein force-pushed the perf/l1-importance-prefilter branch from 9352c70 to 0752573 Compare May 11, 2026 01:11

-        # Fast path: only fetch drawers with importance >= 3
-        importance_filter = {"importance": {"$gte": 3}}
-        if self.wing:
-            where = {"$and": [{"wing": self.wing}, importance_filter]}
-        else:
-            where = importance_filter
+        # Fast path: fetch drawers that meet any supported scoring weight key.
+        # This must stay aligned with generate(), which scores drawers using
+        # importance, emotional_weight, or weight.
+        score_filter = {
+            "$or": [
+                {"importance": {"$gte": 3}},
+                {"emotional_weight": {"$gte": 3}},
+                {"weight": {"$gte": 3}},
+            ]
+        }
+        if self.wing:
+            where = {"$and": [{"wing": self.wing}, score_filter]}
+        else:
+            where = score_filter

	assert {"wing": "project_x"} in where["$and"]
	assert {"wing": "project_x"} in where["$and"]
	assert {"importance": {"$gte": 3}} in where["$and"]

		MAX_SCAN = 2000 # don't scan more than this for L1 generation

Conversation

jphein commented Apr 12, 2026

Problem

Solution

Changes

Testing

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

jphein Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

jphein Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

jphein Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

jphein Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

jphein Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

JoyciAkira commented Apr 12, 2026

Uh oh!

JoyciAkira commented Apr 12, 2026

Uh oh!

JoyciAkira commented Apr 12, 2026

Uh oh!

JoyciAkira commented Apr 12, 2026

Uh oh!

jphein commented Apr 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants