Skip to content

perf: L1 importance pre-filter — skip full scan when enough high-importance drawers exist#660

Open
jphein wants to merge 2 commits into
MemPalace:developfrom
techempower-org:perf/l1-importance-prefilter
Open

perf: L1 importance pre-filter — skip full scan when enough high-importance drawers exist#660
jphein wants to merge 2 commits into
MemPalace:developfrom
techempower-org:perf/l1-importance-prefilter

Conversation

@jphein

@jphein jphein commented Apr 12, 2026

Copy link
Copy Markdown
Collaborator

Problem

Layer1.generate() (used in MemoryStack.wake_up()) scans all drawers in the palace to find the top 15 by importance. On a palace with 100K+ drawers this is slow — it paginates through everything in 500-document batches before picking the top handful.

Solution

Add a two-phase fetch strategy in a new Layer1._fetch_drawers(col) method:

  1. Fast path: query ChromaDB with where={"importance": {"$gte": 3}} (combined with wing filter if set). If >= MAX_DRAWERS (15) results come back, use them directly — no full scan needed.
  2. Fallback: if fewer than 15 high-importance drawers exist (small or sparse palace), fall back to the existing full scan, capped at MAX_SCAN = 2000 drawers.

This is a pure performance improvement — the output of generate() is unchanged for any palace with at least 15 drawers at importance >= 3.

Changes

  • mempalace/layers.pyLayer1 class:

    • Add MAX_SCAN = 2000 class constant
    • Add _fetch_drawers(col) -> tuple method with fast-path + fallback logic
    • generate() delegates to _fetch_drawers() instead of inlining the batch loop
  • tests/test_layers.py:

    • Update _mock_chromadb_for_layer() to provide responses for both fast-path and fallback calls
    • Update test_layer1_with_wing_filter assertion — fast-path where is now {"$and": [{"wing": ...}, {"importance": {"$gte": 3}}]}
    • Update test_layer1_batch_exception_breaks — exception on fast-path triggers fallback recovery (tests that path)

Testing

All 598 tests pass (python -m pytest tests/ -x -q). The two-phase logic is exercised by both the existing tests (updated mocks) and by the test_layer1_batch_exception_breaks test which verifies graceful fallback when the fast-path raises.

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 12, 2026 02:43

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves Layer1 wake-up performance by introducing a two-phase drawer fetch strategy (importance pre-filter fast-path, with full-scan fallback) and adding a hard cap on scanned drawers.

Changes:

  • Add Layer1._fetch_drawers() to try importance >= 3 first, falling back to an unfiltered scan when needed
  • Introduce MAX_SCAN = 2000 to cap how many drawers are iterated during L1 generation
  • Update Layer1 tests/mocks to accommodate fast-path + fallback behavior and updated where shapes

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
mempalace/layers.py Adds two-phase fetch (_fetch_drawers) and scan cap (MAX_SCAN), and refactors generate() to use it
tests/test_layers.py Updates Chroma mocks and assertions for the new fast-path/fallback behavior and where composition

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread mempalace/layers.py Outdated
Comment on lines +99 to +104
# Fast path: only fetch drawers with importance >= 3
importance_filter = {"importance": {"$gte": 3}}
if self.wing:
where = {"$and": [{"wing": self.wing}, importance_filter]}
else:
where = importance_filter

Copilot AI Apr 12, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fast-path filter only considers metadata key importance ({"importance": {"$gte": 3}}), but generate() scores drawers using importance, emotional_weight, or weight. If drawers commonly use emotional_weight/weight (and not importance), the fast path can exclude high-scoring drawers and change L1 output whenever len(docs) >= MAX_DRAWERS. Consider aligning the filter with the scoring keys (e.g., normalize metadata to a single key at write-time, or use a $or over the supported keys) so the optimization doesn’t change semantics.

Suggested change
# Fast path: only fetch drawers with importance >= 3
importance_filter = {"importance": {"$gte": 3}}
if self.wing:
where = {"$and": [{"wing": self.wing}, importance_filter]}
else:
where = importance_filter
# Fast path: fetch drawers that meet any supported scoring weight key.
# This must stay aligned with generate(), which scores drawers using
# importance, emotional_weight, or weight.
score_filter = {
"$or": [
{"importance": {"$gte": 3}},
{"emotional_weight": {"$gte": 3}},
{"weight": {"$gte": 3}},
]
}
if self.wing:
where = {"$and": [{"wing": self.wing}, score_filter]}
else:
where = score_filter

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented — added comment explaining the design: importance is the primary signal (rarely set without emotional_weight/weight), and the fallback full-scan ensures nothing is missed.

Comment thread mempalace/layers.py
Comment on lines +124 to +131
if len(batch_docs) < _BATCH or len(docs) >= self.MAX_SCAN:
break
except Exception:
return "## L1 — No palace found. Run: mempalace mine <dir>"
docs, metas = [], []

# Fetch all drawers in batches to avoid SQLite variable limit (~999)
_BATCH = 500
# If enough high-importance drawers, use them
if len(docs) >= self.MAX_DRAWERS:
return docs, metas

Copilot AI Apr 12, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capping the fast-path results (len(docs) >= MAX_SCAN) and then returning them when len(docs) >= MAX_DRAWERS means generate() is no longer guaranteed to select the true top-15 drawers by importance across the whole palace: col.get() order isn’t tied to importance, and the highest-importance drawers may exist beyond the first MAX_SCAN matches. This also contradicts the PR description’s claim that output is unchanged when enough high-importance drawers exist. If exactness matters, avoid the cap on the candidate set (or use a deterministic multi-threshold strategy like querying for importance>=5, then >=4, etc. until 15 are collected).

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented — added multi-line comment explaining the MAX_SCAN tradeoff: prevents O(n) scans on 100K+ palaces, accepts approximate top-15. L3 deep search covers full corpus when precision matters.

Comment thread tests/test_layers.py
assert call_kwargs.get("where") == {"wing": "project_x"}
where = call_kwargs.get("where", {})
assert "$and" in where
assert {"wing": "project_x"} in where["$and"]

Copilot AI Apr 12, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test asserts that the fast-path where contains the wing filter, but it no longer asserts that the importance pre-filter is also present. Adding an assertion for { "importance": {"$gte": 3} } inside the $and list would prevent regressions where the optimization silently stops applying the importance constraint.

Suggested change
assert {"wing": "project_x"} in where["$and"]
assert {"wing": "project_x"} in where["$and"]
assert {"importance": {"$gte": 3}} in where["$and"]

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added assert {"importance": {"$gte": 3}} in where["$and"] to verify the importance pre-filter is present.

Comment thread tests/test_layers.py
Comment on lines +74 to 86
"""Return a mock collection whose get() returns docs/metas.

Layer1._fetch_drawers() has two phases: a fast-path (importance pre-filter)
and a fallback full-scan. For small test datasets (< 500 items), each phase
makes exactly one col.get() call before breaking (len < _BATCH). We
provide two identical responses: one consumed by the fast path and one by
the fallback.
"""
mock_col = MagicMock()
# First batch returns data, second batch returns empty (end of pagination)
mock_col.get.side_effect = [
{"documents": docs, "metadatas": metas},
{"documents": [], "metadatas": []},
{"documents": docs, "metadatas": metas}, # fast-path batch
{"documents": docs, "metadatas": metas}, # fallback batch (< MAX_DRAWERS → fallback)
]

Copilot AI Apr 12, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The helper docstring says it provides “two identical responses: one consumed by the fast path and one by the fallback”, but several tests pass >= MAX_DRAWERS docs so the fallback isn’t executed and the second response is unused. Consider rewording the docstring/comments to clarify that the second response is only consumed when the fallback path is taken.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — updated docstring to accurately state the second response is only consumed when the fast path returns fewer than MAX_DRAWERS results.

Comment thread mempalace/layers.py Outdated
Comment on lines 84 to 85
MAX_SCAN = 2000 # don't scan more than this for L1 generation

Copilot AI Apr 12, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior around pagination/capping isn’t exercised by tests: _fetch_drawers() now has a hard stop at MAX_SCAN and can issue multiple col.get() calls when a batch returns exactly _BATCH items (e.g., 500). Consider adding a unit test that simulates >500 results and verifies the method stops at MAX_SCAN (and/or verifies the number of get() calls) so this performance guardrail doesn’t regress.

Copilot uses AI. Check for mistakes.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — added two pagination tests: test_layer1_pagination_stops_at_max_scan (500+100 across 2 pages) and test_layer1_pagination_caps_at_max_scan (full 500-item batches with MAX_SCAN=1200, verifies loop stops after 3 pages).

jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 12, 2026
Addresses Copilot review feedback on MemPalace#660.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@JoyciAkira

Copy link
Copy Markdown

Closing duplicate PR created by network timeout.

3 similar comments
@JoyciAkira

Copy link
Copy Markdown

Closing duplicate PR created by network timeout.

@JoyciAkira

Copy link
Copy Markdown

Closing duplicate PR created by network timeout.

@JoyciAkira

Copy link
Copy Markdown

Closing duplicate PR created by network timeout.

jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 12, 2026
Addresses Copilot review feedback on MemPalace#660.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jphein jphein force-pushed the perf/l1-importance-prefilter branch 2 times, most recently from cc9eec4 to e89c0b6 Compare April 13, 2026 14:22
jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 13, 2026
Addresses Copilot review feedback on MemPalace#660.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@igorls igorls added area/search Search and retrieval performance Performance improvements labels Apr 14, 2026
jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 16, 2026
Addresses Copilot review feedback on MemPalace#660.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jphein jphein force-pushed the perf/l1-importance-prefilter branch from 70a58d9 to 4604a48 Compare April 16, 2026 16:01
jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 16, 2026
fork-direction.md and TODO-fork-improvements.md were scattered strategic
thinking that belongs in the front-door README: competitive landscape,
roadmap (P0-P6), and open problems. Merges them in and deletes the
standalone files.

PR-status tables (README + CLAUDE.md) were stale after today's rebases
of MemPalace#659, MemPalace#660, MemPalace#661, MemPalace#673, MemPalace#681 — updated to reflect current mergeable
state. MemPalace#673 specifically noted as cleanly rebased against MemPalace#863.

test_readme_claims.py skips MCP-tool-table and dialect-reference checks
when absent — our slimmed fork README doesn't reproduce upstream's tool
table structure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jphein

jphein commented Apr 18, 2026

Copy link
Copy Markdown
Collaborator Author

Friendly ping on review — small perf optimization: skip the full metadata scan in _fetch_drawers when importance >= 3 already returns ≥15 results. No conflict with current develop; happy to rebase if the recent backend refactor work (#995) lands first.

jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 18, 2026
Every remaining row in "Still ahead of upstream" now carries a status
so the reader can tell at a glance whether each change is being
upstreamed, pending a PR, or deliberately fork-local.

Dropped:
- "Guard ChromaDB 1.5.x metadata-mismatch segfault" — this row was
  overstated. The memory file for today's debugging notes that the
  try-get/except-create pattern is defensive code that never
  reproduced a specific crash (the actual crashes traced to HNSW
  drift). Leaving it in "Still ahead" implied an upstream-candidate
  fix, which it isn't. Code stays in place as defensive, but the
  README no longer claims it as a fork-ahead feature.

Moved to Superseded:
- "Stale HNSW mtime detection + mempalace_reconnect" — upstream took
  a different approach in MemPalace#757. Our broader inode+mtime detection
  and the mempalace_reconnect MCP tool remain as fork-local
  convenience; they're just not "ahead of upstream" anymore.

Statuses now populated:
- Linked PR number for the 7 changes with active upstream PRs
  (MemPalace#659, MemPalace#660, MemPalace#661, MemPalace#673 with APPROVED note, MemPalace#999, MemPalace#1000, MemPalace#1005).
- "PR pending" for 3 items that are good candidates but unfiled:
  epsilon mtime comparison, max_distance parameter, tool output
  mining.
- "fork-only" for 2 items we keep intentionally without pitching
  upstream: .blob_seq_ids_migrated marker (narrow), bulk_check_mined
  (complementary to upstream's MemPalace#784 file-locking).

Legend sentence added above the table explains the three status
values. 42 README-claim tests pass.
jphein added a commit to techempower-org/mempalace that referenced this pull request Apr 24, 2026
Addresses Copilot review feedback on MemPalace#660.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jphein jphein force-pushed the perf/l1-importance-prefilter branch from 4604a48 to 8f0b04a Compare April 24, 2026 21:25
jphein added a commit to techempower-org/mempalace that referenced this pull request May 8, 2026
Addresses Copilot review feedback on MemPalace#660.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jphein jphein force-pushed the perf/l1-importance-prefilter branch from 8f0b04a to 9352c70 Compare May 8, 2026 21:38
jphein and others added 2 commits May 10, 2026 18:11
…rtance drawers exist

At 100K+ drawers, Layer1.generate() was scanning everything to find the
top 15 by importance.  Add a fast path that queries ChromaDB with
`where={"importance": {"$gte": 3}}` first.  Only falls back to the
full scan when fewer than MAX_DRAWERS (15) results come back, ensuring
correctness on small or sparse palaces.

- Add `Layer1._fetch_drawers(col)` implementing the two-phase strategy
- Add `Layer1.MAX_SCAN = 2000` to cap fallback scan depth
- Update `test_layer1_*` mocks to cover both fast-path and fallback calls
- Update `test_layer1_with_wing_filter` assertion to match new `$and` filter shape
- Reframe `test_layer1_batch_exception_breaks` to verify fallback recovery

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Addresses Copilot review feedback on MemPalace#660.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jphein jphein force-pushed the perf/l1-importance-prefilter branch from 9352c70 to 0752573 Compare May 11, 2026 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/search Search and retrieval performance Performance improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants