Commit 3cb03f3

docs(searcher): record VecRecall closet-boost ablation finding
A/B ablation 2026-04-27 against the 151K canonical palace (12-probe set mixing recent fork-side decisions with mined-file content).

Closet boost fires on ~20% of result rows, concentrated in queries with answers in mined files; sparse on chat-transcript queries. When the boost fired, it re-ordered chunks within a single source file rather than displacing right answers with wrong ones.

VecRecall's critique (MemPalace#1129: "organization-layer involvement in retrieval reduces R@5") did not reproduce on this corpus. The hybrid degrades to effectively pure-vector for transcript queries and re-ranks within-file chunks for mined-file queries, neither of which matches the failure mode VecRecall is fixing.

Captured as a comment next to the constants so future-us doesn't have to re-run the experiment to learn the boost is mostly inert on chat-heavy corpora.

Refs: scratch experiment was /tmp/closet-boost-ab.py (not committed, re-creatable from this comment + the hoist in f558d3c).
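The scratch script itself was not committed, so the following is only a rough sketch of the kind of per-probe comparison the message describes: score one probe under both arms (boost off, boost on), then check whether the boost merely re-ordered the top k or pulled a different chunk into it. The function names, the list-of-chunk-ids input shape, and the `k=5` default are assumptions for illustration, not the contents of /tmp/closet-boost-ab.py.

```python
def recall_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of relevant ids that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(ranked_ids[:k]) & relevant) / len(relevant)


def compare_arms(baseline, boosted, relevant_ids, k=5):
    """Compare two rankings of the same probe.

    baseline / boosted: lists of chunk ids, best first (hypothetical shape).
    """
    base_top, boost_top = baseline[:k], boosted[:k]
    same_members = set(base_top) == set(boost_top)
    return {
        "recall_baseline": recall_at_k(baseline, relevant_ids, k),
        "recall_boosted": recall_at_k(boosted, relevant_ids, k),
        # The boost shuffled the top k without swapping any member out.
        "reorder_only": same_members and base_top != boost_top,
        # The boost pulled at least one different chunk into the top k.
        "displaced": not same_members,
    }
```

Aggregating `reorder_only` and `displaced` over all probes would separate the benign case reported here (within-top-k re-ordering) from the failure mode in the VecRecall critique (a recall drop caused by displacement).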
1 parent f558d3c commit 3cb03f3

1 file changed

Lines changed: 11 additions & 0 deletions

mempalace/searcher.py

@@ -164,6 +164,17 @@ def _hybrid_rank(
 # — "which closet matched best for this source" — is more reliable than
 # absolute distance on narrative content, where closet distances cluster
 # in 1.2–1.5 regardless of match quality.
+#
+# Empirical note (A/B ablation 2026-04-27 on the 151K canonical palace,
+# 12-probe set covering recent fork-side work + transcript content):
+# boost fires on ~20% of result rows, concentrated in queries whose
+# answer lives in mined files; closets are sparse on chat-transcript
+# queries (most fork-side decisions). When the boost did fire, it
+# re-ordered chunks within a single source file rather than displacing
+# right answers with wrong ones — i.e., VecRecall's critique
+# (https://github.com/MemPalace/mempalace/discussions/1129, "org-layer
+# in retrieval path drops R@5") didn't reproduce here. Kept as a
+# rare-but-cheap signal; ablation script lived in /tmp, not committed.
 CLOSET_RANK_BOOSTS = [0.40, 0.25, 0.15, 0.08, 0.04]
 CLOSET_DISTANCE_CAP = 1.5 # cosine dist > 1.5 = too weak to use as signal
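To make the two constants concrete, here is a minimal sketch of how a rank-based boost with a distance cap could be looked up. Only the constant values and the "> 1.5 is too weak" rule come from the diff; the helper name `boost_for`, the zero-based rank, and the idea that the caller decides how to apply the returned value are assumptions, and the real `_hybrid_rank` may work differently.

```python
CLOSET_RANK_BOOSTS = [0.40, 0.25, 0.15, 0.08, 0.04]
CLOSET_DISTANCE_CAP = 1.5  # cosine dist > 1.5 = too weak to use as signal


def boost_for(rank, distance):
    """Return the boost for a closet hit at a zero-based rank (hypothetical helper).

    Hits beyond the cap, or ranked past the end of the boost table, get no boost.
    """
    if distance > CLOSET_DISTANCE_CAP:
        return 0.0
    if rank < 0 or rank >= len(CLOSET_RANK_BOOSTS):
        return 0.0
    return CLOSET_RANK_BOOSTS[rank]
```

Because the boost depends on rank rather than raw distance, two closets at distances 1.21 and 1.49 still receive clearly different boosts, which is the point the surrounding comment makes about distances clustering in 1.2–1.5.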
