docs(searcher): record VecRecall closet-boost ablation finding

jphein · jphein · commit 3cb03f302b1e · 2026-04-27T08:43:26.000-07:00
A/B ablation 2026-04-27 against the 151K canonical palace (12-probe set mixing recent fork-side decisions with mined-file content).

Closet boost fires on ~20% of result rows, concentrated in queries with answers in mined files; sparse on chat-transcript queries. When the boost fired, it re-ordered chunks within a single source file rather than displacing right answers with wrong ones.

VecRecall's critique (MemPalace#1129: "organization-layer involvement in retrieval reduces R@5") did not reproduce on this corpus. The hybrid degrades to effectively pure-vector for transcript queries and re-ranks within-file chunks for mined-file queries, neither of which matches the failure mode VecRecall is fixing.

Captured as a comment next to the constants so future-us doesn't have to re-run the experiment to learn the boost is mostly inert on chat-heavy corpora.

Refs: scratch experiment was /tmp/closet-boost-ab.py (not committed; re-creatable from this comment + the hoist in f558d3c).
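
For anyone re-creating the ablation, the two measurements described above (how often the boost fires, and whether it only re-orders chunks within a file) could be computed roughly as below. This is a sketch under assumptions, not the uncommitted /tmp script: the `Row` type, its `boosted` flag, and both function names are hypothetical, and the ranked lists are assumed to come from running the same probe with the closet boost off and on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Hypothetical result row; not the real searcher output type."""
    source_file: str
    chunk_id: str
    boosted: bool = False  # assumed flag: closet boost changed this row's score


def fire_rate(runs):
    """Fraction of result rows, across all probe runs, where the boost fired."""
    rows = [row for run in runs for row in run]
    return sum(row.boosted for row in rows) / len(rows) if rows else 0.0


def classify(baseline, boosted, k=5):
    """Compare the top-k of a boost-off run against a boost-on run."""
    base_top, boost_top = baseline[:k], boosted[:k]
    if [r.chunk_id for r in base_top] == [r.chunk_id for r in boost_top]:
        return "unchanged"
    # Same file at every rank but different chunks: the boost only
    # shuffled chunks that belong to the same source file.
    if [r.source_file for r in base_top] == [r.source_file for r in boost_top]:
        return "within-file reorder"
    return "cross-file change"


# Toy data, invented for illustration only.
off = [Row("a.md", "a1"), Row("a.md", "a2"), Row("b.md", "b1")]
on = [Row("a.md", "a2", True), Row("a.md", "a1"), Row("b.md", "b1")]
print(classify(off, on), fire_rate([on]))
```

Under this framing, the VecRecall failure mode would show up as "cross-file change" rows where the displaced chunk was the right answer; the finding recorded here is that the observed changes were of the "within-file reorder" kind.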
diff --git a/mempalace/searcher.py b/mempalace/searcher.py
@@ -164,6 +164,17 @@ def _hybrid_rank(
 # — "which closet matched best for this source" — is more reliable than
 # absolute distance on narrative content, where closet distances cluster
 # in 1.2–1.5 regardless of match quality.
+#
+# Empirical note (A/B ablation 2026-04-27 on the 151K canonical palace,
+# 12-probe set covering recent fork-side work + transcript content):
+# boost fires on ~20% of result rows, concentrated in queries whose
+# answer lives in mined files; closets are sparse on chat-transcript
+# queries (most fork-side decisions). When the boost did fire, it
+# re-ordered chunks within a single source file rather than displacing
+# right answers with wrong ones — i.e., VecRecall's critique
+# (https://github.com/MemPalace/mempalace/discussions/1129, "org-layer
+# in retrieval path drops R@5") didn't reproduce here. Kept as a
+# rare-but-cheap signal; ablation script lived in /tmp, not committed.
 CLOSET_RANK_BOOSTS = [0.40, 0.25, 0.15, 0.08, 0.04]
 CLOSET_DISTANCE_CAP = 1.5  # cosine dist > 1.5 = too weak to use as signal