
v0.35.6.0 feat(search): floor-ratio gate for metadata boost stages (closes #1091)#1129

Merged
garrytan merged 2 commits into master from garrytan/floor-ratio-gate
May 17, 2026
Conversation

@garrytan

Owner

Summary

  • Opt-in score-based gate on the three metadata-axis boost stages (backlink, salience, recency) inside runPostFusionStages. Default off; bit-for-bit prior behavior preserved.
  • Built on @jayzalowitz's contribution in feat(search): opt-in floor-ratio gate for post-fusion boost stages #1091 (SkyTwin twin-memory layer). Integration refactored on top per a two-stage review pass (/plan-eng-review + /codex outside voice).
  • Three correctness fixes landed alongside the feature: cache contamination prevention (knobsHash 2→3), NaN scores skip the boost instead of bypassing the gate, negative top scores leave the gate disabled.

What this lets you do

# Per-call (single search):
gbrain query "..." --floor-ratio 0.85

# Operator default:
gbrain config set search.floor_ratio 0.85

Stops weak-overlap candidates with high metadata signal from leapfrogging the legitimate primary hit. Most useful on dense-embedder corpora (text-embedding-3-large, Voyage 3+, zembed-1).
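
To make the failure mode concrete, here is a minimal sketch of a weak candidate leapfrogging the primary hit, and of the gate stopping it. The `Candidate` shape, the single additive `metaBoost` field, and the scores are all invented for illustration; gbrain's real result type and boost arithmetic are not shown in this PR.

```typescript
// Hypothetical candidate shape (an assumption, not gbrain's real type).
interface Candidate {
  id: string;
  score: number;     // relevance score before metadata boosts
  metaBoost: number; // combined metadata boost (illustrative additive model)
}

// Boost only candidates at or above floorRatio * topScore.
// With floorRatio undefined the threshold is -Infinity, so every row is boosted.
function rankWithGate(cands: Candidate[], floorRatio?: number): string[] {
  const top = Math.max(...cands.map((c) => c.score));
  const threshold = floorRatio === undefined ? -Infinity : floorRatio * top;
  return cands
    .map((c) => ({
      id: c.id,
      final: c.score >= threshold ? c.score + c.metaBoost : c.score,
    }))
    .sort((a, b) => b.final - a.final)
    .map((c) => c.id);
}

const candidates: Candidate[] = [
  { id: "primary-hit", score: 0.8, metaBoost: 0.02 },
  { id: "weak-but-popular", score: 0.6, metaBoost: 0.3 },
];

// No gate: the weak row ends at 0.90 and overtakes the primary hit at 0.82.
const ungated = rankWithGate(candidates);
// Gate at 0.85: threshold is 0.68, so the weak row (0.60) gets no boost.
const gated = rankWithGate(candidates, 0.85);
```

In this toy data the ungated ranking puts `weak-but-popular` first, while the gated ranking keeps `primary-hit` on top.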

What changed from #1091

Two architectural decisions reversed during codex outside-voice review:

| | PR #1091 shape | Integrated shape |
|---|---|---|
| Threshold composition | Per-stage recompute (stage order is API) | Single up-front at runPostFusionStages entry (order-independent) |
| Public surface | PostFusionOpts.floorRatio only | SearchOpts.floorRatio + search.floor_ratio config + MODE_BUNDLES.floor_ratio |
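
The order-independence point in the first row can be illustrated with a small sketch. The `Row` and `Stage` types, the multiplicative boost, and both runner functions are hypothetical stand-ins, not the real runPostFusionStages; the sketch only shows why recomputing the threshold per stage makes stage order observable.

```typescript
type Row = { id: string; score: number };
type Stage = (rows: Row[], threshold: number) => Row[];

// Illustrative stage: multiplies the score of one row if it passes the gate.
const makeStage = (id: string, factor: number): Stage => (rows, threshold) =>
  rows.map((r) =>
    r.id === id && r.score >= threshold ? { ...r, score: r.score * factor } : r,
  );

const topOf = (rows: Row[]): number => Math.max(...rows.map((r) => r.score));

// Single baseline: threshold computed once from the pre-boost scores.
function runSingleBaseline(rows: Row[], stages: Stage[], ratio: number): Row[] {
  const threshold = ratio * topOf(rows);
  return stages.reduce((acc, stage) => stage(acc, threshold), rows);
}

// Per-stage recompute: later stages see a threshold inflated by earlier boosts.
function runPerStage(rows: Row[], stages: Stage[], ratio: number): Row[] {
  return stages.reduce((acc, stage) => stage(acc, ratio * topOf(acc)), rows);
}

const scoreOf = (rows: Row[], id: string): number =>
  rows.find((r) => r.id === id)!.score;

const input: Row[] = [
  { id: "a", score: 1.0 },
  { id: "b", score: 0.86 },
];
const boostA = makeStage("a", 1.2);
const boostB = makeStage("b", 1.1);

// Single baseline: "b" is boosted in either order (threshold stays 0.85).
const singleAB = scoreOf(runSingleBaseline(input, [boostA, boostB], 0.85), "b");
const singleBA = scoreOf(runSingleBaseline(input, [boostB, boostA], 0.85), "b");

// Per-stage: running boostA first raises the top to 1.2, so "b" misses the new floor.
const perStageAB = scoreOf(runPerStage(input, [boostA, boostB], 0.85), "b");
const perStageBA = scoreOf(runPerStage(input, [boostB, boostA], 0.85), "b");
```

Under these assumptions `singleAB` equals `singleBA`, while `perStageAB` and `perStageBA` differ, which is the "stage order is API" problem the integrated shape avoids.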

Three P1 correctness bugs caught and fixed:

  • Cache contamination via knobsHash(): KNOBS_HASH_VERSION bumped 2→3. Without this, a no-floor cache write would be served to a floor-enabled lookup. Same bug class that the CDX-4 v0.32.3 hotfix closed for the other search-lite knobs.
  • NaN scores bypass the gate: NaN < threshold is false in JS, so NaN-scored rows would have skipped the gate and received boosts. Realistic on embedding-dim drift across reindexes. Fixed: NaN scores skip the boost entirely.
  • Negative top scores break "single result trivially eligible": top = -0.5 → threshold = -0.425 → top itself fails its own floor. Fixed: computeFloorThreshold returns -Infinity (no gate) on no-positive-signal inputs.
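
The NaN and negative-top fixes above can be sketched as follows. This is an illustration under stated assumptions, not the shipped code: the signature of `computeFloorThreshold` is guessed, `isBoostEligible` is a hypothetical helper, and treating out-of-range ratios and a top score of exactly 0 as "gate off" is an assumption the PR text does not confirm.

```typescript
// Sketch only: the real computeFloorThreshold signature is not shown in this PR.
function computeFloorThreshold(scores: number[], floorRatio?: number): number {
  // Assumption: an unset or out-of-range ratio disables the gate.
  if (
    floorRatio === undefined ||
    !Number.isFinite(floorRatio) ||
    floorRatio <= 0 ||
    floorRatio > 1
  ) {
    return -Infinity;
  }
  // Ignore NaN / infinite scores when picking the top.
  const finite = scores.filter((s) => Number.isFinite(s));
  if (finite.length === 0) return -Infinity;
  const top = Math.max(...finite);
  // No positive signal: return -Infinity so the gate is effectively off.
  if (top <= 0) return -Infinity;
  return floorRatio * top;
}

// Hypothetical helper: a row is boosted only if its score is a real number
// at or above the threshold. NaN is rejected explicitly.
function isBoostEligible(score: number, threshold: number): boolean {
  if (Number.isNaN(score)) return false;
  return score >= threshold;
}

// The naive "skip when score < threshold" check lets NaN through,
// because every comparison with NaN is false.
const naiveSkips = NaN < 0.85;

const negativeTop = computeFloorThreshold([-0.5], 0.85);
const normalTop = computeFloorThreshold([1.0, 0.5, NaN], 0.85);
```

With these definitions a lone result scored -0.5 stays eligible because the threshold is -Infinity rather than -0.425, and a NaN-scored row is never boosted regardless of the threshold.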

No GBRAIN_SEARCH_FLOOR_RATIO env var — resolveSearchMode() is pure by design; a hidden env knob would make gbrain search modes lie about the resolved state. Use the search.floor_ratio config key instead.
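
A minimal sketch of what "pure" resolution means here: the resolved value depends only on the arguments passed in, never on process state such as an environment variable. Everything in the block is hypothetical, including the `FloorRatioInputs` shape, the function names, and the per-call over config over bundle precedence; the real resolveSearchMode() may differ.

```typescript
// Hypothetical inputs; names and precedence are assumptions for illustration.
interface FloorRatioInputs {
  perCall?: number;       // e.g. from a --floor-ratio flag
  configValue?: string;   // raw string stored under search.floor_ratio
  bundleDefault?: number; // mode-bundle default (undefined for all modes today)
}

// Parse a raw config string; anything non-numeric resolves to undefined.
function parseRatio(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const n = Number(raw);
  return Number.isFinite(n) ? n : undefined;
}

// Pure: same inputs always give the same output, and no environment is read.
function resolveFloorRatioSketch(inputs: FloorRatioInputs): number | undefined {
  return inputs.perCall ?? parseRatio(inputs.configValue) ?? inputs.bundleDefault;
}

const nothingSet = resolveFloorRatioSketch({});
const fromConfig = resolveFloorRatioSketch({ configValue: "0.85" });
const fromFlag = resolveFloorRatioSketch({ perCall: 0.9, configValue: "0.85" });
```

Because every input is explicit, a command that reports the resolved mode can show exactly what a search will use; an environment variable read deep inside the search path would break that guarantee.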

Scope

  • Gates the three metadata-axis boost stages (backlink, salience, recency). Exact-match boost runs independently as a lexical-relevance signal; explicitly NOT gated by design.
  • Single global threshold across all sources. Federated-read users (v0.34.1.0+) sharing a query across multiple sources get one floor. Per-source threshold deferred to v0.36 if real federated-read usage surfaces the suppression.
  • MODE_BUNDLES[*].floor_ratio stays undefined for all three modes pending gbrain-side ablation (filed as TODO).

Test plan

  • bun run verify — clean (typecheck + privacy + jsonb + progress + wasm + test-isolation guards)
  • bun run test — 6753 pass / 0 fail (full parallel unit suite)
  • bun test test/search.test.ts test/search-mode.test.ts test/search/knobs-hash-reranker.test.ts — all targeted tests pass
  • 30+ new tests covering computeFloorThreshold edge cases (NaN, negative top, out-of-range), all three boost-function gate behaviors, runPostFusionStages single-baseline composition, knobsHash cache-contamination prevention, config-key parsing
  • T6 IRON RULE regression test: applyRecencyBoost floor-gate parity with backlink + salience (the modified function shipped in feat(search): opt-in floor-ratio gate for post-fusion boost stages #1091 with zero new-param test coverage)

Review trail

Plan + 9-decision (D1-D9) review trail at ~/.claude/plans/swift-sniffing-nygaard.md. CHANGELOG section has the full Itemized changes breakdown plus a "Mid-deploy cache note" about the knobsHash bump's temporary cache-row doubling.
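
For readers wondering why a version bump would double cache rows for a while, here is a rough sketch of a versioned knobs key. The `Knobs` shape, the `knobsKey` function, and the plain-string key standing in for a hash are all assumptions; the real knobsHash() inputs and format are not shown in this PR.

```typescript
// Hypothetical knob set (an assumption, not the real knobsHash inputs).
interface Knobs {
  mode: string;
  floorRatio?: number;
}

const KNOBS_HASH_VERSION = 3;

// Build a cache-key fragment from the version plus every knob that can
// change results. A readable string stands in for the real hash.
function knobsKey(knobs: Knobs, version: number = KNOBS_HASH_VERSION): string {
  const floor = knobs.floorRatio === undefined ? "none" : String(knobs.floorRatio);
  return `v${version}|mode=${knobs.mode}|floor=${floor}`;
}

const noFloor = knobsKey({ mode: "example-mode" });
const withFloor = knobsKey({ mode: "example-mode", floorRatio: 0.85 });
const oldVersion = knobsKey({ mode: "example-mode" }, 2);
```

In this sketch `noFloor` and `withFloor` differ, so a no-floor write cannot answer a floor-enabled lookup. `oldVersion` also differs from `noFloor`, which suggests why rows written under version 2 would sit alongside fresh version 3 rows until they age out.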

Credits

Empirical motivation, failure-mode framing, dense-embedder targeting, and the 0.85 starting value all from @jayzalowitz's labeled-retrieval ablation in PR #1091 and skytwin/pull/272. Integration shape is gbrain-side.

This PR supersedes #1091. Closing #1091 in favor of this branch with @jayzalowitz attribution preserved via Co-Authored-By on the commit.

Closes #1091

🤖 Generated with Claude Code

Opt-in score-based gate on the three metadata-axis boost stages (backlink,
salience, recency) inside `runPostFusionStages`. When `SearchOpts.floorRatio`
or `search.floor_ratio` config is set, each stage skips results whose
post-cosine-rescore score is below `floorRatio * topScore`. Default
undefined preserves prior behavior bit-for-bit. Prevents weak-overlap
candidates from accumulating metadata boosts and leapfrogging the
legitimate primary hit on dense-embedder corpora.

Built on the contributor PR from @jayzalowitz (PR #1091, SkyTwin
twin-memory layer). Refactored on top: threshold is computed ONCE at
runPostFusionStages entry instead of per-stage (single-baseline semantic,
order-independent); knobsHash bumped 2->3 so a no-floor cache write can't
be served to a floor-enabled lookup; NaN scores skip the boost instead of
bypassing the gate; SearchOpts/config/MODE_BUNDLES integration replaces
the PR's PostFusionOpts-only surface; no env var (resolveSearchMode is
pure by design).

Three correctness issues that codex outside-voice review caught, all fixed
before this landed:
- Cache contamination via knobsHash() (same bug class as v0.32.3 CDX-4
  hotfix for the other search-lite knobs)
- NaN scores would have bypassed the gate (NaN < threshold is false in
  JS); realistic on Voyage flexible-dim / zembed-1 Matryoshka dim drift
- Negative top scores would have broken the "single result trivially
  eligible" claim; gate now disables on no-positive-signal inputs

Scope: gates metadata stages only. Exact-match boost
(applyExactMatchBoost) runs independently as a lexical-relevance signal
by design. Cross-source floor stays global (per-source deferred to
v0.36 if federated-read users hit the suppression). Default-on for any
mode bundle deferred until gbrain-side ablation against longmemeval /
whoknows / suspected-contradictions / BrainBench-Real (TODOS.md).

Plan + 9-decision review trail (D1-D9): ~/.claude/plans/swift-sniffing-nygaard.md.
Empirical motivation, failure-mode framing, dense-embedder targeting, and
the 0.85 starting value all from @jayzalowitz's labeled-retrieval
ablation. Integration shape is gbrain-side.

Test surface: 30+ new cases (computeFloorThreshold edge cases including
T1a NaN / T1b negative top, three boost-function gate parity tests
including T6 IRON-RULE applyRecencyBoost regression, runPostFusionStages
single-baseline composition pin, KNOBS_HASH_VERSION bump from 2 to 3,
floor-ratio-changes-hash cache-contamination prevention,
loadOverridesFromConfig coverage for search.floor_ratio config key).
bun run verify clean; full unit suite 6753 pass / 0 fail.

Co-Authored-By: Jay Zalowitz <jayzalowitz@gmail.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n CLAUDE.md

CHANGELOG entry for v0.35.6.0 was readable only by someone who already
understood gbrain's internals (RRF, knobsHash, MODE_BUNDLES, runPostFusionStages,
Matryoshka, CDX-4). Rewrote it so the first ~150 words explain what
shipped in everyday English, with a concrete worked example, before any
file paths or function names appear. Itemized changes section keeps the
technical precision for engineers who need it.

Then codified the rule in CLAUDE.md so future release entries land the same
way. The "Release-summary template" section now has an iron rule:
"lead ELI10, get precise after." No file paths or internal constants in
the first 150 words; user-visible behavior change first; everyday-language
column headers in any tables. Technical precision is required (the entry
is still the technical record) but lives BELOW the plain-English lead,
never before it.

Smell test: if a reader who has never opened gbrain can walk away from
the first 150 words knowing what shipped and whether they care, the entry
passes.

bun run build:llms regenerated to pick up the CLAUDE.md change (CI guard
test/build-llms.test.ts pins committed bundles against fresh generator
output).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@garrytan garrytan merged commit af7e537 into master May 17, 2026
7 checks passed
garrytan added a commit that referenced this pull request May 17, 2026
Master shipped v0.35.6.0 (PR #1129 — floor-ratio gate for metadata boost
stages) between this branch's first push and now, colliding with our
v0.35.6.0 slot. Rebump to v0.35.7.0:

- VERSION + package.json: 0.35.6.0 → 0.35.7.0
- CHANGELOG: my temporal-trajectory + founder scorecard entry stays at
  the top, header rewritten to [0.35.7.0]; master's [0.35.6.0]
  floor-ratio entry preserved below it; internal references in my body
  ("v0.35.6 fixes" / "v0.35.6 batches" / "To take advantage of v0.35.6.0")
  rewritten to v0.35.7
- skills/migrations/v0.35.6.md renamed → v0.35.7.md, frontmatter +
  heading rewritten
- llms-full.txt regenerated
- bun.lock fresh (no dependency drift, just pin sync)

Verified: bun run typecheck clean, 356 wave + search + bootstrap tests
pass on the merged tree, no regression introduced by the merge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jayzalowitz

Contributor

Thanks @garrytan, I've updated skytwin to stay in sync here: jayzalowitz/skytwin#334

brandonlipman added a commit to brandonlipman/gbrain that referenced this pull request May 29, 2026
* upstream/master:
  v0.37.0.0 feat(skillpack): registry cathedral — third-party publish + install + 10/10 quality bar (garrytan#1208)
  v0.36.6.0 feat: cross-modal search wave (text↔image + unified column + LLM intent) (garrytan#1165)
  v0.36.5.0 feat: secure DATABASE_URL access for shell jobs (inherit: ["database_url"]) (garrytan#1192)
  v0.36.4.0 feat: brain-health-100 — autonomous remediation via doctor --remediate + Minions (garrytan#1193)
  fix(docs): comprehensive drift audit — contradictions, broken links, stale refs (garrytan#1201)
  v0.36.3.0 feat: dynamic embedding column selection for search (garrytan#1164)
  v0.36.2.0 feat: ZeroEntropy as default + zero-based README rewrite (garrytan#1136)
  v0.36.1.1 fix-wave: community PR triage + 28 atomic fixes (garrytan#1182)
  v0.36.1.0 Hindsight calibration wave: brain learns how you tend to be wrong (garrytan#1139)
  v0.36.0.0 feat(skillpack): scaffold + reference + harvest (retire managed-block install) (garrytan#1130)
  v0.35.8.0 feat(cycle): phantom-page redirect inside extract_facts (garrytan#1138)
  v0.35.7.0 feat: temporal trajectory + founder scorecard (Phases 2-4) (garrytan#1131)
  v0.35.6.0 feat(search): floor-ratio gate for metadata boost stages (closes garrytan#1091) (garrytan#1129)
  v0.35.5.1 fix(doctor): stop counting clean supervisor exits as crashes (garrytan#1108)
  v0.35.5.0 fix wave: bootstrap + orphans + think MCP + worktree + walker (garrytan#1111)
  v0.35.4.0 fix(doctor,entities): supervisor crash classification + bare-name resolver + 58x perf + stub guard observability (garrytan#1085)
  v0.35.3.1 feat(eval): temporal-aware contradiction probe + verdict enum (garrytan#1052)
  v0.35.3.0 fix wave: extract_facts items + git --no-recurse-submodules placement (garrytan#1053)

# Conflicts:
#	src/core/postgres-engine.ts
#	test/schema-bootstrap-coverage.test.ts