v0.6.52.0 sync(memory): align floor-ratio gate with gbrain v0.35.6.0 / PR #1129#334
Conversation
…/ PR #1129 Our contribution PR #1091 was closed in favor of upstream's reworked shape that merged yesterday as #1129. The codex outside-voice review caught three defensive gaps in the original shape; port the fixes here and align naming with `SearchOpts.floorRatio` / `search.floor_ratio`. Hardened in `packages/memory-gbrain-crdb-adapter/src/rrf.ts`: - No-positive-signal inputs (all-negative, all-NaN, empty) disable the gate via `Number.NEGATIVE_INFINITY` threshold. Prior `topRawScore = 0` init would silently reject every entry against `r.score < 0`. - Out-of-range `floorRatio` (NaN, Infinity, negative, > 1) disables the gate. Defense in depth so a malformed config value never gates anything. - NaN-score skip in the bonus loop. `NaN < threshold` is `false` in JS, so a NaN-scored hit would slip past the gate check and have the bonus added on top — poisoning the sort. Now an explicit `Number.isFinite` check skips the bonus stage for non-finite scores. New surface: - `RrfFoldOptions.floorRatio` (deprecated alias `tierWeightFloorRatio` preserved; new name wins when both are set). - `computeFloorThreshold(entries, floorRatio)` exported helper, mirrors gbrain's same-named function for cross-port mental-model consistency. - `DEFAULT_FLOOR_RATIO = 0.85` exported as a named constant. Tests: - 12 new cases pinning the defensive guards (out-of-range / NaN / Infinity / empty / negative-top / all-NaN / mixed), the precedence rule between the new and deprecated option names, and an updated strong-vs-tail RRF setup that actually exercises the gate (RRF flatness means rank-1 vs rank-2 don't differ enough — you need rank-20+ in a single list). - 130/130 RRF tests pass; 100/100 `@skytwin/memory-gbrain` tests pass; the realistic-retrieval ablation reports `mean R@5 1.000 pure-RRF / 0.929 tier-on`, unchanged. Upstream feature triage (filed for follow-up, not in this PR): - #897 search-lite (token budget + semantic query cache + intent weighting) — pursue first, ~2 days. Token budget addresses Claude API limits. - #1008 zerank-2 reranker — pursue second, ~1.5 days. Slots between RRF fold and tier-weight bonus. - #996 federated_read — skip (one brain per user). - #1131 temporal trajectory — defer (entity-time-series shape not our fit). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR bumps the release version and aligns the memory-gbrain CRDB adapter’s RRF floor-ratio gate with upstream gbrain semantics, including defensive handling for invalid ratios and non-positive/non-finite score inputs.
Changes:
- Adds
floorRatio,DEFAULT_FLOOR_RATIO, and exportedcomputeFloorThreshold. - Updates RRF tier-weight gating to use the new helper and preserve the deprecated alias.
- Adds tests and changelog documentation for the upstream alignment.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
VERSION |
Bumps version to 0.6.52.0. |
packages/memory-gbrain-crdb-adapter/src/rrf.ts |
Adds floor-threshold helper and updates tier-weight gate behavior. |
packages/memory-gbrain-crdb-adapter/src/index.ts |
Exports the new helper and default ratio. |
packages/memory-gbrain-crdb-adapter/src/__tests__/rrf.test.ts |
Adds coverage for floor-ratio behavior and helper semantics. |
CHANGELOG.md |
Documents the release and upstream alignment. |
Comments suppressed due to low confidence (3)
packages/memory-gbrain-crdb-adapter/src/tests/rrf.test.ts:247
- This test does not actually verify that an out-of-range negative
floorRatiodisables the gate: theweakhit is rank 2, so its raw RRF score (1/62) is still above the default0.85 * (1/61)threshold and would receive the bonus even if the invalid value were incorrectly ignored. Use a tail hit far enough below the default threshold (as instrongVsTail) so the assertion fails unless the gate is truly disabled.
it('out-of-range floorRatio (negative) disables gate — bonus applies to all', () => {
const text = [
{ page: pageWithTier('top', 'user_sent_originated'), score: 1 },
{ page: pageWithTier('weak', 'user_sent_originated'), score: 0.1 },
];
const out = rrfFold(text, [], 5, 60, {
tierWeight: () => 0.005,
floorRatio: -0.5,
});
packages/memory-gbrain-crdb-adapter/src/tests/rrf.test.ts:260
- This test has the same coverage gap as the negative-ratio case: with only rank-1 and rank-2 text hits, the
weakhit passes the normal default floor gate, so the assertion would still pass iffloorRatio: 1.5were mishandled. Make the weak candidate fall below the default threshold to prove> 1disables the gate.
it('out-of-range floorRatio (>1) disables gate', () => {
const text = [
{ page: pageWithTier('top', 'user_sent_originated'), score: 1 },
{ page: pageWithTier('weak', 'user_sent_originated'), score: 0.1 },
];
const out = rrfFold(text, [], 5, 60, {
tierWeight: () => 0.005,
floorRatio: 1.5,
packages/memory-gbrain-crdb-adapter/src/tests/rrf.test.ts:273
- This test would pass even if
NaNwere replaced by the default floor ratio, because the rank-2 hit is still above0.85 * topand gets the bonus under the normal gate. Use a candidate below the default threshold so the test specifically covers the NaN-disables-gate behavior.
it('NaN floorRatio disables gate', () => {
const text = [
{ page: pageWithTier('top', 'user_sent_originated'), score: 1 },
{ page: pageWithTier('weak', 'user_sent_originated'), score: 0.1 },
];
const out = rrfFold(text, [], 5, 60, {
tierWeight: () => 0.005,
floorRatio: NaN,
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| it('floorRatio: 0 (still a valid range value) disables the bonus completely', () => { | ||
| // 0 is a valid in-range floor and means "only the absolute top hit is | ||
| // eligible." But our top is rank-1 itself; everything else fails. Pins | ||
| // that we don't treat 0 as "disable" — the explicit disable is undefined | ||
| // OR out-of-range, not 0. |
There was a problem hiding this comment.
Addressed in commit 89fd6be — renamed to floorRatio: 0 is a valid in-range value — every positive-score hit gets the bonus and rewrote the inline comment to match what the test actually asserts (threshold computes to 0, every positive score passes the >= gate).
|
|
||
| ### Tests | ||
|
|
||
| - 12 new test cases in `packages/memory-gbrain-crdb-adapter/src/__tests__/rrf.test.ts`: `computeFloorThreshold` defensive guards (undefined / out-of-range / NaN / Infinity / empty / negative-top / all-NaN / mixed), floorRatio precedence over deprecated `tierWeightFloorRatio`, back-compat alias still respected, and the strong-vs-tail RRF construction that actually exercises the gate (RRF flatness means rank-1 vs rank-2 don't differ enough — you need rank-20+ in a single list to push under 0.85 × top). |
There was a problem hiding this comment.
Good catch — count drifted across review fixes. Updated to 22 in a30858a (the actual new it(...) count from git diff origin/main). Going forward: stop hand-counting in the CHANGELOG and source it from the diff.
| // NaN-score defense (gbrain v0.35.6.0 / codex outside-voice T1a): a | ||
| // non-finite rrfScore would slip past `hit.rrfScore < threshold` | ||
| // because `NaN < x` is false in JS, then get the bonus added and | ||
| // poison the sort. Explicitly skip the bonus for non-finite scores; | ||
| // they keep their (non-finite) value and sort to the end. | ||
| if (!Number.isFinite(hit.rrfScore)) continue; |
There was a problem hiding this comment.
Addressed in 89fd6be — the post-loop filter now drops ALL non-finite scores via Number.isFinite (was: only -Infinity hidden sentinel), so NaN/+Infinity never reach the b.rrfScore - a.rrfScore comparator. Same filter applies on the pure-RRF path so a caller passing rrfK: NaN can't corrupt top-k ordering either. Two new tests pin the failure mode.
Three findings from /review's codex outside-voice pass + one nit from the structured Pass-1/Pass-2 review. T2 — Invalid floorRatio bypassing legacy guard. Old precedence `options.floorRatio ?? options.tierWeightFloorRatio ?? DEFAULT_FLOOR_RATIO` meant `floorRatio: NaN` (e.g. from buggy config parse) won the chain and disabled the gate, even if the caller had `tierWeightFloorRatio: 0.85` working. A partially migrated caller piping a malformed new option silently nullified the legacy guard. New `pickValidFloorRatio` helper walks the candidates and uses the first finite value in [0, 1]; invalid falls through to the alias, then to `DEFAULT_FLOOR_RATIO`. T3 — NaN/+Infinity rrfScores surviving the sort. The comment claimed non-finite scores "sort to the end," but `b.rrfScore - a.rrfScore` returns `NaN` for any NaN side, which JS sort treats as 0 (equal) — leaving NaN-scored hits in insertion order, where they can land in top-k via `slice(0, k)`. `+Infinity` sorts to the top of every query. Reachable when a caller passes `rrfK: NaN` (which makes every `1 / (rrfK + rank)` NaN). Fix: the post-loop filter drops ALL non-finite-scored entries (was: only `-Infinity` hidden sentinel), and a mirror filter applies on the pure-RRF path so corrupted scores never reach the comparator. Sort now operates only on finite scores and produces a deterministic ranking. Nit — Test `floorRatio: 0 disables the bonus completely` name + comment contradicted the test's own assertions (which confirm the bonus IS applied for every positive-score hit). Renamed to match the actual behavior: `floorRatio: 0` is a valid in-range value, threshold computes to 0, every positive-score hit passes the gate. Distinct from the `undefined`/out-of-range disable path even though they're observationally equivalent for positive-score inputs. 5 new test cases pin the codex fixes: - invalid floorRatio falls back to deprecated alias when alias is valid - invalid floorRatio + invalid alias falls back to DEFAULT_FLOOR_RATIO - floorRatio: undefined falls through to alias when alias is valid - rrfK: NaN corrupts all contributions → all hits dropped (output []) - partial corruption (some finite hits) survives the filter intact CHANGELOG updated to reflect: 135 tests (was 130), and the codex review fixes are called out under a "Codex review fixes (post-review)" subsection so the audit trail is visible. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…o test setup Copilot caught a coverage gap in three new tests: the rank-1-vs-rank-2 text-only setup doesn't actually exercise the gate, because rank-2 rrfScore (1/62 ≈ 0.0161) is above the default 0.85 × 1/61 (≈ 0.0139) — so the assertion passes regardless of whether `floorRatio: NaN` (or -0.5, or 1.5) correctly disabled the gate, fell back to default, or did anything at all. Same class as the gap I caught and fixed in the back-compat tests; missed updating these three. Rewritten to use the `strongVsTail` helper (rank-1-in-both + rank-21-text-only) so the assertions distinguish "gate at 0.85" from "gate disabled" — the weak hit's rrfScore is 1/81 (well below 0.85 × 2/61 = 0.0279), so the bonus only applies if the gate is genuinely disabled. Note: the test semantics also flipped because of the codex T2 fix landed in 89fd6be. Pre-T2, invalid `floorRatio` disabled the gate. Post-T2, invalid falls back to the alias then to DEFAULT_FLOOR_RATIO. So the renamed tests now assert "falls back to DEFAULT_FLOOR_RATIO" rather than "disables gate." The test rationale comment block calls this out explicitly so a future maintainer doesn't try to revert to the pre-T2 expectations. CHANGELOG test count corrected: 22 new test cases (was 17, originally 12 — my mistake; the count drifted across each round of review fixes). Copilot's other two comments were already addressed in 89fd6be: - "Test name `floorRatio: 0 disables bonus` contradicts assertions" → fixed - "NaN-score skip leaves non-finite rrfScore in entries; sort can be poisoned" → fixed (post-loop `isFinite` filter drops non-finite scores before sort). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…g + floor-ratio gate) CLAUDE.md:102 said the package "multiplies fused scores by per-tier weights" — stale since #260/#272 flipped the implementation to additive bonuses (the multiplicative cut had a structural leapfrog regression on real dense embedders). Updated to describe the actual current behavior + the opt-in `floorRatio` gate aligned with gbrain v0.35.6.0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Our contribution to gbrain (PR #1091) was closed yesterday in favor of upstream's reworked shape that merged as PR #1129. The codex outside-voice review caught three defensive gaps in our original shape. This PR ports those fixes into our additive tier-weight implementation in
packages/memory-gbrain-crdb-adapter/src/rrf.tsand aligns the option naming with upstreamSearchOpts.floorRatio/search.floor_ratio.What got hardened
topRawScore = 0init meant all-negative-score inputs silently rejected every entry againstr.score < 0. New behavior:computeFloorThresholdreturnsNumber.NEGATIVE_INFINITYfor all-negative, all-NaN, or empty inputs.floorRatiodisables the gate. NaN, Infinity, negative, or> 1now return-Infinityfrom the threshold helper. Defense in depth.NaN < thresholdisfalsein JS — would slip past the gate and poison the sort. Now skipped explicitly viaNumber.isFinite(hit.rrfScore).API surface
RrfFoldOptions.tierWeightFloorRatio→RrfFoldOptions.floorRatio(deprecated alias preserved; new name wins).computeFloorThreshold(entries, floorRatio)andDEFAULT_FLOOR_RATIO = 0.85.GbrainMemoryPortCLI shellout unchanged — users cangbrain config set search.floor_ratioglobally outside our integration.Tests
rrf.test.tspin all three defensive guards plus the precedence rule forfloorRatiovs deprecatedtierWeightFloorRatio.@skytwin/memory-gbraintests pass.mean R@5 1.000 pure-RRF / 0.929 tier-on.Follow-ups (filed in the CHANGELOG, not in this PR)
#897 search-lite(token budget + semantic query cache + intent weighting) — pursue first, ~2 days.#1008 zerank-2 reranker— pursue second, ~1.5 days.#996 federated_read— skip (one brain per user).#1131 temporal trajectory— defer (not our shape).Test plan
pnpm --filter @skytwin/memory-gbrain-crdb-adapter test— 130/130 passpnpm --filter @skytwin/memory-gbrain test— 100/100 passpnpm build --concurrency=1— cleanDocumentation
Audited all top-level docs +
docs/directory against the diff.Updated:
CLAUDE.md:102— package description for@skytwin/memory-gbrain-crdb-adaptersaid "multiplies fused scores by per-tier weights." Stale since feat(#251 follow-up): Layer 2 ablation eval + tuned multipliers + floor gate #260/feat(#251 follow-up): real-embedding ablation result + opt-in test #272 flipped the implementation to additive bonuses. Updated to describe the current additive scoring + the opt-infloorRatiogate (default 0.85; aligned with gbrain v0.35.6.0SearchOpts.floorRatio).CHANGELOG.md— voice already passes the sell test (what / why / how), polished count from 17 → 22 new tests after the post-Copilot test rewrite.Coverage map (Diataxis):
RrfFoldOptions.floorRatiocomputeFloorThresholdDEFAULT_FLOOR_RATIORrfFoldOptions.tierWeightFloorRatio(deprecated)Coverage: all shipped surface has adequate documentation. No tutorial gap because these are internal scoring options, not user-facing actions.
Cross-doc consistency:
0.6.52.0) matches CHANGELOG heading (## [0.6.52.0]).docs/memory-swap.mdpackage summaries are still accurate at their level of abstraction (no API specifics referenced).docs/mentiontierWeightFloorRatioor other renamed symbols — no further drift.Documentation health: