feat(#251 Phase 1): Layer 2 additive rewrite + default-on flip by jayzalowitz · Pull Request #274 · jayzalowitz/skytwin

jayzalowitz · 2026-05-13T01:27:23Z

Summary

Phase 1.1 + 1.2 of the multi-phase #251 plan. Replaces Layer 2's multiplicative tier weighting with additive bonuses, validates against both hash-trick and real-embedding evals, and flips the toggle on by default for new + existing users.

Headline result (real embeddings, Ollama nomic-embed-text)

| Metric | pure-RRF | Phase-0 multiplier | Phase 1 additive |
| --- | --- | --- | --- |
| `user_behavior` MRR (n=3) | 0.667 | 1.000 | 1.000 |
| `received_content` MRR (n=3) | 1.000 | 0.537 | 0.833 |
| `neutral` MRR (n=1) | 1.000 | 1.000 | 1.000 |
| aggregate MRR primary | 0.857 | 0.804 | 0.929 |

received_content MRR recovered from 0.537 → 0.833. Aggregate MRR primary went above the pure-RRF baseline (0.929 vs 0.857) because Layer 2 lifts user_behavior queries beyond what pure-RRF could do. The 0.83-vs-1.0 gap on received_content is the q4 case (user wrote a reply about the alert) — defensible product behavior.
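As a quick check on the 0.833 figure: with n=3 and a single query (q4) missing rank 1, the only rank pattern that yields 0.833 is two primaries at rank 1 and one at rank 2. The sketch below assumes the standard definition of mean reciprocal rank; `meanReciprocalRank` is a hypothetical helper, not a function from the eval harness.

```typescript
// Mean reciprocal rank: the average of 1/rank of the primary hit, per query.
function meanReciprocalRank(primaryRanks: number[]): number {
  if (primaryRanks.length === 0) return 0;
  const total = primaryRanks.reduce((sum, rank) => sum + 1 / rank, 0);
  return total / primaryRanks.length;
}

// Inferred ranks for received_content under Phase 1 additive:
// two queries with the primary at rank 1, q4 with the primary at rank 2.
const receivedContentMrr = meanReciprocalRank([1, 1, 2]);
console.log(receivedContentMrr.toFixed(3)); // 0.833
```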

Phase 1.1: additive rewrite

tierBonus(metadata, calibration) replaces tierMultiplier. Returns an additive bonus (~±0.005 in the normal band) sized to flip close calls (rank-1 vs rank-2 raw diff ~0.0003) without leapfrogging strong matches.
Promote-only configuration. Authored tiers get a positive bonus; received tiers are 0. The real-embedding eval showed that any negative bonus pushes legitimate primary hits below distractors on queries without an authored alternative. The product intent of Layer 2 is "prefer authored on close calls," not "suppress received" — promote-only delivers the former without the latter.
HIDDEN_SENTINEL / PINNED_BOOST constants make userOverride composition explicit. hidden returns Number.NEGATIVE_INFINITY; the RRF fold drops it.
Floor-ratio gate retained at 0.85. Real embedders produce spurious vector similarity between topically-unrelated content (any two "work emails about technical topics" cluster together). Without the gate, q5's "GitHub Actions CI failed" returned q1's Series B authored content above the primary. With the gate, that cross-query leak goes away.
Back-compat aliases. tierMultiplier / buildTierWeightFn re-exported as deprecated aliases of tierBonus / buildTierBonusFn so internal callers keep working through this PR.
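The interaction between the promote-only bonus and the floor-ratio gate can be sketched as follows. This is an illustration under stated assumptions, not the code in rrf.ts: `Hit`, `bonusFor`, and `applyBonuses` are hypothetical names, the bonus is fixed at 0.005 (the PR only gives "~±0.005 in the normal band"), and the gate is assumed to compare each raw score against 0.85 times the top raw score.

```typescript
type Tier = "authored" | "received";

interface Hit {
  id: string;
  tier: Tier;
  rrfScore: number; // raw fused score, before any tier bonus
}

const AUTHORED_BONUS = 0.005; // assumed magnitude
const FLOOR_RATIO = 0.85;

// Promote-only: authored tiers get a positive bonus, received tiers get 0.
function bonusFor(tier: Tier): number {
  return tier === "authored" ? AUTHORED_BONUS : 0;
}

function applyBonuses(hits: Hit[]): Hit[] {
  if (hits.length === 0) return [];
  const top = Math.max(...hits.map((h) => h.rrfScore));
  const floor = FLOOR_RATIO * top;
  return hits
    .map((h) => ({
      ...h,
      // Gate: only hits already close to the top raw score are eligible.
      rrfScore: h.rrfScore >= floor ? h.rrfScore + bonusFor(h.tier) : h.rrfScore,
    }))
    .sort((a, b) => b.rrfScore - a.rrfScore);
}

const k = 60;
const ranked = applyBonuses([
  { id: "newsletter", tier: "received", rrfScore: 1 / (k + 1) }, // ~0.0164
  { id: "my-reply", tier: "authored", rrfScore: 1 / (k + 2) },   // ~0.0161
  { id: "old-draft", tier: "authored", rrfScore: 1 / (k + 20) }, // 0.0125
]);
console.log(ranked.map((h) => h.id));
```

In this made-up candidate list, the authored rank-2 hit passes the gate and overtakes the rank-1 newsletter (a close call), while the authored rank-20 hit falls below the gate and keeps its raw score.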

Phase 1.2: default-on flip

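Migration 044 flips the brain_settings.tier_weighting default to true and backfills every existing row where tier_weighting = false. The schema has no audit column to distinguish an explicit opt-out from the old default, so the backfill is unconditional. Users can still opt out via Settings → Memory backend.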
All in-code defaults updated to match: parseSettingsRow, in-memory upsertSettings, CRDB upsertSettings, route GET fallback.
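The in-code side of the flip amounts to changing what a missing value falls back to. A minimal sketch, assuming a nullable `tier_weighting` column; `SettingsRow` and `resolveTierWeighting` are hypothetical, since the PR names parseSettingsRow but does not show its signature.

```typescript
interface SettingsRow {
  tier_weighting?: boolean | null;
}

// An explicit value always wins; a missing row or column falls back to
// the new default (true). Before this PR the fallback would have been false.
function resolveTierWeighting(row: SettingsRow | undefined): boolean {
  return row?.tier_weighting ?? true;
}

console.log(resolveTierWeighting(undefined));                 // true (fresh user)
console.log(resolveTierWeighting({ tier_weighting: false })); // false (opted out)
```

Using `??` rather than `||` matters here: an explicit `false` must survive, and `||` would silently turn it back into `true`.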

Tests

19 cases in tier-weights.test.ts updated for additive semantics, covering all 3 calibrations, override composition, brief-reply downweight, and back-compat aliases.
rrf.test.ts tier-weight section rewritten. New case verifies a weak-match authored page does NOT leapfrog a strong primary with additive + gate.
tier-ablation-eval guardrail bars tightened: received_content ≥ 0.55 (hash-trick), ≥ 0.75 (real embeddings). Both bars sit just below the newly measured values (0.58 / 0.83).

Test plan

pnpm build --concurrency=1 → 35/35 packages
pnpm test → 70/70 turbo tasks
RUN_REAL_EMBEDDING_EVAL=1 pnpm --filter @skytwin/memory-gbrain test -- tier-ablation-eval → 2 pass, real-embedding numbers as above

What this unblocks

Subsequent phases of the #251 plan:

Phase 2: relationshipTier axis (orthogonal, multiplicatively composable)
Phase 3: cross-channel tier (calendar + GitHub)
Phase 4: draft_email candidate generator (the main payoff)
Phase 5: end-to-end loop + dashboard polish

🤖 Generated with Claude Code

Phase 1.1 + 1.2 of the multi-phase plan.

# 1.1 - additive rewrite

Replaces multiplicative tier weighting (`score *= weight`) with additive bonuses (`score += bonus`). The real-embedding ablation in PR #272 showed multiplicative was structurally bounded: a 1.5×/0.8× swing (1.875× ratio) let weak-overlap authored content leapfrog strong primary hits regardless of relevance. Additive bonuses (~±0.005 in the normal band) can flip close calls but never leapfrog strong matches.

Promote-only configuration: only authored tiers get a positive bonus; all received tiers are 0. Trying any negative bonus pushed legitimate primary hits on `received_content` queries below distractors. The product intent is "prefer authored on close calls," not "suppress received"; promote-only gives the former without the latter.

Floor-ratio gate retained (default 0.85). Real embedders give non-trivial cross-query vector similarity; without the gate, authored content from unrelated queries leaks into the candidate pool and gets boosted past legitimate primaries.

Files:
- tier-weights.ts: `tierBonus` / `buildTierBonusFn` (additive). `tierMultiplier` / `buildTierWeightFn` re-exported as deprecated aliases for back-compat.
- rrf.ts: applies bonus additively, NEGATIVE_INFINITY sentinel for hidden, 0.85 floor-ratio gate.

# 1.2 - flip default-on

Phase 1.1 cleared the eval bar:
- user_behavior MRR 0.667 → 1.000 (preserved)
- received_content MRR 1.000 → 0.833 (real embeddings) → 0.583 (hash-trick floor)
- aggregate MRR primary 0.857 → 0.929 (above pure-RRF baseline)

Files:
- Migration 044: ALTER DEFAULT true + backfill existing rows.
- parseSettingsRow / in-memory + CRDB upsert / route GET: all defaults flipped to true.

Tests (98 pass, 70 turbo tasks green):
- tier-weights.test.ts: 19 cases updated for additive semantics, all 3 calibrations, override composition, back-compat aliases.
- rrf.test.ts: new "weak-match doesn't leapfrog" case; existing cases reformulated for additive bonus.
- tier-ablation-eval bars tightened: received_content ≥ 0.55 (hash-trick), ≥ 0.75 (real embeddings).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

This PR implements Phase 1 of issue #251 by changing Layer 2 authoring-tier weighting from multiplicative factors to additive bonuses in the RRF fold, and flips brain_settings.tier_weighting to be enabled by default (including a backfill migration).

Changes:

Replaced multiplicative tier weighting with additive tierBonus semantics (including PINNED_BOOST and a hidden sentinel) and updated RRF fold logic accordingly.
Flipped tier-weighting defaults to “on” across persistence + API fallback paths, and added a DB migration to set the default and backfill existing rows.
Updated/rewrote unit tests and eval guardrails to match additive behavior and tightened expected floors.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| packages/memory-gbrain/src/tests/tier-ablation-eval.test.ts | Updates eval guardrail commentary and floors for additive semantics. |
| packages/memory-gbrain-crdb-adapter/src/tier-weights.ts | Rewrites tier weighting into additive bonuses; adds pinned/hidden semantics; adds deprecated aliases. |
| packages/memory-gbrain-crdb-adapter/src/rrf.ts | Changes tier hook from multiplier to additive bonus; retains floor-ratio gating and hidden sentinel handling. |
| packages/memory-gbrain-crdb-adapter/src/repository.ts | Flips default insert/fallback behavior for `tier_weighting` to true. |
| packages/memory-gbrain-crdb-adapter/src/in-memory-repository.ts | Flips in-memory settings default for `tier_weighting` to true. |
| packages/memory-gbrain-crdb-adapter/src/tests/tier-weights.test.ts | Rewrites tests for additive tier bonuses, pinned/hidden behavior, and aliases. |
| packages/memory-gbrain-crdb-adapter/src/tests/rrf.test.ts | Updates tier-weight tests for additive behavior and adds a “weak authored shouldn’t leapfrog strong primary” case. |
| packages/db/src/migrations/044-brain-tier-weighting-default-on.sql | Sets DB default to true and backfills existing rows. |
| CHANGELOG.md | Documents Phase 1 additive rewrite and default-on flip, including headline eval numbers. |
| apps/api/src/routes/memory-config.ts | Changes GET fallback for tierWeighting to default true for fresh users. |

jayzalowitz · 2026-05-13T01:34:35Z

+   * Per-page additive bonus to apply to rrfScore. Returns `Number.
+   * NEGATIVE_INFINITY` to drop the page (hidden override).


Addressed in ec674f3. Joined onto one line.

jayzalowitz · 2026-05-13T01:34:36Z

+    // Drop pages that were explicitly hidden or pushed to or below zero
+    // by aggressive demote bonuses on already-weak pages. Keeping the
+    // `> 0` filter preserves the prior contract that hidden pages don't
+    // appear in results.
    entries = entries.filter((h) => h.rrfScore > 0);


Addressed in ec674f3. The post-bonus filter is now rrfScore !== Number.NEGATIVE_INFINITY — only the explicit sentinel drops a page. Ordinary negative bonuses reorder without changing inclusion. JSDoc on TierWeightFn now documents this contract explicitly.
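The difference between the two filters can be illustrated with a small sketch. `HIDDEN_SENTINEL` is the constant name used in the PR description; `Entry`, `dropHidden`, and the sample scores are hypothetical.

```typescript
const HIDDEN_SENTINEL = Number.NEGATIVE_INFINITY;

interface Entry {
  id: string;
  rrfScore: number;
}

// Only the explicit sentinel removes a page. A finite negative score stays
// in the result set and simply sorts lower.
function dropHidden(entries: Entry[]): Entry[] {
  return entries.filter((h) => h.rrfScore !== HIDDEN_SENTINEL);
}

const afterBonus: Entry[] = [
  { id: "primary", rrfScore: 0.0164 },
  // Hypothetical demote bonus that pushes a weak page below zero.
  { id: "demoted", rrfScore: 0.004 - 0.005 },
  // Any finite score plus -Infinity is -Infinity, so the sentinel survives the add.
  { id: "hidden", rrfScore: 0.0161 + HIDDEN_SENTINEL },
];

const kept = dropHidden(afterBonus).map((h) => h.id);
console.log(kept); // "primary" and "demoted" remain; "hidden" is dropped
```

Under the earlier `rrfScore > 0` filter, the "demoted" entry would have been removed as well, which is the inclusion drift the review comment pointed out.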

jayzalowitz · 2026-05-13T01:34:37Z

+-- Two changes here:
+--   1. Default for new rows flips to TRUE so fresh users get the
+--      Phase-1.1 retrieval shape out of the box.
+--   2. Existing rows that still have `tier_weighting = false` AND have
+--      never been explicitly set by the user get migrated up to TRUE.
+--      We can't tell "user said no" vs "default applied" from the schema
+--      alone, but the prior default was false for everyone, so any row
+--      with the default value is opt-in-by-default candidate. Users who
+--      want to opt out can flip it back via Settings → Memory backend.


Addressed in ec674f3 by updating the comment to match the actual behavior. The SQL was correct (unconditional flip); the comment was wrong. New comment makes the trade-off explicit: we don't have a "set by user" audit column, the prior default was load-bearing on a now-fixed bug, the dashboard toggle remains for anyone who wants to opt back out. Notes that a future audit-column migration could preserve opt-outs if needed.

jayzalowitz · 2026-05-13T01:34:38Z

+    it('does NOT let a weak-match authored page leapfrog a strong primary', () => {
+      // This is the bug PR #272 surfaced and PR #_ (this PR) is fixing.
+      // rank-1 newsletter at score 1/(60+1)=0.0164. rank-10 authored at
+      // 1/(60+10)=0.0143. With multiplicative weighting (1.5× vs 0.8×)
+      // the rank-10 authored at 0.0143*1.5=0.0214 would beat the rank-1
+      // newsletter at 0.0164*0.8=0.0131. With additive ±0.005 it can't:
+      // 0.0143+0.005=0.0193 vs 0.0164-0.005=0.0114. Wait — additive DOES
+      // flip this. The point isn't "never flip" but "don't flip when the
+      // gap is large enough that flipping is wrong." Build a fixture
+      // with a wider raw gap and verify additive holds.


Addressed in ec674f3. Replaced the exploratory scratch notes with a clean explanation: the fixture's strong primary (rank 1 in both lists ≈ 0.0328), the weak distractor (rank 20 in text only ≈ 0.0125), and the gate that prevents the bonus from applying at that depth (0.85 × 0.0328 = 0.0279 threshold).
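The fixture numbers quoted above follow directly from the RRF formula at `rrfK=60`. The sketch below reproduces them; the helper `rrf` and the variable names are illustrative, and the gate is assumed to compare a hit's raw score against 0.85 times the top raw score.

```typescript
const RRF_K = 60;
const FLOOR_RATIO = 0.85;

// Contribution of one ranked list to a document's fused score.
const rrf = (rank: number): number => 1 / (RRF_K + rank);

const strongPrimary = rrf(1) + rrf(1); // rank 1 in both the text and vector lists
const weakDistractor = rrf(20);        // rank 20 in the text list only
const gateThreshold = FLOOR_RATIO * strongPrimary;
const bonusEligible = weakDistractor >= gateThreshold;

console.log(strongPrimary.toFixed(4));  // 0.0328
console.log(weakDistractor.toFixed(4)); // 0.0125
console.log(gateThreshold.toFixed(4));  // 0.0279
console.log(bonusEligible);             // false
```

Because 0.0125 is well under the 0.0279 threshold, the distractor never receives the authored bonus in this fixture.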

jayzalowitz · 2026-05-13T01:34:39Z

+ *
+ * Additive bonuses fix this. RRF scores live in the 0.005–0.033 range at
+ * default `rrfK=60`; bonuses of ±0.005 are large enough to flip close
+ * calls (rank-1 vs rank-2 raw, diff ~0.001) but small enough that a


Addressed in ec674f3. Changed the docstring to 0.0003 to match the table.
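The corrected figure is easy to verify for a single ranked list at `rrfK=60`; the names below are illustrative only.

```typescript
const K = 60;
const rank1 = 1 / (K + 1); // ~0.0164
const rank2 = 1 / (K + 2); // ~0.0161
const rawDiff = rank1 - rank2;

console.log(rank1.toFixed(4), rank2.toFixed(4)); // 0.0164 0.0161
console.log(rawDiff.toFixed(4));                 // 0.0003
```

The gap is closer to 0.0003 than to 0.001, so a bonus of about 0.005 is more than enough to flip a rank-1 versus rank-2 close call.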

Five findings, all valid:

1. JSDoc had "Number. NEGATIVE_INFINITY" split across two lines, which reads awkwardly in generated docs. Joined.
2. Inclusion-semantics drift: the post-bonus filter was `rrfScore > 0`, which silently dropped pages with sufficiently-negative bonuses alongside the intended NEGATIVE_INFINITY-sentinel drops. Tightened the filter to only remove the sentinel; negative bonuses now reorder without changing inclusion. Documented in the TierWeightFn JSDoc.
3. Migration 044's comment claimed "only rows that were never explicitly toggled" get backfilled, but the SQL unconditionally flips all tier_weighting=false rows. We don't have a "set by user" audit column to distinguish defaults from opt-outs, so the honest fix is to update the comment, which now clarifies that this IS an unconditional opt-in. Notes that a future audit column could preserve opt-outs if it becomes important.
4. The "doesn't leapfrog" test in rrf.test.ts had exploratory scratch notes including a "PR #_" placeholder and a self-contradicting "Wait — additive DOES flip this" line. Replaced with a clean explanation of the fixture being asserted, the actual rank/score numbers, and the load-bearing role of the 0.85 floor-ratio gate.
5. tier-weights.ts docstring said "rank-1 vs rank-2 RRF diff is ~0.001" but the table below shows 0.0164 vs 0.0161 = 0.0003. Corrected to 0.0003.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 13, 2026 01:27

Copilot started reviewing on behalf of jayzalowitz May 13, 2026 01:27

Copilot AI reviewed May 13, 2026

jayzalowitz merged commit 0e6b651 into main May 13, 2026
8 checks passed

jayzalowitz deleted the jayzalowitz/251-phase1-additive branch May 13, 2026 01:46

This was referenced May 13, 2026

feat(#251 Phase 2): relationshipTier — second retrieval axis #275

Merged

Memory bootstrap: weight user-sent emails higher than received #251

Closed

jayzalowitz merged 2 commits into main from jayzalowitz/251-phase1-additive


2 participants
