feat(#251 Layer 2 + companion fields): tier-weighted gbrain retrieval#259
Conversation
Ships the second slice of #251 on top of PR #252's Layer 1 work. With this landing, gbrain can rank user-authored content above received noise on semantic-search results — gated behind a per-user toggle that's off by default until labeled-retrieval evals confirm recall@5 improvement. Engine: - Migration 043 adds `tier_weighting` (bool, default false) and `tier_calibration` (sparse/normal/dense, default normal) to `brain_settings`. `parseSettingsRow` defaults the new columns so a pre-043 install reads safely. - `tier-weights.ts` (new) — multiplier table with three calibration bands. Composes orthogonally with `metadata.userOverride` (pinned doubles, hidden drops the row from results) and includes a brief-reply downweight: short authored pages get `inbox_personal` weight so "k" replies can't outrank strategy emails. - `rrf.ts` takes an optional `tierWeight(metadata) => number` callback. Identity by default; multiplies into rrfScore pre-sort when set. `textRank`/`vectorRank` preserved for observability. - `embedded-port.searchInternal` reads `brain_settings.tier_weighting` per query, builds the weight fn from the calibration band, and passes it through to `hybridSearch`. Best-effort: DB error falls back to pure RRF, never blocks the search. - `buildPageMetadata` stamps `bodyLen` on every page so the brief-reply downweight has something to read. - `countUserSentPages` (new): SQL count for the calibration auto-recompute. - `getRecentPages` (new): newest-first page listing for the dashboard. API: - `GET /api/memory-config` surfaces `tierWeighting` + `tierCalibration`. - `POST /api/memory-config/tier-weighting` toggles the flag and auto-computes the calibration band from sent volume in the last 90d. Body accepts explicit `calibration` to override. - `GET /api/memory-config/dashboard` returns `pages.recent[]` with tier + override projected from metadata. Embeddings stripped from the wire. Web: - New "Weight what you wrote (Layer 2 — beta)" card on Settings → Memory backend. Single toggle, status + calibration readout, link to #251 for the rationale. - "What your twin remembers" gains a Recent pages table with a tier-badge column (`you wrote` / `personal` / `newsletter` / etc.) plus pinned/hidden override badges. Tests: - 16 unit cases in `tier-weights.test.ts` — multipliers, override composition, brief-reply downweight, calibration thresholds. - 3 new rrf.test.ts branches — flips ranking, drops hidden, preserves raw ranks under weighting. - 5 e2e cases in `tier-weighted-retrieval.test.ts` — baseline, weighting flips order, hidden disappears, brief-reply downweight, per-user isolation. - 5 new memory-config-routes.test.ts cases for the new POST endpoint. Deferred (separate sub-issues): - `relationshipTier` axis (bidirectional-thread-count based). - Backfill of `authoringTier` for pre-Layer-1 pages. - Tier-aware exclude UI (privacy sub-issue; will block default-on). - Layer 2 default-on rollout — eval-gated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Implements “Layer 2” authoring-tier-weighted retrieval for gbrain, gated behind a per-user opt-in toggle and backed by new brain_settings fields + UI/dashboard surfacing.
Changes:
- Add tier-weight multipliers (with sparse/normal/dense calibration), including
userOverridecomposition and brief-reply downweighting. - Thread an optional
tierWeight(metadata) => numbercallback through RRF/hybrid search and enable it per-user viabrain_settings. - Expose and control the feature via API routes + web settings UI, and add dashboard “recent pages” with tier badges; add unit/e2e tests.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/memory-gbrain/src/embedded-port.ts | Stamps bodyLen into page metadata; resolves per-user tier weighting settings and passes a weight fn into hybrid search. |
| packages/memory-gbrain/src/tests/tier-weighted-retrieval.test.ts | New end-to-end-ish tests asserting ranking changes with tier weighting and override behavior. |
| packages/memory-gbrain/src/tests/embedded-port.test.ts | Updates expectations to account for always-present bodyLen in signal-derived page metadata. |
| packages/memory-gbrain-crdb-adapter/src/types.ts | Adds TierCalibration type and new required settings fields on BrainSettingsRow. |
| packages/memory-gbrain-crdb-adapter/src/tier-weights.ts | New tier multiplier tables + brief-reply heuristic + calibration helper. |
| packages/memory-gbrain-crdb-adapter/src/rrf.ts | Adds optional tierWeight hook to adjust rrfScore post-fold and optionally drop hits. |
| packages/memory-gbrain-crdb-adapter/src/repository.ts | Persists new settings fields; adds countUserSentPages and getRecentPages; threads tierWeight into rrfFold. |
| packages/memory-gbrain-crdb-adapter/src/index.ts | Re-exports new tier-weighting APIs and repository helpers. |
| packages/memory-gbrain-crdb-adapter/src/in-memory-repository.ts | Mirrors tierWeight support + calibration helpers for tests/in-memory mode. |
| packages/memory-gbrain-crdb-adapter/src/tests/tier-weights.test.ts | New unit tests for multipliers/calibration/overrides/brief-reply logic. |
| packages/memory-gbrain-crdb-adapter/src/tests/rrf.test.ts | New tests covering tierWeight ranking flips, hidden drops, and preserving raw ranks. |
| packages/db/src/migrations/043-brain-tier-weighting.sql | Migration adding tier_weighting + tier_calibration to brain_settings. |
| apps/api/src/routes/memory-config.ts | Adds GET fields, new POST toggle endpoint, and dashboard pages.recent[] projection. |
| apps/api/src/tests/memory-config-routes.test.ts | Adds route tests for the new endpoint and recent-pages dashboard behavior. |
| apps/web/public/js/pages/memory-settings.js | Adds UI toggle card and dashboard recent-pages table + tier badge rendering. |
| CHANGELOG.md | Documents the feature, toggle semantics, and test coverage. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (options.tierWeight) { | ||
| const weight = options.tierWeight; | ||
| for (const hit of entries) { | ||
| const mult = weight(hit.page.metadata); | ||
| hit.rrfScore *= mult; | ||
| } |
There was a problem hiding this comment.
Addressed in ee02626. Non-finite or non-number returns coerce to 1.0 (identity); negatives clamp to 0 (matches the userOverride:'hidden' drop semantics). New rrf.test.ts case verifies NaN/undefined survive as identity and a negative weight drops the page, with all surviving scores finite + positive.
| } | ||
| showSavedToast(next ? 'Tier weighting enabled' : 'Tier weighting disabled'); | ||
| const container = document.getElementById('page-content'); | ||
| if (container) await renderMemorySettings(container, getCurrentUserId()); |
There was a problem hiding this comment.
Addressed in ee02626. Added a catch after the if (!res.ok) branch that calls showErrorToast('Failed to update tier weighting'), matching the dismiss-notification pattern. Offline / DNS / fetch-throws now surface to the user instead of silently re-enabling the button.
| function makeSignal( | ||
| id: string, | ||
| source: string, | ||
| type: string, | ||
| tier: string, | ||
| subject: string, | ||
| body: string, | ||
| bodyLen?: number, | ||
| ): RawSignal { | ||
| return { | ||
| id, | ||
| source, | ||
| type, | ||
| data: { | ||
| messageId: id, | ||
| from: 'someone@example.com', | ||
| subject, | ||
| text: body, | ||
| authoringTier: tier, | ||
| ...(bodyLen !== undefined ? { bodyLen } : {}), | ||
| }, |
There was a problem hiding this comment.
Addressed in ee02626. Removed the misleading parameter from makeSignal and added a comment explaining that bodyLen is derived from page content at write time — so to exercise the brief-reply downweight, tests vary the actual body string (e.g. "k").
Three findings from Copilot's review of the Layer 2 PR, all real: 1. **rrf.ts:88** — tierWeight was trusted to return a finite, sane number. A misbehaving callback returning NaN/Infinity could poison rrfScore; undefined would propagate through; negative would survive the >0 filter only by accident. Coerce non-numbers and non-finite values to 1.0 (identity); clamp negatives to 0 (matches the userOverride:'hidden' "drop the page" semantics). New rrf.test.ts case covers the path. 2. **memory-settings.js:128** — the tier-weighting toggle handler caught non-ok responses but didn't catch network exceptions thrown by fetch()/api(). Offline / DNS / similar would silently re-enable the button with no user feedback. Added the same `showErrorToast` catch the dismiss-notification path uses. 3. **tier-weighted-retrieval.test.ts:46** — makeSignal accepted a `bodyLen` parameter that was passed through to RawSignal.data, but the embedded port derives bodyLen from the summarised content, not from data. The parameter was misleading — implied an unused input influenced weighting. Removed the parameter; tests now vary the actual body string to exercise the brief-reply downweight. All four affected files build and test green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Second slice of #251 on top of PR #252's Layer 1. With this landing, gbrain can rank what the user wrote above what they received on semantic search — gated behind a per-user toggle that's off by default until the labeled-retrieval evals confirm recall@5 improvement on a real corpus.
Closes (partial): #251 Layer 2 + bodyLen + userOverride + tier_calibration. Defers
relationshipTier, the pre-Layer-1 backfill, the privacy exclude UI, and the default-on rollout to follow-up sub-issues.What ships
Engine
tier_weighting(bool, default false) andtier_calibration(sparse/normal/dense, default normal) tobrain_settings.parseSettingsRowdefaults the new columns defensively so a pre-043 install reads safely.tier-weights.ts(new): multiplier table with three calibration bands.user_sent_originatedranges 1.2×/1.5×/2.0×;inbox_newsletterranges 0.5×/0.4×/0.3×. Composes orthogonally withmetadata.userOverride(pinneddoubles,hiddenreturns 0 → page is dropped from results). Includes a brief-reply downweight so auser_sent_replywithbodyLen < 50getsinbox_personalweight — short "k" replies can't out-rank strategy emails just because they'reSENT.rrf.tstakes an optionaltierWeight(metadata) => numbercallback. Identity by default; when set, the per-page multiplier applies torrfScorepre-sort.textRank/vectorRanksurvive for observability.embedded-port.searchInternalreads the per-usertier_weightingflag and calibration band per query, builds the weight fn, and threads it through tohybridSearch. Lookups are best-effort: any DB error falls back to pure RRF rather than blocking the search.buildPageMetadatastampsbodyLenon every page so the brief-reply downweight has something to read.countUserSentPages(new) +getRecentPages(new) on the adapter — backing the calibration auto-recompute and the dashboard's recent-pages list.API surface
GET /api/memory-confignow returnstierWeighting+tierCalibrationalongside the existing fields.POST /api/memory-config/tier-weighting(new). Body:{ enabled: boolean, calibration?: 'sparse' | 'normal' | 'dense' }. On enable without an explicitcalibration, countsuser_sent_*pages over the last 90d and picks the band; falls back tonormalon DB failure.GET /api/memory-config/dashboardnow includespages.recent[](10 newest brain pages) withauthoringTier+userOverrideprojected from metadata. Embedding vectors stripped from the wire.Web
you wrote,you replied,personal,broadcast,newsletter,automated, plus📌 pinned/hiddenfor explicit overrides. Color-coded for skimmability.Tests
tier-weights.test.ts— all three calibrations, override composition, brief-reply downweight, calibration thresholds.rrf.test.tsbranches — weighting flips ranking,hiddendrops the row, raw ranks preserved.tier-weighted-retrieval.test.ts— baseline (newsletter wins or ties); flag-on (authored hits index 0);userOverride: hiddendrops the page entirely; brief-reply downweight keeps a tinySENTfrom outranking a newsletter; per-user isolation.memory-config-routes.test.tscases for the new POST endpoint.pnpm build --concurrency=1— 35/35 packages.pnpm test— 70/70 turbo tasks green, no regressions.Notes for reviewers
realistic-retrieval.test.tsagainst a real production corpus with measurable recall@5 improvement, Layer 2 is opt-in. If you flip it on for yourself, the calibration band auto-recomputes from your last-90-day writing volume — sparse writers get conservative weights so we don't amplify a signal that isn't there.userOverrideschema only; no UI in this PR. Pinning / hiding individual pages will land with the tier-aware exclude sub-issue, which is the gate for flipping the flag on by default.tier_weighting = false(the default),searchSemanticruns the same code it ran before this PR.Deferred to follow-ups
relationshipTieraxis (bidirectional thread count) — separate sub-issue.authoringTierfor pages indexed pre-Layer 1.🤖 Generated with Claude Code