feat(#251 Layer 2 + companion fields): tier-weighted gbrain retrieval #259

Merged: jayzalowitz merged 2 commits into main from jayzalowitz/251-authoring-tier on May 12, 2026
@jayzalowitz

Owner

Summary

Second slice of #251 on top of PR #252's Layer 1. With this landing, gbrain can rank what the user wrote above what they received on semantic search — gated behind a per-user toggle that's off by default until the labeled-retrieval evals confirm recall@5 improvement on a real corpus.

Closes (partial): #251 Layer 2 + bodyLen + userOverride + tier_calibration. Defers relationshipTier, the pre-Layer-1 backfill, the privacy exclude UI, and the default-on rollout to follow-up sub-issues.

What ships

Engine

  • Migration 043 adds tier_weighting (bool, default false) and tier_calibration (sparse/normal/dense, default normal) to brain_settings. parseSettingsRow defaults the new columns defensively so a pre-043 install reads safely.
  • tier-weights.ts (new): multiplier table with three calibration bands. user_sent_originated ranges 1.2×/1.5×/2.0×; inbox_newsletter ranges 0.5×/0.4×/0.3×. Composes orthogonally with metadata.userOverride (pinned doubles, hidden returns 0 → page is dropped from results). Includes a brief-reply downweight so a user_sent_reply with bodyLen < 50 gets inbox_personal weight — short "k" replies can't out-rank strategy emails just because they're SENT.
  • rrf.ts takes an optional tierWeight(metadata) => number callback. Identity by default; when set, the per-page multiplier applies to rrfScore pre-sort. textRank / vectorRank survive for observability.
  • embedded-port.searchInternal reads the per-user tier_weighting flag and calibration band per query, builds the weight fn, and threads it through to hybridSearch. Lookups are best-effort: any DB error falls back to pure RRF rather than blocking the search.
  • buildPageMetadata stamps bodyLen on every page so the brief-reply downweight has something to read.
  • countUserSentPages (new) + getRecentPages (new) on the adapter — backing the calibration auto-recompute and the dashboard's recent-pages list.
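
As a rough illustration of how the multiplier table, the override composition, and the brief-reply downweight could fit together, here is a minimal sketch. It is not the contents of tier-weights.ts: the names `SketchCalibration`, `SketchMetadata`, and `makeTierWeightSketch` are hypothetical, and the `user_sent_reply` and `inbox_personal` multipliers are placeholders, since this PR only states the values for `user_sent_originated` and `inbox_newsletter`.

```typescript
// Sketch only: names and any multiplier marked "assumed" are illustrative.
type SketchCalibration = "sparse" | "normal" | "dense";

interface SketchMetadata {
  authoringTier?: string;
  bodyLen?: number;
  userOverride?: "pinned" | "hidden";
}

// Multipliers per authoring tier, one value per calibration band.
const TIER_WEIGHTS = new Map<string, Record<SketchCalibration, number>>([
  ["user_sent_originated", { sparse: 1.2, normal: 1.5, dense: 2.0 }], // stated in the PR
  ["inbox_newsletter", { sparse: 0.5, normal: 0.4, dense: 0.3 }], // stated in the PR
  ["user_sent_reply", { sparse: 1.1, normal: 1.3, dense: 1.6 }], // assumed placeholder
  ["inbox_personal", { sparse: 1.0, normal: 1.0, dense: 1.0 }], // assumed placeholder
]);

// Replies shorter than this are treated as brief (threshold stated in the PR).
const BRIEF_REPLY_MAX_LEN = 50;

function makeTierWeightSketch(calibration: SketchCalibration) {
  return (metadata: SketchMetadata): number => {
    // A hidden page gets weight 0, which drops it from the results.
    if (metadata.userOverride === "hidden") return 0;

    let tier = metadata.authoringTier ?? "";
    // Brief-reply downweight: a short reply is scored like received personal mail.
    if (
      tier === "user_sent_reply" &&
      typeof metadata.bodyLen === "number" &&
      metadata.bodyLen < BRIEF_REPLY_MAX_LEN
    ) {
      tier = "inbox_personal";
    }

    // Unknown or missing tiers fall back to the identity multiplier.
    const base = TIER_WEIGHTS.get(tier)?.[calibration] ?? 1.0;
    // A pinned page doubles whatever the tier multiplier is.
    return metadata.userOverride === "pinned" ? base * 2 : base;
  };
}
```

Building the function once per calibration band keeps the per-page call cheap, which matters because the weight is evaluated for every candidate hit in a query.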

API surface

  • GET /api/memory-config now returns tierWeighting + tierCalibration alongside the existing fields.
  • POST /api/memory-config/tier-weighting (new). Body: { enabled: boolean, calibration?: 'sparse' | 'normal' | 'dense' }. On enable without an explicit calibration, counts user_sent_* pages over the last 90d and picks the band; falls back to normal on DB failure.
  • GET /api/memory-config/dashboard now includes pages.recent[] (10 newest brain pages) with authoringTier + userOverride projected from metadata. Embedding vectors stripped from the wire.
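
For reference, the POST body shape described above can be expressed as a type plus a small validator. This is a sketch under assumptions: `TierWeightingBody` and `parseTierWeightingBody` are hypothetical names, and whether the real route rejects a malformed body (rather than ignoring bad fields) is not stated in this PR.

```typescript
// Sketch only: the type and function names are illustrative, not the route's code.
type Band = "sparse" | "normal" | "dense";

interface TierWeightingBody {
  enabled: boolean;
  calibration?: Band;
}

// Returns the parsed body, or null when the input does not match the shape.
function parseTierWeightingBody(body: unknown): TierWeightingBody | null {
  if (typeof body !== "object" || body === null) return null;
  const { enabled, calibration } = body as Record<string, unknown>;
  if (typeof enabled !== "boolean") return null;
  // calibration is optional; when omitted the server picks the band itself.
  if (calibration === undefined) return { enabled };
  if (calibration === "sparse" || calibration === "normal" || calibration === "dense") {
    return { enabled, calibration };
  }
  return null;
}
```

A body of `{ "enabled": true }` would leave the band to the auto-computation, while `{ "enabled": true, "calibration": "dense" }` pins it explicitly.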

Web

  • New "Weight what you wrote (Layer 2 — beta)" card on Settings → Memory backend. Single toggle, status + calibration readout, link to #251 ("Memory bootstrap: weight user-sent emails higher than received") for rationale.
  • "What your twin remembers" now leads with a Recent pages indexed table with a tier-badge column: you wrote, you replied, personal, broadcast, newsletter, automated, plus 📌 pinned / hidden for explicit overrides. Color-coded for skimmability.

Tests

  • 16 unit tests in tier-weights.test.ts — all three calibrations, override composition, brief-reply downweight, calibration thresholds.
  • 3 new rrf.test.ts branches — weighting flips ranking, hidden drops the row, raw ranks preserved.
  • 5 e2e cases in tier-weighted-retrieval.test.ts — baseline (newsletter wins or ties); flag-on (authored hits index 0); userOverride: hidden drops the page entirely; brief-reply downweight keeps a tiny SENT from outranking a newsletter; per-user isolation.
  • 5 new memory-config-routes.test.ts cases for the new POST endpoint.
  • pnpm build --concurrency=1 — 35/35 packages.
  • pnpm test — 70/70 turbo tasks green, no regressions.

Notes for reviewers

  • The toggle defaults to off. Until we re-run realistic-retrieval.test.ts against a real production corpus with measurable recall@5 improvement, Layer 2 is opt-in. If you flip it on for yourself, the calibration band auto-recomputes from your last-90-day writing volume — sparse writers get conservative weights so we don't amplify a signal that isn't there.
  • userOverride schema only; no UI in this PR. Pinning / hiding individual pages will land with the tier-aware exclude sub-issue, which is the gate for flipping the flag on by default.
  • No retrieval behaviour change when the flag is off. RRF is unchanged. The new code path is purely additive — when tier_weighting = false (the default), searchSemantic runs the same code it ran before this PR.
  • Calibration thresholds are educated guesses (<100 = sparse, >1000 = dense, otherwise normal). The eval gate that authorizes default-on will also re-tune these against the labeled set.
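
The thresholds in the last note amount to a three-way cut on the 90-day count of user_sent_* pages. A minimal sketch, assuming hypothetical names (`pickCalibrationSketch`, `calibrationOrDefault`) and a synchronous counter standing in for the real database call:

```typescript
// Sketch only: names are illustrative; the thresholds are the ones quoted above.
type CalibrationBand = "sparse" | "normal" | "dense";

function pickCalibrationSketch(userSentCount90d: number): CalibrationBand {
  if (userSentCount90d < 100) return "sparse";
  if (userSentCount90d > 1000) return "dense";
  return "normal";
}

// The endpoint falls back to "normal" when the count cannot be read.
// The counter is synchronous here purely to keep the sketch short.
function calibrationOrDefault(countUserSent: () => number): CalibrationBand {
  try {
    return pickCalibrationSketch(countUserSent());
  } catch {
    return "normal";
  }
}
```

Note that both boundaries are exclusive: exactly 100 or exactly 1000 pages lands in the normal band.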

Deferred to follow-ups

  • relationshipTier axis (bidirectional thread count) — separate sub-issue.
  • Migration backfill of authoringTier for pages indexed pre-Layer 1.
  • Tier-aware exclude UI (privacy sub-issue; blocks Layer 2 default-on).
  • Layer 2 default-on rollout — eval-gated.

🤖 Generated with Claude Code

Ships the second slice of #251 on top of PR #252's Layer 1 work. With this
landing, gbrain can rank user-authored content above received noise on
semantic-search results — gated behind a per-user toggle that's off by
default until labeled-retrieval evals confirm recall@5 improvement.

Engine:
- Migration 043 adds `tier_weighting` (bool, default false) and
  `tier_calibration` (sparse/normal/dense, default normal) to
  `brain_settings`. `parseSettingsRow` defaults the new columns so a
  pre-043 install reads safely.
- `tier-weights.ts` (new) — multiplier table with three calibration
  bands. Composes orthogonally with `metadata.userOverride` (pinned
  doubles, hidden drops the row from results) and includes a brief-reply
  downweight: short authored pages get `inbox_personal` weight so "k"
  replies can't outrank strategy emails.
- `rrf.ts` takes an optional `tierWeight(metadata) => number` callback.
  Identity by default; multiplies into rrfScore pre-sort when set.
  `textRank`/`vectorRank` preserved for observability.
- `embedded-port.searchInternal` reads `brain_settings.tier_weighting`
  per query, builds the weight fn from the calibration band, and passes
  it through to `hybridSearch`. Best-effort: DB error falls back to
  pure RRF, never blocks the search.
- `buildPageMetadata` stamps `bodyLen` on every page so the brief-reply
  downweight has something to read.
- `countUserSentPages` (new): SQL count for the calibration auto-recompute.
- `getRecentPages` (new): newest-first page listing for the dashboard.
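
To make the `tierWeight` hook concrete, the following is a minimal sketch of reciprocal rank fusion with an optional per-page multiplier applied before the sort. It is not the code in `rrf.ts`: `rrfFoldSketch`, the `SketchPage`/`SketchHit` types, and the constant `RRF_K = 60` are assumptions made for illustration.

```typescript
// Sketch only: types, function name, and RRF_K are assumptions.
interface SketchPage {
  id: string;
  metadata: Record<string, unknown>;
}

interface SketchHit {
  page: SketchPage;
  rrfScore: number;
  textRank?: number;
  vectorRank?: number;
}

type TierWeightFn = (metadata: Record<string, unknown>) => number;

const RRF_K = 60; // commonly used RRF constant; assumed here

function rrfFoldSketch(
  textHits: SketchPage[],
  vectorHits: SketchPage[],
  tierWeight?: TierWeightFn,
): SketchHit[] {
  const byId = new Map<string, SketchHit>();

  // Accumulate 1 / (k + rank) per list and remember the raw rank.
  const add = (page: SketchPage, rank: number, kind: "textRank" | "vectorRank") => {
    const hit: SketchHit = byId.get(page.id) ?? { page, rrfScore: 0 };
    hit.rrfScore += 1 / (RRF_K + rank);
    hit[kind] = rank;
    byId.set(page.id, hit);
  };
  textHits.forEach((page, i) => add(page, i + 1, "textRank"));
  vectorHits.forEach((page, i) => add(page, i + 1, "vectorRank"));

  const entries = Array.from(byId.values());
  // Identity when no callback is given; otherwise scale the fused score only.
  if (tierWeight) {
    for (const hit of entries) {
      hit.rrfScore *= tierWeight(hit.page.metadata);
    }
  }
  // A zero score (for example a hidden page) is dropped; then sort descending.
  return entries
    .filter((hit) => hit.rrfScore > 0)
    .sort((a, b) => b.rrfScore - a.rrfScore);
}
```

Because only `rrfScore` is scaled, `textRank` and `vectorRank` still report where each page sat in the unweighted lists, which is what keeps the weighting observable after the fact.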

API:
- `GET /api/memory-config` surfaces `tierWeighting` + `tierCalibration`.
- `POST /api/memory-config/tier-weighting` toggles the flag and
  auto-computes the calibration band from sent volume in the last 90d.
  Body accepts explicit `calibration` to override.
- `GET /api/memory-config/dashboard` returns `pages.recent[]` with tier
  + override projected from metadata. Embeddings stripped from the wire.

Web:
- New "Weight what you wrote (Layer 2 — beta)" card on Settings →
  Memory backend. Single toggle, status + calibration readout, link to
  #251 for the rationale.
- "What your twin remembers" gains a Recent pages table with a
  tier-badge column (`you wrote` / `personal` / `newsletter` / etc.)
  plus pinned/hidden override badges.

Tests:
- 16 unit cases in `tier-weights.test.ts` — multipliers, override
  composition, brief-reply downweight, calibration thresholds.
- 3 new rrf.test.ts branches — flips ranking, drops hidden, preserves
  raw ranks under weighting.
- 5 e2e cases in `tier-weighted-retrieval.test.ts` — baseline,
  weighting flips order, hidden disappears, brief-reply downweight,
  per-user isolation.
- 5 new memory-config-routes.test.ts cases for the new POST endpoint.

Deferred (separate sub-issues):
- `relationshipTier` axis (bidirectional-thread-count based).
- Backfill of `authoringTier` for pre-Layer-1 pages.
- Tier-aware exclude UI (privacy sub-issue; will block default-on).
- Layer 2 default-on rollout — eval-gated.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 12, 2026 05:34

Copilot AI left a comment

Pull request overview

Implements “Layer 2” authoring-tier-weighted retrieval for gbrain, gated behind a per-user opt-in toggle and backed by new brain_settings fields + UI/dashboard surfacing.

Changes:

  • Add tier-weight multipliers (with sparse/normal/dense calibration), including userOverride composition and brief-reply downweighting.
  • Thread an optional tierWeight(metadata) => number callback through RRF/hybrid search and enable it per-user via brain_settings.
  • Expose and control the feature via API routes + web settings UI, and add dashboard “recent pages” with tier badges; add unit/e2e tests.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| packages/memory-gbrain/src/embedded-port.ts | Stamps bodyLen into page metadata; resolves per-user tier weighting settings and passes a weight fn into hybrid search. |
| packages/memory-gbrain/src/tests/tier-weighted-retrieval.test.ts | New end-to-end-ish tests asserting ranking changes with tier weighting and override behavior. |
| packages/memory-gbrain/src/tests/embedded-port.test.ts | Updates expectations to account for always-present bodyLen in signal-derived page metadata. |
| packages/memory-gbrain-crdb-adapter/src/types.ts | Adds TierCalibration type and new required settings fields on BrainSettingsRow. |
| packages/memory-gbrain-crdb-adapter/src/tier-weights.ts | New tier multiplier tables + brief-reply heuristic + calibration helper. |
| packages/memory-gbrain-crdb-adapter/src/rrf.ts | Adds optional tierWeight hook to adjust rrfScore post-fold and optionally drop hits. |
| packages/memory-gbrain-crdb-adapter/src/repository.ts | Persists new settings fields; adds countUserSentPages and getRecentPages; threads tierWeight into rrfFold. |
| packages/memory-gbrain-crdb-adapter/src/index.ts | Re-exports new tier-weighting APIs and repository helpers. |
| packages/memory-gbrain-crdb-adapter/src/in-memory-repository.ts | Mirrors tierWeight support + calibration helpers for tests/in-memory mode. |
| packages/memory-gbrain-crdb-adapter/src/tests/tier-weights.test.ts | New unit tests for multipliers/calibration/overrides/brief-reply logic. |
| packages/memory-gbrain-crdb-adapter/src/tests/rrf.test.ts | New tests covering tierWeight ranking flips, hidden drops, and preserving raw ranks. |
| packages/db/src/migrations/043-brain-tier-weighting.sql | Migration adding tier_weighting + tier_calibration to brain_settings. |
| apps/api/src/routes/memory-config.ts | Adds GET fields, new POST toggle endpoint, and dashboard pages.recent[] projection. |
| apps/api/src/tests/memory-config-routes.test.ts | Adds route tests for the new endpoint and recent-pages dashboard behavior. |
| apps/web/public/js/pages/memory-settings.js | Adds UI toggle card and dashboard recent-pages table + tier badge rendering. |
| CHANGELOG.md | Documents the feature, toggle semantics, and test coverage. |

Comment on lines +83 to +88
if (options.tierWeight) {
  const weight = options.tierWeight;
  for (const hit of entries) {
    const mult = weight(hit.page.metadata);
    hit.rrfScore *= mult;
  }

Owner Author

Addressed in ee02626. Non-finite or non-number returns coerce to 1.0 (identity); negatives clamp to 0 (matches the userOverride:'hidden' drop semantics). New rrf.test.ts case verifies NaN/undefined survive as identity and a negative weight drops the page, with all surviving scores finite + positive.
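
The coercion rule described in this reply is small enough to show in full. A minimal sketch, assuming a hypothetical helper name (`sanitizeTierWeight`); the actual change in ee02626 may be inlined differently:

```typescript
// Sketch only: the helper name is illustrative.
// Non-numbers and non-finite values become the identity multiplier;
// negatives clamp to 0, which matches the "hidden" drop semantics.
function sanitizeTierWeight(raw: unknown): number {
  if (typeof raw !== "number" || !Number.isFinite(raw)) return 1.0;
  return raw < 0 ? 0 : raw;
}
```

With this in place, a callback that returns `NaN` or `undefined` leaves the fused score untouched, and one that returns a negative number drops the page instead of producing a negative score.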

}
showSavedToast(next ? 'Tier weighting enabled' : 'Tier weighting disabled');
const container = document.getElementById('page-content');
if (container) await renderMemorySettings(container, getCurrentUserId());

Owner Author

Addressed in ee02626. Added a catch after the if (!res.ok) branch that calls showErrorToast('Failed to update tier weighting'), matching the dismiss-notification pattern. Offline / DNS / fetch-throws now surface to the user instead of silently re-enabling the button.

Comment on lines +26 to +46
function makeSignal(
  id: string,
  source: string,
  type: string,
  tier: string,
  subject: string,
  body: string,
  bodyLen?: number,
): RawSignal {
  return {
    id,
    source,
    type,
    data: {
      messageId: id,
      from: 'someone@example.com',
      subject,
      text: body,
      authoringTier: tier,
      ...(bodyLen !== undefined ? { bodyLen } : {}),
    },

Owner Author

Addressed in ee02626. Removed the misleading parameter from makeSignal and added a comment explaining that bodyLen is derived from page content at write time — so to exercise the brief-reply downweight, tests vary the actual body string (e.g. "k").
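
To show why a caller-supplied `bodyLen` had no effect, here is a minimal sketch of metadata being stamped from the content itself. The function name `buildPageMetadataSketch`, its signature, and the choice of `content.length` as the measure are assumptions; the real `buildPageMetadata` may count length differently.

```typescript
// Sketch only: name, signature, and length measure are assumptions.
interface StampedMetadata {
  authoringTier?: string;
  bodyLen: number;
}

function buildPageMetadataSketch(content: string, authoringTier?: string): StampedMetadata {
  return {
    ...(authoringTier !== undefined ? { authoringTier } : {}),
    // bodyLen always comes from the content, never from caller-supplied data.
    bodyLen: content.length,
  };
}
```

Under these assumptions, a test that wants a brief reply passes a body such as "k" (bodyLen 1, below the 50 threshold) rather than a separate length argument.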

Three findings from Copilot's review of the Layer 2 PR, all real:

1. **rrf.ts:88** — tierWeight was trusted to return a finite, sane number.
   A misbehaving callback returning NaN/Infinity could poison rrfScore;
   undefined would propagate through; a negative would be dropped by the >0
   filter only by accident. Coerce non-numbers and non-finite values to 1.0
   (identity); clamp negatives to 0 (matches the userOverride:'hidden'
   "drop the page" semantics). New rrf.test.ts case covers the path.

2. **memory-settings.js:128** — the tier-weighting toggle handler caught
   non-ok responses but didn't catch network exceptions thrown by
   fetch()/api(). Offline / DNS / similar would silently re-enable the
   button with no user feedback. Added the same `showErrorToast` catch
   the dismiss-notification path uses.

3. **tier-weighted-retrieval.test.ts:46** — makeSignal accepted a
   `bodyLen` parameter that was passed through to RawSignal.data, but
   the embedded port derives bodyLen from the summarised content, not
   from data. The parameter was misleading — implied an unused input
   influenced weighting. Removed the parameter; tests now vary the
   actual body string to exercise the brief-reply downweight.

All four affected files build and test green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>