feat(#251 Layer 1 + Layer 3 minimal): stamp authoring tier on email signals#252

Merged
jayzalowitz merged 3 commits into main from jayzalowitz/issue-251-layer1-authoring-tier
May 12, 2026

Conversation

@jayzalowitz

Owner

Summary

First slice of #251. Every inbound Gmail signal is now classified into one of six AuthoringTier buckets at write time and stamped onto RawSignal.data.authoringTier. The embedded gbrain port projects that field onto brain_pages.metadata so the downstream retrieval layer can read it without joining back to the signal row.

Closes (partial): #251 Layer 1 + minimal Layer 3. Layer 2 (retrieval weighting) is intentionally NOT in this PR — that one is gated on realistic-retrieval.test.ts improving and ships separately.

What changed

  • packages/connectors/src/authoring-tier.ts (NEW) — AuthoringTier type + classifyEmailAuthoringTier() classifier. Six tiers: user_sent_originated, user_sent_reply, inbox_personal, inbox_broadcast, inbox_newsletter, inbox_automated. Field name is channel-agnostic so Slack/Notion connectors can extend the enum later without a schema change.
  • packages/connectors/src/gmail-connector.ts
    • Adds To, Cc, In-Reply-To, List-Unsubscribe to the metadataHeaders= URL on messages/<id>?format=metadata.
    • messageToSignal() calls the classifier and stamps data.authoringTier on every emitted RawSignal.
    • Layer 3 minimal: bootstrapAndEmit() now lists in:sent newer_than:7d BEFORE is:unread newer_than:1d and dedupes by id. First-impression brain pages lead with what the user wrote rather than what happened to be unread. Either list failing degrades gracefully to the other.
  • packages/memory-gbrain/src/embedded-port.ts: recordSignal() reads data.authoringTier and projects it onto brain_pages.metadata.authoringTier when present. It falls back to the prior { signalSource, signalType } shape when the field is missing or non-string (defensive against future connectors).
  • CHANGELOG entry explaining what shipped and what's deferred.
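
The rule ordering described above (SENT dominance, newsletter beating automated, To + Cc summed for the broadcast check) can be sketched roughly as follows. This is an illustration, not the code in authoring-tier.ts: the input shape, the name classifyEmailTierSketch, the threshold value, and the sender regex are all assumptions, since the PR description does not state them.

```typescript
// Six tiers named in the PR description.
type AuthoringTier =
  | "user_sent_originated"
  | "user_sent_reply"
  | "inbox_personal"
  | "inbox_broadcast"
  | "inbox_newsletter"
  | "inbox_automated";

// Hypothetical input shape; the real classifier's signature is not shown in the PR.
interface EmailTierInput {
  labels: string[];
  from: string;
  to: string[];
  cc: string[];
  inReplyTo?: string;
  listUnsubscribe?: string;
}

// Assumed value: the PR says To + Cc are summed against a threshold but never states it.
const BROADCAST_RECIPIENT_THRESHOLD = 5;

// Assumed pattern: the PR only says a "noreply sender" counts as automated.
const AUTOMATED_SENDER = /(^|[<\s])(no-?reply|do-?not-?reply)@/i;

function classifyEmailTierSketch(input: EmailTierInput): AuthoringTier {
  // SENT dominates every inbox rule.
  if (input.labels.includes("SENT")) {
    return input.inReplyTo ? "user_sent_reply" : "user_sent_originated";
  }
  // Newsletter is checked before automated so it wins when both fire.
  if (input.listUnsubscribe) return "inbox_newsletter";
  if (AUTOMATED_SENDER.test(input.from)) return "inbox_automated";
  // To and Cc recipients are summed for the broadcast check.
  if (input.to.length + input.cc.length >= BROADCAST_RECIPIENT_THRESHOLD) {
    return "inbox_broadcast";
  }
  return "inbox_personal";
}
```

Keeping the function free of I/O, as the review summary notes for the real classifier, is what would let a backfill job re-run the same rules on stored headers.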

What this PR deliberately does NOT do

  • Layer 2 retrieval weighting. The RRF fold and DecisionMaker pattern boost still read pages uniformly. Layer 2 lands as a separate PR after running labeled-relevant-doc evals — if R@5 doesn't improve, the weights don't ship.
  • Migration backfill. Tier is only stamped on signals written after this lands. A backfill migration is cheap to write (re-read Gmail label + From header from stored signal rows) but is a separate concern from the live ingest path.

Test plan

  • 18 classifier unit tests in authoring-tier.test.ts covering all 6 tiers, edge cases (empty input, malformed addresses, SENT-dominance, newsletter-beats-automated when both fire, To+Cc summed for broadcast threshold).
  • 6 new tests in gmail-connector.test.ts covering the end-to-end tier stamping path (SENT originated, SENT reply, newsletter via List-Unsubscribe, automated via noreply sender, broadcast via multi-recipient To, personal default).
  • 1 new test for sent-first bootstrap ordering (bootstrap emits in:sent results before is:unread, deduped by id).
  • 3 new tests for embedded-port metadata projection (string tier projected, missing tier omitted, non-string tier defensively dropped).
  • Full workspace test run: 70 tasks, all green.
  • Workspace build clean (pnpm build --concurrency=1).
  • No lint errors (tsc --noEmit in affected packages).

Notes for reviewers

  • The classifier is intentionally email-shaped today but uses channel-agnostic naming (AuthoringTier, authoringTier) so a future cross-channel reshape (authored_originated / received_personal) is a values change, not a schema change.
  • The bootstrap dual-list adds one extra Gmail API call on first poll only; subsequent polls go through history.list exactly as before. Quota cost is two messages?q= calls + N messages/<id> detail fetches vs. one + N previously — negligible.
  • No retrieval behavior change. brain_pages.metadata is JSONB; the new field is purely additive.
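
To make the "purely additive" point concrete, here is a minimal sketch of the projection, assuming a signal object with source, type, and data fields. Those field names and the helper name buildPageMetadataSketch are illustrative assumptions; only signalSource, signalType, and authoringTier come from the description above.

```typescript
// Hypothetical shapes; only the metadata field names come from the PR description.
interface RawSignalLike {
  source: string;
  type: string;
  data: Record<string, unknown>;
}

interface PageMetadata {
  signalSource: string;
  signalType: string;
  authoringTier?: string;
}

function buildPageMetadataSketch(signal: RawSignalLike): PageMetadata {
  const base: PageMetadata = { signalSource: signal.source, signalType: signal.type };
  const tier = signal.data["authoringTier"];
  // Only a string tier is projected; missing or non-string values keep the prior shape.
  return typeof tier === "string" ? { ...base, authoringTier: tier } : base;
}
```

Because the key is omitted rather than set to null when the tier is absent, existing readers of brain_pages.metadata would see exactly the shape they saw before.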

🤖 Generated with Claude Code

…ignals

Every inbound Gmail signal is now classified into one of six AuthoringTier
buckets at write time and stamped onto `RawSignal.data.authoringTier`. The
embedded gbrain port projects that field onto `brain_pages.metadata` so the
downstream retrieval layer can read it without joining back to the signal row.

Why this exists (issue #251): emails the user *sent* are categorically
higher-signal for memory bootstrap than emails they *received* — authorship
implies endorsement + intent, while inbox mail is whatever others push at
them. Layer 1 only labels the data; Layer 2 retrieval weighting is a
separate PR gated on `realistic-retrieval.test.ts` improving.

Layer 3 minimal: the first poll for a new user now lists `in:sent newer_than:7d`
before `is:unread newer_than:1d` and dedupes by id, so the first brain pages
a user sees lead with things they wrote rather than the inbox noise that
happened to be unread.
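
The sent-first ordering with id dedupe might look like the sketch below. The lister type and both function names are hypothetical; the two query strings are the ones quoted above, and error handling is deliberately left to the lister.

```typescript
// Hypothetical lister: resolves to message ids for one Gmail search query.
type ListIds = (query: string) => Promise<string[]>;

// Concatenate sent-first, keeping only the first occurrence of each id.
function mergeSentFirst(sent: string[], unread: string[]): string[] {
  const seen = new Set<string>();
  const ordered: string[] = [];
  for (const id of sent.concat(unread)) {
    if (!seen.has(id)) {
      seen.add(id);
      ordered.push(id);
    }
  }
  return ordered;
}

// Error handling is left to the lister in this sketch.
async function bootstrapIdsSketch(listIds: ListIds): Promise<string[]> {
  const sent = await listIds("in:sent newer_than:7d");
  const unread = await listIds("is:unread newer_than:1d");
  return mergeSentFirst(sent, unread);
}
```

An id that appears in both listings keeps its earlier, sent-side position, which is what puts user-authored mail at the front of the first batch.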

Six tiers: `user_sent_originated`, `user_sent_reply`, `inbox_personal`,
`inbox_broadcast`, `inbox_newsletter`, `inbox_automated`. Field name is
channel-agnostic on purpose so Slack/Notion connectors can extend the enum
without rebuilding the memory schema.

Tests: 18 classifier unit tests, 6 Gmail signal-pipeline tier-stamping tests,
1 sent-first bootstrap ordering test, 3 embedded-port projection tests. Full
workspace test run: 70 tasks, all green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 11, 2026 10:51

Copilot AI left a comment

Pull request overview

Implements the first slice of issue #251 by stamping an authoringTier classification onto Gmail-derived RawSignal.data at ingest time, and projecting that value into brain_pages.metadata so future retrieval logic can consume it without joining back to the raw signal row. It also tweaks first-run Gmail bootstrap to prioritize sent mail for better “first-impression” memory pages.

Changes:

  • Add an AuthoringTier type plus a side-effect-free email classifier (classifyEmailAuthoringTier) with unit tests.
  • Stamp data.authoringTier in GmailConnector.messageToSignal() and fetch additional Gmail headers required for classification.
  • Project data.authoringTier into brain_pages.metadata.authoringTier in the embedded gbrain memory port, and adjust bootstrap ordering to process sent mail first with id dedupe.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| packages/memory-gbrain/src/embedded-port.ts | Projects data.authoringTier into brain_pages.metadata via a helper builder. |
| packages/memory-gbrain/src/tests/embedded-port.test.ts | Adds coverage for metadata projection behavior (present/missing/invalid tier). |
| packages/connectors/src/index.ts | Re-exports authoring-tier classifier utilities and types from connectors package. |
| packages/connectors/src/gmail-connector.ts | Fetches extra metadata headers, stamps authoringTier, and changes first-run bootstrap to sent-first + dedupe with a new list helper. |
| packages/connectors/src/authoring-tier.ts | Introduces AuthoringTier and email-tier classification logic + helper functions. |
| packages/connectors/src/tests/gmail-connector.test.ts | Adds end-to-end tests for tier stamping and sent-first bootstrap ordering/dedupe. |
| packages/connectors/src/tests/authoring-tier.test.ts | Adds unit tests for address parsing, automated-sender detection, and tier classification. |
| CHANGELOG.md | Documents Layer 1 tier stamping + minimal Layer 3 bootstrap ordering, and notes deferred Layer 2/backfill. |

Comment on lines +190 to +194
console.warn(
  `[gmail] Bootstrap list failed (${url}):`,
  err instanceof Error ? err.message : String(err),
);
return [];
Comment on lines +68 to +70
* structurally do not host humans — only system-generated mail. Anchored to
* the end of the host so subdomains of legitimate human mail providers don't
* false-match.

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comment on lines +178 to +196
/**
* Helper: GET a `users/me/messages?q=...` listing and return the message
* ids, swallowing network errors as an empty list (Layer 3 bootstrap should
* never hard-fail just because one of the two list queries returned 500 —
* the other batch can still serve the first-impression need).
*/
private async listMessageIds(url: string, accessToken: string): Promise<string[]> {
  try {
    const resp = await this.gmailGet(url, accessToken, 'list');
    const body = await resp.json() as { messages?: Array<{ id: string }> };
    return (body.messages ?? []).map((m) => m.id);
  } catch (err) {
    console.warn(
      `[gmail] Bootstrap list failed (${url}):`,
      err instanceof Error ? err.message : String(err),
    );
    return [];
  }
}
Comment on lines +80 to +82
// match already catches that. The subdomain `noreply.` catch covers the
// long tail of `noreply.<vendor>.com` aliases used by SaaS apps.
/(^|\.)noreply\./i,
jayzalowitz and others added 2 commits May 11, 2026 19:20
…x anchoring docstring

Copilot's review of PR #252 caught two issues:

1. listMessageIds() swallowed all errors and returned [] — including
   persistent 401/403 auth failures, which silently bootstrapped zero
   signals forever. Now only RetryableHttpError (rate-limit / 5xx
   after retries) degrades to []; everything else propagates so the
   worker surfaces a real failure.

2. AUTOMATED_DOMAIN_PATTERNS docstring claimed all patterns were
   end-anchored to the apex domain, but noreply\. is intentionally
   not end-anchored (it catches noreply.<vendor>.com). Doc updated
   to describe the deliberate asymmetry.
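
The anchoring asymmetry in item 2 can be illustrated with two patterns. NOREPLY_LABEL is the regex quoted in the review excerpt; END_ANCHORED uses a made-up domain (mailer.example) and stands in for the end-anchored entries, whose actual values are not shown in this thread.

```typescript
// Hypothetical end-anchored entry: matches only when the host ends with the domain.
const END_ANCHORED = /(^|\.)mailer\.example$/i;

// Quoted from the diff excerpt: a `noreply.` label anywhere in the host, not end-anchored.
const NOREPLY_LABEL = /(^|\.)noreply\./i;

// Illustrative helper (assumed name): true when either pattern matches the host.
function hostLooksAutomated(host: string): boolean {
  return END_ANCHORED.test(host) || NOREPLY_LABEL.test(host);
}
```

With these patterns, bounce.mailer.example and noreply.vendor.com match, while mailer.example.evil.com and notnoreply.vendor.com do not.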

Test plan: connectors 108/108 green. Build clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…storyId

Copilot's round-2 review of PR #252 caught that fetchProfileHistoryId
still swallowed all errors via catch { return null }. Combined with
the new symmetric behavior in listMessageIds, a persistent auth
failure on /users/me/profile would silently no-op the bootstrap and
the worker would loop without ever surfacing the broken token.

Now mirrors listMessageIds: RetryableHttpError degrades to null (the
caller writes no cursor and re-polls), everything else propagates so
the failure is visible.
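
The symmetric policy described for listMessageIds and fetchProfileHistoryId could be factored roughly as below. The RetryableHttpError class here is a local stand-in (its real definition is not shown in this PR), and isTransient and degradeOnRetryable are hypothetical helper names.

```typescript
// Stand-in for the connector's RetryableHttpError; the real class is not shown in the PR.
class RetryableHttpError extends Error {
  constructor(message: string) {
    super(message);
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, RetryableHttpError.prototype);
  }
}

// Only rate-limit / 5xx-after-retries style failures count as transient.
function isTransient(err: unknown): boolean {
  return err instanceof RetryableHttpError;
}

// Runs `op`; transient failures degrade to `fallback`, anything else is rethrown.
async function degradeOnRetryable<T>(op: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await op();
  } catch (err) {
    if (isTransient(err)) return fallback;
    throw err;
  }
}
```

Under this shape a list call would pass [] as the fallback and the profile call would pass null, so a persistent 401/403 surfaces as a thrown error in both places.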

Test plan: connectors 108/108 green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@jayzalowitz

Owner Author

Round-2 review reply:

  • listMessageIds error swallowing: Round-1 already changed this to propagate non-transient errors (commit cd9085d). Round-2's deeper point — that fetchProfileHistoryId() ALSO swallowed errors — is now addressed in fd302e2 (symmetric RetryableHttpError propagation).
  • AUTOMATED_DOMAIN_PATTERNS docstring: round-1 already rewrote the docstring (cd9085d) to explicitly call out the deliberate asymmetry — noreply\. is intentionally NOT end-anchored to catch noreply.<vendor>.com subdomain aliases. The current comment block describes this; the round-2 flag appears to be reading the original text.

@jayzalowitz jayzalowitz merged commit 4c7affb into main May 12, 2026
8 checks passed
@jayzalowitz jayzalowitz deleted the jayzalowitz/issue-251-layer1-authoring-tier branch May 12, 2026 04:22
jayzalowitz added a commit that referenced this pull request May 12, 2026
…#259)

* feat(#251 Layer 2 + companion fields): tier-weighted gbrain retrieval

Ships the second slice of #251 on top of PR #252's Layer 1 work. With this
landing, gbrain can rank user-authored content above received noise on
semantic-search results — gated behind a per-user toggle that's off by
default until labeled-retrieval evals confirm recall@5 improvement.

Engine:
- Migration 043 adds `tier_weighting` (bool, default false) and
  `tier_calibration` (sparse/normal/dense, default normal) to
  `brain_settings`. `parseSettingsRow` defaults the new columns so a
  pre-043 install reads safely.
- `tier-weights.ts` (new) — multiplier table with three calibration
  bands. Composes orthogonally with `metadata.userOverride` (pinned
  doubles, hidden drops the row from results) and includes a brief-reply
  downweight: short authored pages get `inbox_personal` weight so "k"
  replies can't outrank strategy emails.
- `rrf.ts` takes an optional `tierWeight(metadata) => number` callback.
  Identity by default; multiplies into rrfScore pre-sort when set.
  `textRank`/`vectorRank` preserved for observability.
- `embedded-port.searchInternal` reads `brain_settings.tier_weighting`
  per query, builds the weight fn from the calibration band, and passes
  it through to `hybridSearch`. Best-effort: DB error falls back to
  pure RRF, never blocks the search.
- `buildPageMetadata` stamps `bodyLen` on every page so the brief-reply
  downweight has something to read.
- `countUserSentPages` (new): SQL count for the calibration auto-recompute.
- `getRecentPages` (new): newest-first page listing for the dashboard.

API:
- `GET /api/memory-config` surfaces `tierWeighting` + `tierCalibration`.
- `POST /api/memory-config/tier-weighting` toggles the flag and
  auto-computes the calibration band from sent volume in the last 90d.
  Body accepts explicit `calibration` to override.
- `GET /api/memory-config/dashboard` returns `pages.recent[]` with tier
  + override projected from metadata. Embeddings stripped from the wire.

Web:
- New "Weight what you wrote (Layer 2 — beta)" card on Settings →
  Memory backend. Single toggle, status + calibration readout, link to
  #251 for the rationale.
- "What your twin remembers" gains a Recent pages table with a
  tier-badge column (`you wrote` / `personal` / `newsletter` / etc.)
  plus pinned/hidden override badges.

Tests:
- 16 unit cases in `tier-weights.test.ts` — multipliers, override
  composition, brief-reply downweight, calibration thresholds.
- 3 new rrf.test.ts branches — flips ranking, drops hidden, preserves
  raw ranks under weighting.
- 5 e2e cases in `tier-weighted-retrieval.test.ts` — baseline,
  weighting flips order, hidden disappears, brief-reply downweight,
  per-user isolation.
- 5 new memory-config-routes.test.ts cases for the new POST endpoint.

Deferred (separate sub-issues):
- `relationshipTier` axis (bidirectional-thread-count based).
- Backfill of `authoringTier` for pre-Layer-1 pages.
- Tier-aware exclude UI (privacy sub-issue; will block default-on).
- Layer 2 default-on rollout — eval-gated.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(#251 post-/review): address Copilot findings on PR #259

Three findings from Copilot's review of the Layer 2 PR, all real:

1. **rrf.ts:88** — tierWeight was trusted to return a finite, sane number.
   A misbehaving callback returning NaN/Infinity could poison rrfScore;
   undefined would propagate through; negative would survive the >0 filter
   only by accident. Coerce non-numbers and non-finite values to 1.0
   (identity); clamp negatives to 0 (matches the userOverride:'hidden'
   "drop the page" semantics). New rrf.test.ts case covers the path.

2. **memory-settings.js:128** — the tier-weighting toggle handler caught
   non-ok responses but didn't catch network exceptions thrown by
   fetch()/api(). Offline / DNS / similar would silently re-enable the
   button with no user feedback. Added the same `showErrorToast` catch
   the dismiss-notification path uses.

3. **tier-weighted-retrieval.test.ts:46** — makeSignal accepted a
   `bodyLen` parameter that was passed through to RawSignal.data, but
   the embedded port derives bodyLen from the summarised content, not
   from data. The parameter was misleading — implied an unused input
   influenced weighting. Removed the parameter; tests now vary the
   actual body string to exercise the brief-reply downweight.
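
The coercion rules from finding 1 fit in a few lines. The function name sanitizeTierWeight is an assumption; the behavior follows the description above (non-numbers and non-finite values become 1.0, negatives clamp to 0).

```typescript
// Coerce an untrusted callback result into a safe multiplier.
function sanitizeTierWeight(raw: unknown): number {
  // Non-numbers, NaN, and +/-Infinity fall back to the identity weight.
  if (typeof raw !== "number" || !isFinite(raw)) return 1;
  // Negatives clamp to 0, matching the "drop the page" semantics of a hidden override.
  return raw < 0 ? 0 : raw;
}
```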

All four affected files build and test green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
jayzalowitz added a commit that referenced this pull request May 12, 2026
* feat(#251 follow-up): authoring-tier backfill worker

Pages indexed before Layer 1 of #251 had no `authoringTier` on metadata,
which silently disabled Layer 2 for their corpora. This adds a worker
that fills in the tier retroactively, plus the connector now persists
the raw classification headers so reclassification works going forward.

Engine:
- New adapter helper `findPagesMissingAuthoringTier(userId|null, limit)`
  joins brain_pages ↔ brain_signals via `source_ref = id`, filters on
  pages where `metadata->>'authoringTier' IS NULL`, optional user scope.
- `apps/worker/src/jobs/tier-backfill.ts`: the job. Two reclassification
  paths:
    1. Trust the signal — copy `signal.data.authoringTier` to page
       metadata when it exists (post-#252 paths that bypassed the
       metadata projection for any reason).
    2. Reclassify — run the classifier locally on the raw `to` / `cc` /
       `inReplyTo` / `listUnsubscribe` / `listId` / `labels` headers.
  Pages whose signal carries neither path are counted as
  "unreclassifiable" and left alone — pre-Layer-1 signals that don't
  preserve classification headers need a Gmail re-fetch (separate
  sub-issue, lower priority).
- Gmail connector `messageToSignal` now also stamps `to`, `cc`,
  `inReplyTo`, `listUnsubscribe` on `signal.data` so future
  reclassification has source data. No behavior change to the existing
  classifier path; just preserves raw inputs.
- In-memory adapter mirror for tests.
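
The two reclassification paths and the "unreclassifiable" outcome can be sketched as a single decision function. The Outcome type, the name resolveTierSketch, and the rule that any one preserved header is enough to attempt reclassification are assumptions; the header names are the ones listed above.

```typescript
type Outcome =
  | { kind: "copiedFromSignal"; tier: string }
  | { kind: "reclassified"; tier: string }
  | { kind: "unreclassifiable" };

// Header keys the commit message lists as classifier inputs.
const HEADER_KEYS = ["to", "cc", "inReplyTo", "listUnsubscribe", "listId", "labels"];

function resolveTierSketch(
  data: Record<string, unknown>,
  classify: (data: Record<string, unknown>) => string, // hypothetical classifier hook
): Outcome {
  // Path 1: trust a tier the signal already carries.
  const stamped = data["authoringTier"];
  if (typeof stamped === "string") return { kind: "copiedFromSignal", tier: stamped };
  // Path 2 (assumed criterion): reclassify when at least one raw header was preserved.
  if (HEADER_KEYS.some((key) => data[key] !== undefined && data[key] !== null)) {
    return { kind: "reclassified", tier: classify(data) };
  }
  // Neither path applies: leave the page alone.
  return { kind: "unreclassifiable" };
}
```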

Scheduling:
- Worker runs the job hourly (`TIER_BACKFILL_INTERVAL_MS = 60 * 60 *
  1000`). Idempotent: once a corpus is fully tagged the find query
  returns 0 rows and the pass becomes a no-op.
- Batch size 200 per pass, plenty for any reasonable mailbox to
  converge over a few hours.

Tests: 9 worker, 4 adapter. All green. 70/70 turbo tasks.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(#251 post-/review): address Copilot findings on PR #271

Three findings on the backfill worker, all valid:

1. findPagesMissingAuthoringTier did a bare `JSON.parse(row.signal_data)`
   when the driver returned JSONB as a string. One malformed signal row
   would have thrown and tanked the whole worker pass. Switched to the
   file-local `parseJson` helper (the same one parsePageRow / parseSettingsRow
   / etc. use) — returns null on parse failure; coerce to {} so the
   worker logs the row as "unreclassifiable" and keeps going.

2. Doc comment claimed "a thousand pages per cycle is the default in the
   worker" but the actual default is 200. Updated.

3. The worker was discarding updatePageMetadata's affected-row count.
   A 0 return (page disappeared between find + update, or ownership
   mismatch) was getting counted as a successful copy/reclass — silent
   data lie. Now treated as failed: incremented `summary.failed`,
   logged with pageId/userId, no copiedFromSignal/reclassified bump.
   New unit test covers the race path.
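
Finding 1's tolerant parse could be approximated as follows. parseJsonSketch is a hypothetical stand-in for the file-local parseJson helper, whose real implementation is not shown here.

```typescript
// Returns a parsed object, or null when the input is malformed or not an object.
function parseJsonSketch(value: unknown): Record<string, unknown> | null {
  if (typeof value === "string") {
    try {
      const parsed: unknown = JSON.parse(value);
      return typeof parsed === "object" && parsed !== null
        ? (parsed as Record<string, unknown>)
        : null;
    } catch {
      return null; // one malformed row must not abort the whole pass
    }
  }
  // Some drivers hand JSONB back already decoded.
  return typeof value === "object" && value !== null
    ? (value as Record<string, unknown>)
    : null;
}

// Coerce a failed parse to {} so the row is counted as unreclassifiable.
function signalDataOrEmpty(value: unknown): Record<string, unknown> {
  return parseJsonSketch(value) ?? {};
}
```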

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>