security: enforce path-authoritative slug in importFromFile #47

Closed
garagon wants to merge 1 commit into garrytan:master from garagon:security/l001-frontmatter-slug-hijack

garagon (Contributor) commented Apr 11, 2026

Summary

parseMarkdown returns frontmatter.slug when the frontmatter declares one, and falls back to inferSlug(filePath) otherwise. importFromFile then passes that parsed slug verbatim into importFromContent. In a team brain where contributors can land PRs, a poisoned notes/random.md with frontmatter

---
slug: people/elon
type: person
title: Elon Musk
---

silently overwrites the legitimate people/elon page on the next gbrain sync or gbrain import. The sync filter (isSyncable() in src/core/sync.ts) checks only file extensions and never inspects frontmatter, so a reviewer sees a harmless new note in an unfamiliar directory and merges it. The attack is recoverable via page_versions (the import pipeline snapshots the old content before overwriting), but only if the operator notices.
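The precedence described above can be sketched as follows. This is a minimal illustration, not gbrain's actual parser: `parseFrontmatterSlug`, `resolveSlugBeforeFix`, and the body of `inferSlug` are simplified stand-ins, and only the rule "a frontmatter slug wins over the path" is taken from the description.

```typescript
// Hypothetical stand-in for the path fallback: normalise separators, drop ".md".
function inferSlug(filePath: string): string {
  return filePath.replace(/\\/g, "/").replace(/\.md$/i, "");
}

// Simplified frontmatter reader (assumption): finds a top-level "slug:" line
// inside a leading "---" block. Real YAML parsing is more involved.
function parseFrontmatterSlug(content: string): string | undefined {
  const block = /^---\r?\n([\s\S]*?)\r?\n---/.exec(content);
  if (!block) return undefined;
  const line = /^slug:\s*(.+)$/m.exec(block[1] ?? "");
  return line?.[1]?.trim();
}

// The precedence at issue: a declared slug overrides the path-derived one.
function resolveSlugBeforeFix(content: string, filePath: string): string {
  return parseFrontmatterSlug(content) ?? inferSlug(filePath);
}

const poisoned = [
  "---",
  "slug: people/elon",
  "type: person",
  "title: Elon Musk",
  "---",
  "",
  "Body text.",
].join("\n");

// The file lives at notes/random.md, yet the frontmatter decides the slug.
const hijackedSlug = resolveSlugBeforeFix(poisoned, "notes/random.md");
```

With this precedence, the reviewer-visible path and the page that actually gets written can disagree, which is the gap the PR closes.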

The realistic damage isn't data loss — it's silent context poisoning of an LLM that reads the brain through gbrain's MCP tools.

Changes

src/core/import-file.ts

  • Imports slugifyPath from ./sync.ts.
  • Inside importFromFile, after parseMarkdown(content, relativePath), computes expectedSlug = slugifyPath(relativePath) and rejects the import when parsed.slug !== expectedSlug. The rejection returns { status: 'skipped', chunks: 0, error: 'Frontmatter slug "X" does not match path-derived slug "Y" (from path). Remove the frontmatter "slug:" line or move the file.' }, the same ImportResult shape the existing "file too large" path already uses.
  • Passes expectedSlug (not parsed.slug) into importFromContent on the match path too, so a future change to parseMarkdown's precedence rules cannot silently re-introduce this bug.
  • Adds a docblock explaining the slug-authority rule and the threat model.
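A minimal sketch of the check described in the list above, under stated assumptions: `slugifyPath` and `importFromContent` here are simplified stand-ins (the real ones live in gbrain and take more arguments), `importFromFileSketch` is a hypothetical name, the `'imported'` status value is assumed, and "(from path)" is rendered with the relative path for illustration.

```typescript
type ImportResult = { status: "imported" | "skipped"; chunks: number; error?: string };

// Stand-in for slugifyPath from ./sync.ts (assumed behaviour: forward slashes, no ".md").
function slugifyPath(relativePath: string): string {
  return relativePath.replace(/\\/g, "/").replace(/\.md$/i, "");
}

// Stand-in for importFromContent. It records every slug it is asked to write,
// so a caller can confirm that a rejected import touched nothing.
const importCalls: string[] = [];
function importFromContent(slug: string, _content: string): ImportResult {
  importCalls.push(slug);
  return { status: "imported", chunks: 1 };
}

function importFromFileSketch(parsedSlug: string, relativePath: string, content: string): ImportResult {
  const expectedSlug = slugifyPath(relativePath);
  if (parsedSlug !== expectedSlug) {
    // Reject before any write happens.
    return {
      status: "skipped",
      chunks: 0,
      error:
        `Frontmatter slug "${parsedSlug}" does not match path-derived slug "${expectedSlug}" ` +
        `(from ${relativePath}). Remove the frontmatter "slug:" line or move the file.`,
    };
  }
  // Pass the path-derived slug, not the parsed one, even on the match path.
  return importFromContent(expectedSlug, content);
}
```

Passing `expectedSlug` on the match path looks redundant, but it keeps the filesystem path as the single source of truth if the parser's precedence ever changes.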

test/import-file.test.ts

  • rejects frontmatter slug that does not match the file path — a hijack file at notes/random.md with slug: people/elon in its frontmatter is rejected with status: 'skipped', and the error message contains both slugs. Critically, the mock engine sees zero calls — no partial write, no version snapshot of the target page, no tag reconciliation. This is what makes it safe for a caller to retry or ignore.
  • accepts frontmatter slug that matches the file path — a legitimate file whose frontmatter slug happens to equal the path-derived slug (common for brains exported from other tools that write both) still imports cleanly. This pins the "match" branch so a future regression doesn't break legitimate redundant-slug imports.
  • uses path-derived slug when no frontmatter slug is set — pins the authority on the filesystem path even when parseMarkdown's fallback would produce the same answer.

Scope decisions

Why not also fix it at parseMarkdown: parseMarkdown has a second caller path — importFromContent(slug, content) — where the function receives a caller-supplied slug and no filesystem path to compare against. Changing parseMarkdown to ignore frontmatter.slug entirely would make that path lose the ability to round-trip existing pages (the MCP put_page operation). The slug-authority rule only makes sense when there's a disk path to be authoritative against, which is why the check has to live on importFromFile.

Why not also check frontmatter.type: The same hijack primitive exists for type, but type is a classification label, not a primary key: a wrong type renders the page in the wrong category but doesn't overwrite a different page. That is lower severity, so it is left as a separate follow-up to keep this PR focused on the actual overwrite vector.

Why not silently ignore frontmatter.slug instead of rejecting: Silent override would mask user mistakes. If a user has a brain where frontmatter slugs have drifted from the filesystem layout, a clear error message tells them exactly what to fix; silent ignore would leave them wondering why gbrain search doesn't find the page they just synced.

Validation

The PoC at report/evidence/poc-l001-frontmatter-slug-hijack.ts (from the upstream audit, not part of this diff) runs 9 static + runtime checks. It requires all 9 to PASS for the vuln to be confirmed. One regex was widened in this run to also match the indirect const expectedSlug = slugifyPath(relativePath); if (parsed.slug !== expectedSlug) pattern, in addition to the direct inline comparison — otherwise check 4 would have been a false negative against the fix.
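The PoC's regexes are not part of this diff, so the following is only a hypothetical sketch of what a static check covering both the direct and the indirect comparison could look like. The names `directCheck`, `indirectCheck`, and `hasConsistencyCheck` are illustrative.

```typescript
// Direct inline form: parsed.slug !== slugifyPath(relativePath)
const directCheck = /parsed\.slug\s*!==\s*slugifyPath\(\s*relativePath\s*\)/;

// Indirect form: const X = slugifyPath(relativePath); ... parsed.slug !== X
// The backreference \1 ties the comparison to the variable that was assigned.
const indirectCheck =
  /const\s+(\w+)\s*=\s*slugifyPath\(\s*relativePath\s*\)[\s\S]*?parsed\.slug\s*!==\s*\1\b/;

// True when the source text contains either form of the consistency check.
function hasConsistencyCheck(source: string): boolean {
  return directCheck.test(source) || indirectCheck.test(source);
}
```

A "NO consistency check" assertion would then PASS when `hasConsistencyCheck` returns false and FAIL once the fix is present.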

Before this PR:

[PASS] markdown.ts: frontmatter.slug preferred over inferSlug(filePath)
[PASS] markdown.ts: frontmatter.type preferred over inferType(filePath)
[PASS] import-file.ts: importFromFile passes parsed.slug through (not relativePath)
[PASS] import-file.ts: NO consistency check between parsed.slug and relativePath
[PASS] sync.ts: performSync calls importFile(engine, filePath, path, ...) per file
[PASS] sync.ts: NO consistency check between file path and imported slug
[PASS] core/sync.ts: isSyncable() never inspects frontmatter content
[PASS] runtime: parseMarkdown returns frontmatter slug, not path slug
[PASS] runtime: parseMarkdown returns frontmatter type, not path type
vuln_confirmed=1  (exit 1)

After this PR:

[PASS] markdown.ts: frontmatter.slug preferred over inferSlug(filePath)
[PASS] markdown.ts: frontmatter.type preferred over inferType(filePath)
[FAIL] import-file.ts: importFromFile passes parsed.slug through (not relativePath)
[FAIL] import-file.ts: NO consistency check between parsed.slug and relativePath
[PASS] sync.ts: performSync calls importFile(engine, filePath, path, ...) per file
[PASS] sync.ts: NO consistency check between file path and imported slug
[PASS] core/sync.ts: isSyncable() never inspects frontmatter content
[PASS] runtime: parseMarkdown returns frontmatter slug, not path slug
[PASS] runtime: parseMarkdown returns frontmatter type, not path type
vuln_confirmed=0  (exit 0)

Two checks flipped — the ones that specifically measured importFromFile's behavior. The other checks remain PASS because they describe the unchanged parseMarkdown behavior and the unchanged sync.ts flow, which is correct: the fix lives at the importFromFile boundary, not deeper in the stack. Any caller that reaches the brain through importFromFile (CLI import, CLI sync, auto-sync cron) is protected.

Full unit suite:

bun test
 340 pass
 122 skip
 0 fail
 1164 expect() calls

Backwards compatibility

Behavior change for users whose frontmatter slugs drift from the filesystem layout. If a user's brain has a file at notes/a.md with frontmatter slug: people/alice, the import now returns { status: 'skipped', error: ... } instead of silently rewriting the people/alice page. The error message tells them exactly how to fix it: remove the frontmatter line or move the file.

I checked whether frontmatter-slug-as-override is documented anywhere — it isn't. serializeMarkdown (src/core/markdown.ts:99) doesn't emit a slug: line, pathToSlug (src/core/sync.ts:131) is explicitly a path-to-slug converter, and the README / CLAUDE.md describe the filesystem layout as authoritative. Users who had frontmatter slugs that happened to match the path are unaffected (covered by the "accepts frontmatter slug that matches the file path" test).

Test plan

  • bun test unit tests pass
  • bun run test:e2e E2E tests pass against real Postgres + pgvector
  • Create a fresh brain with gbrain init, add two files:
    • people/alice-smith.md — frontmatter slug: people/alice-smith, imports normally
    • notes/random.md — frontmatter slug: people/elon, returns skipped with the mismatch error
  • Verify people/elon does not exist in the brain after the second import
  • Run gbrain sync on a repo that contains both files — same results, sync continues past the rejected file

Not in this PR

  • frontmatter.type mismatch handling (lower-severity variant of the same primitive, separate PR).
  • Warning vs. rejection mode — current choice is hard rejection. If feedback says otherwise, easy to change.
  • A parseMarkdown signature that exposes hadFrontmatterSlug / hadFrontmatterType flags for other callers that want their own authority rules. Not needed for this fix.

This PR is independent of #45 (scope enforcement) and #46 (importFromContent size guard) — it touches a different function and different code path.

parseMarkdown returns `frontmatter.slug` when the frontmatter
declares one, and falls back to `inferSlug(filePath)` otherwise.
importFromFile then passes that parsed slug verbatim into
importFromContent. In a team brain where contributors can land
PRs, a poisoned `notes/random.md` with frontmatter

    slug: people/elon
    type: person
    title: Elon Musk

silently overwrites the legitimate `people/elon` page on the
next `gbrain sync` or `gbrain import`. The existing version
history is preserved, so the attack is recoverable — but only
if the operator notices, which is unlikely given the audit log
limitations tracked separately in F001.

Fix: inside importFromFile, compute the path-derived slug via
`slugifyPath(relativePath)` and reject the import when it does
not match `parsed.slug`. On match, pass the path-derived slug
through to importFromContent explicitly — so a future change
to parseMarkdown's precedence rules cannot silently re-introduce
this bug.

The remote MCP `put_page` path is untouched: it calls
importFromContent directly with a caller-supplied slug, so no
filesystem path exists to compare against. That operation has
its own scope-enforcement concern handled in F001.

Tests in test/import-file.test.ts:

- `rejects frontmatter slug that does not match the file path`:
  a hijack file at `notes/random.md` with `slug: people/elon`
  is rejected with `status: 'skipped'` and an error naming both
  slugs. Engine sees zero calls (no partial write, no version
  snapshot of the target page).
- `accepts frontmatter slug that matches the file path`: a
  legitimate file whose frontmatter slug happens to equal the
  path-derived slug (common for brains exported from other
  tools) still imports cleanly.
- `uses path-derived slug when no frontmatter slug is set`:
  pins the authority on the path even when parseMarkdown's
  fallback would produce the same answer.

Backwards compatibility: users who relied on frontmatter.slug
to override the filesystem path will see their imports rejected
with a clear error message instructing them to either remove
the frontmatter slug line or move the file. This is intentional
— the path-derived slug was always the safer default and
frontmatter override appears nowhere in the documentation.

Refs: report/evidence/poc-l001-frontmatter-slug-hijack.ts

garrytan (Owner) commented

Included in fix wave PR #65 (v0.9.1). Path-authoritative slug enforcement landed. We also added defense-in-depth with an lstatSync check in importFromFile itself (covers the sync path). Thanks garagon! 🙏

garrytan closed this Apr 12, 2026
garrytan added a commit that referenced this pull request May 24, 2026
Implements GstackLearningsSource — the daemon-side IngestionSource
that watches ~/.gstack/projects/{repo}/learnings.jsonl and emits
each new line as a `learning`-typed IngestionEvent.

Closes the v0.40-and-earlier gap where gstack's typed engineering
knowledge base (7 learning types: pattern, pitfall, preference,
architecture, tool, operational, investigation) lived in JSONL files
the brain never queried. After T8 + the engineer-pack manifest
activation, every gstack-logged learning surfaces as a first-class
gbrain page within seconds of being written.

Lifecycle:
  - constructor: discovers JSONL files via
    ~/.gstack/projects/*/learnings.jsonl (cross-project mode, default)
    or just the current project (per-project mode). Test seam:
    _readFile/_existsSync/_skipWatch.
  - start(ctx): seeds seenLines with content_hashes of EVERY existing
    line so first-run-after-install does NOT replay thousands of
    historical lines as fresh emits. Then installs fs.watch handlers
    (one per discovered file) that fire rescanFile on 'change'.
  - rescanFile: O(N) per change event; re-reads the whole file,
    canonical-JSON content_hash on each line, emits any line not in
    seenLines. Malformed JSONL lines skip+warn.
  - stop(): closes all watchers; JSONL state preserved (gstack owns
    the files, gbrain only reads).
  - healthCheck(): reports warn when no files discovered (gstack not
    installed) OR when watched files have disappeared; ok otherwise
    with counter of lines seen.

mode: 'trickle' (the v0.41 T2 default). Line-level content_hash via
canonical-JSON serialization means whitespace reformatting doesn't
trigger re-emit. Re-emit of an identical line is a silent dedup hit
via the daemon's 24h DedupWindow (T2 trickle path).
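
The seed-then-rescan behaviour and the canonical-JSON hash can be sketched as below. This is an illustration under assumptions: SHA-256 and the sorted-key canonical form are guesses (the commit names neither), and `canonicalJson`, `contentHash`, `seedSeenLines`, and the free-function form of `rescanFile` are hypothetical helpers, not the source's actual methods.

```typescript
import { createHash } from "node:crypto";

// Canonical JSON (assumed form): object keys sorted recursively, no extra whitespace.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalJson).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalJson(obj[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Returns null for malformed JSONL so the caller can skip the line.
function contentHash(line: string): string | null {
  try {
    return createHash("sha256").update(canonicalJson(JSON.parse(line))).digest("hex");
  } catch {
    return null;
  }
}

// Walks every line, records unseen hashes, and optionally collects the new lines.
function scanLines(fileText: string, seen: Set<string>, collect: boolean): string[] {
  const fresh: string[] = [];
  for (const raw of fileText.split("\n")) {
    const line = raw.trim();
    if (line === "") continue; // blank lines and trailing newlines are fine
    const hash = contentHash(line);
    if (hash === null || seen.has(hash)) continue; // malformed or already seen
    seen.add(hash);
    if (collect) fresh.push(line);
  }
  return fresh;
}

// start(): mark all existing lines as seen without emitting them.
function seedSeenLines(fileText: string, seen: Set<string>): void {
  scanLines(fileText, seen, false);
}

// On a change event: re-read the whole file, return only lines not seen before.
function rescanFile(fileText: string, seen: Set<string>): string[] {
  return scanLines(fileText, seen, true);
}
```

Because the hash is taken over the canonical form, a line rewritten with different key order or spacing maps to the same hash and is not emitted again.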

Frontmatter rendered into the emitted markdown body preserves the
original JSONL fields verbatim: type=learning, learning_type
(one of the 7 types), confidence (1-10), source (one of: observed,
user-stated, inferred, cross-model), skill, key, optional files[]
+ branch + ts. Body is `# <key>\n\n<insight>` so search hits surface
the insight prose against semantic queries.
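
A sketch of how one JSONL record might be rendered into the page described above. The field names come from the commit message; the `Learning` interface, the `renderLearning` name, and the choice to quote string values with `JSON.stringify` are assumptions made for illustration.

```typescript
interface Learning {
  learning_type: string; // one of the 7 learning types
  confidence: number;    // 1-10
  source: string;        // observed, user-stated, inferred, or cross-model
  skill: string;
  key: string;
  insight: string;
  files?: string[];
  branch?: string;
  ts?: string;
}

function renderLearning(l: Learning): string {
  const fm: string[] = [
    "---",
    "type: learning",
    `learning_type: ${JSON.stringify(l.learning_type)}`,
    `confidence: ${l.confidence}`,
    `source: ${JSON.stringify(l.source)}`,
    `skill: ${JSON.stringify(l.skill)}`,
    `key: ${JSON.stringify(l.key)}`,
  ];
  // Optional fields are emitted only when present in the JSONL record.
  if (l.files !== undefined) fm.push(`files: ${JSON.stringify(l.files)}`);
  if (l.branch !== undefined) fm.push(`branch: ${JSON.stringify(l.branch)}`);
  if (l.ts !== undefined) fm.push(`ts: ${JSON.stringify(l.ts)}`);
  fm.push("---");
  // Body is "# <key>\n\n<insight>" so the insight prose is what search indexes.
  return `${fm.join("\n")}\n\n# ${l.key}\n\n${l.insight}`;
}
```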

Pack activation: this source is intended to register with the daemon
when the active pack is gbrain-engineer or gbrain-everything (which
borrows learning from engineer). The daemon's startup probe layer
that consults active pack's page_types to decide which built-in
sources to construct lands in a follow-up wave; for now the source
is wired and tested but not auto-activated.

Tests (test/ingestion/gstack-learnings.test.ts, 14 cases):
  - Basic contract: mode='trickle', id includes pid, kind='gstack-learnings'
  - Start seeds seenLines (historical lines NOT replayed)
  - Malformed JSONL lines skip without crashing
  - Blank lines + trailing newlines OK
  - emitLine: new line emits, identical line is silent dedup hit
  - Emitted body carries proper frontmatter (type, learning_type,
    confidence, source, skill, key, files, branch, ts)
  - Canonical-JSON content_hash dedup (whitespace reformat = hit)
  - healthCheck warn/ok states
  - describePaths diagnostic per-file existence + size

All 14 pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Task T8 of 13.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request May 25, 2026
…pts as first-class units, calibration profile widening, gstack-learnings bridge (#1364)

* feat(schema): migration v93 take_domain_assignments (v0.41 T1)

Adds the JOIN table backing per-pack calibration domain aggregation
in the v0.41 lens-packs wave. Replaces the originally-planned scalar
`takes.domain` column after codex outside-voice review caught that
one take can legitimately belong to multiple domains (a take about
"Sequoia's investment in Anthropic" lands in deal_success AND
market_call), and that scalar attribution bakes today's pack→domain
mapping into permanent fact.

Schema: composite PK (take_id, domain) for idempotent re-assignment,
FK CASCADE so deleting a take cascades assignments, confidence CHECK
in [0,1], idx_take_domain_assignments_domain for the aggregator JOIN
direction. RLS guard matches takes/synthesis_evidence pattern (enable
when running as BYPASSRLS role). PGLite parity via sqlFor.pglite.

Backward-compat: pre-existing takes carry no assignments; aggregator
LEFT JOIN skips them gracefully. No backfill required at migration
time — propose_takes (T10) populates new rows; greenfield assignment
of historical takes is a v0.42 follow-up.

R-MIG IRON-RULE regression at test/migrations-v93.test.ts pins 12
contracts: existence/name, LATEST_VERSION advance, table queryable
after initSchema, column shape, composite PK rejects duplicate
(take_id, domain), multi-domain assignment permitted, FK ON DELETE
CASCADE, CHECK rejects out-of-range confidence, index presence,
aggregator JOIN direction returns per-domain counts, sql/sqlFor.pglite
parity grep, backward-compat LEFT JOIN handles unassigned takes.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
First of 13 sequencing tasks in v0.41 lens packs + epistemology
unification wave (decisions D9-B → T1-B per codex challenge).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(contracts): IngestionSource.mode + pack manifest phases/calibration_domains (v0.41 T2+T3)

Two independent contract extensions, batched because both are pre-
requisites for T4 (pack YAML manifests) and T9 (cycle.ts orchestrator
gate). Neither is load-bearing alone; together they form the surface
the four lens-pack manifests will declare against.

T2 — IngestionSource.mode discriminator (codex outside-voice fix):
  src/core/ingestion/types.ts grows an optional `mode: 'trickle' |
  'migration'` field on IngestionSource. Defaults to 'trickle' when
  unset — v0.38 sources unchanged. New IngestionSourceMode export.
  src/core/ingestion/daemon.ts handleEmit() branches on the mode:
  trickle keeps the 24h DedupWindow.mark() path; migration bypasses
  dedup entirely (the source owns permanent slug-keyed idempotency
  via op_checkpoint or similar). Validation, rate limit, and dispatch
  apply uniformly to both modes.

  Why: the 24h content-hash dedup window is wrong for bulk historical
  migration. 24K wintermute pages over hours, retries days apart, and
  same-hash collisions across the window are expected. Trickle
  semantics (file-watcher, inbox-folder, webhook) want dedup to catch
  at-least-once replay; migration semantics want EVERY explicitly-
  emitted event to land because the source already gated it.
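
The mode branch can be sketched as follows. This is a simplified model, not the daemon's code: the `DedupWindow.mark` semantics (returns true when the hash is new inside the window), the string outcomes, and the `handleEmitSketch` name are assumptions; validation, rate limiting, and real dispatch are omitted.

```typescript
type IngestionSourceMode = "trickle" | "migration";

class DedupWindow {
  private readonly windowMs: number;
  private readonly seenAt = new Map<string, number>();

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Assumed semantics: true when the hash has not been seen within the window.
  mark(hash: string, nowMs: number): boolean {
    const last = this.seenAt.get(hash);
    if (last !== undefined && nowMs - last < this.windowMs) return false;
    this.seenAt.set(hash, nowMs);
    return true;
  }
}

function handleEmitSketch(
  contentHash: string,
  mode: IngestionSourceMode | undefined,
  dedup: DedupWindow,
  nowMs: number,
): "dispatched" | "deduped" {
  const effectiveMode = mode ?? "trickle"; // unset mode keeps pre-v0.41 behaviour
  if (effectiveMode === "trickle" && !dedup.mark(contentHash, nowMs)) {
    return "deduped"; // at-least-once replay is absorbed
  }
  // Migration mode bypasses the window: the source owns idempotency.
  return "dispatched";
}
```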

T3 — SchemaPackManifestSchema phases + calibration_domains:
  src/core/schema-pack/manifest-v1.ts grows two optional fields. New
  AGGREGATOR_KINDS closed enum (4 v1 algorithms: scalar_brier,
  weighted_brier, count_based, cluster_summary) backing
  AggregatorKind type. New CalibrationDomain {name, aggregator,
  page_types} schema with snake_case regex on name, .strict on extra
  fields, page_types.min(1).

  `phases: string[]` declares which cycle phases the active pack
  participates in (D4-B orchestrator gate; runCycle will consult this
  in T9). Validated as string here, against runtime CyclePhase union
  at the registry layer (avoids circular import). `borrow_from` does
  NOT borrow phases — each pack declares explicitly.

  `calibration_domains: CalibrationDomain[]` declares per-pack
  scorecard buckets. Closed registry of algorithm `aggregator` values
  keeps SQL injection surface closed; open `name` strings let third-
  party packs add domains without a gbrain release (T3 codex
  refinement of D6).

  Backward compat: both fields default to []. Existing v0.38 manifests
  parse unchanged (pinned by 2 regression cases).
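
The constraints on a calibration domain can be illustrated with a plain TypeScript validator. The commit's wording (`.strict`, `.min(1)`) suggests a schema library is used in the real manifest code; this sketch deliberately avoids one. The exact snake_case pattern and the `parseCalibrationDomain` name are assumptions.

```typescript
const AGGREGATOR_KINDS = ["scalar_brier", "weighted_brier", "count_based", "cluster_summary"] as const;
type AggregatorKind = (typeof AGGREGATOR_KINDS)[number];

interface CalibrationDomain {
  name: string;
  aggregator: AggregatorKind;
  page_types: string[];
}

// Assumed pattern: lowercase start, then lowercase letters, digits, underscores.
const SNAKE_CASE = /^[a-z][a-z0-9_]*$/;
const ALLOWED_FIELDS = new Set(["name", "aggregator", "page_types"]);

function parseCalibrationDomain(input: unknown): CalibrationDomain {
  if (input === null || typeof input !== "object" || Array.isArray(input)) {
    throw new Error("calibration domain must be an object");
  }
  const obj = input as Record<string, unknown>;
  // Strict: unknown extra fields are rejected rather than ignored.
  for (const field of Object.keys(obj)) {
    if (!ALLOWED_FIELDS.has(field)) throw new Error(`unknown field: ${field}`);
  }
  const { name, aggregator, page_types } = obj;
  if (typeof name !== "string" || !SNAKE_CASE.test(name)) {
    throw new Error("name must be snake_case");
  }
  // Closed enum: only the four known aggregator algorithms are accepted.
  if (typeof aggregator !== "string" || !(AGGREGATOR_KINDS as readonly string[]).includes(aggregator)) {
    throw new Error("unknown aggregator");
  }
  if (!Array.isArray(page_types) || page_types.length === 0 || !page_types.every((p) => typeof p === "string")) {
    throw new Error("page_types must be a non-empty string array");
  }
  return { name, aggregator: aggregator as AggregatorKind, page_types: page_types as string[] };
}
```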

Tests:
  test/ingestion/migration-mode.test.ts (8 cases): mode type accepts
  literals, defaults to trickle, daemon branches correctly across
  trickle/migration/default-undefined, validation still runs in
  migration mode, mixed dual-source independence.

  test/schema-pack-manifest-v041.test.ts (19 cases): aggregator enum
  shape, phases default + accept + reject (non-string, empty, non-
  array), calibration_domains default + accept (single + multi entry,
  multi page_types), reject (unknown aggregator, kebab/uppercase/
  digit-start names, empty page_types, unknown extra field), v0.38
  back-compat regressions.

  All 27 cases pass first-green after API surface alignment.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Tasks T2 + T3 of 13 in v0.41 lens packs + epistemology unification wave.
Unblocks: T4 (pack manifests reference both fields), T9 (cycle.ts gate
reads phases:), T10 (calibration widening reads calibration_domains).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(packs): 4 bundled lens pack manifests + registry wiring (v0.41 T4)

Authors gbrain-creator + gbrain-investor + gbrain-engineer +
gbrain-everything as bundled YAML manifests in
src/core/schema-pack/base/, registers them in the BUNDLED array in
load-active.ts, exports AGGREGATOR_KINDS + AggregatorKind +
CalibrationDomain types through the schema-pack barrel.

gbrain-creator: atom (NEW page type) + concept (reuse from base).
  phases: [extract_atoms, synthesize_concepts]. One calibration
  domain: concept_themes / cluster_summary / [concept]. Retires
  wintermute's atom-pipeline-coordinator cron (T12 follow-up).

gbrain-investor: thesis + bet_resolution_log (NEW). Borrows
  deal/person/company/yc from base. No new cycle phases (consumes
  existing extract_facts/propose_takes/grade_takes pipeline). Three
  calibration domains: deal_success/scalar_brier/[deal],
  founder_evaluation/scalar_brier/[person], market_call/weighted_brier
  /[thesis]. Filing rules mirror wintermute's existing investing/deals
  + investing/theses + investing/bets layout.

gbrain-engineer: bridge-only per D8-C. ONLY declares `learning`
  page type (primitive: annotation); borrows code+project from base.
  No new cycle phases (gstack-learnings IngestionSource is daemon-
  side per T8). Three calibration domains: architecture_calls/
  scalar_brier/[code, learning], effort_estimates/weighted_brier/
  [project], risk_assessment/scalar_brier/[project].

gbrain-everything: meta-pack extending gbrain-investor + borrowing
  atom (from creator) + learning (from engineer). Codex outside-voice
  T4 resolution to the multi-lens problem: composes via the v0.38-
  shipped extends + borrow_from chain instead of inventing an
  active-multi-pack architecture. Single-active-pack constraint
  preserved. Explicitly re-declares phases + calibration_domains
  (borrow_from borrows types/link_types only — phases must be
  declared per pack per D4-B).

Frontmatter validators (atom_type closed 11-value enum,
virality_score range, etc.) are NOT declared in these manifests. That
contract surface (per-page-type frontmatter_validators on
PageTypeSchema) is a v0.42 follow-up filed in plan TODOs. For
v0.41, extract_atoms hardcodes the enum with a TODO comment
pointing at the eventual manifest read path (D11).

YAML parser caveat: src/core/schema-pack/loader.ts uses a
hand-rolled parseYamlMini (per loader.ts:86 explicit non-support of `|`
block scalars). Initial descriptions used `|` blocks and broke
parsing silently (description was 'literal "|"', everything after
collapsed). Reauthored to single-line "..." strings. Pinned by
the manifest-load tests asserting page_types/phases/
calibration_domains all resolve.

Tests:
  test/lens-pack-manifests.test.ts (31 cases): one file covers all
  4 packs to avoid 4x boilerplate. Pins parse cleanly, registry
  inclusion, per-pack page_types/phases/calibration_domains/
  filing_rules shape, every aggregator value falls in AGGREGATOR_KINDS,
  meta-pack unions correctly (7 calibration domains across all
  three lens packs).

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Task T4 of 13. Unblocks T5/T6 (phases now declared; phases read
from active pack at runtime), T7 (importer writes atom-typed
pages against creator manifest), T8 (gstack-learnings emits
learning-typed pages against engineer manifest), T9 (orchestrator
gate reads phases: declaration), T10 (calibration_profile walks
calibration_domains).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cycle): orchestrator-level pack gate for lens-pack phases (v0.41 T9)

Wires extract_atoms + synthesize_concepts into runCycle with the D4-B
orchestrator-level pack gate. Five surgical edits to src/core/cycle.ts:

  1. CyclePhase union grows by 2 names.
  2. ALL_PHASES inserts extract_atoms after extract_facts (Haiku 3-check
     has fresh fact context, BEFORE resolve_symbol_edges to avoid
     interrupting the symbol resolution sweep mid-flight) and
     synthesize_concepts after patterns (cluster pass sees fresh
     cross-session themes).
  3. PHASE_SCOPE entries: extract_atoms='source' (per-source transcript
     walk), synthesize_concepts='global' (concept clusters cross sources
     by nature).
  4. NEEDS_LOCK_PHASES adds both (put_page writes mutate DB).
  5. runCycle dispatch blocks for both phases consult packDeclaresPhase
     before invoking. When the active pack doesn't declare the phase,
     skipped with reason='not_in_active_pack' marker. When it does,
     lazy-imports extract-atoms.ts / synthesize-concepts.ts and runs.

The packDeclaresPhase helper is new at module-private scope. Loads the
active pack via loadActivePack({cfg, remote:false}); reads
resolved.manifest.phases (local only — D4-B). Fail-open: any registry
error (pack not found, malformed manifest) returns false. Skipping >
crashing for an orchestrator gate.
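
The gate described above can be sketched like this. It is a simplified, synchronous model: the real `loadActivePack` takes `{cfg, remote:false}` and may well be asynchronous, and `ResolvedPack`, `PhaseResult`, and `dispatchGatedPhase` are hypothetical shapes introduced only for illustration.

```typescript
interface ResolvedPack {
  manifest: { phases?: string[] };
}

type PhaseResult = { status: "ran" } | { status: "skipped"; reason: string };

// Reads only the active pack's own manifest.phases (no extends-chain merge).
// Any registry error is swallowed and treated as "phase not declared".
function packDeclaresPhase(loadActivePack: () => ResolvedPack, phase: string): boolean {
  try {
    return (loadActivePack().manifest.phases ?? []).includes(phase);
  } catch {
    return false;
  }
}

// Dispatch wrapper for the two gated phases: skip with a marker, or run.
function dispatchGatedPhase(
  loadActivePack: () => ResolvedPack,
  phase: string,
  run: () => void,
): PhaseResult {
  if (!packDeclaresPhase(loadActivePack, phase)) {
    return { status: "skipped", reason: "not_in_active_pack" };
  }
  run();
  return { status: "ran" };
}
```

Treating a broken registry the same as an undeclared phase means a malformed manifest can only cause a phase to be skipped, never crash the cycle.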

Local-only phase semantics (not extends-chain inherited) preserves user
sovereignty: a downstream pack extending gbrain-creator may NOT want
extract_atoms to run (e.g. derives atoms differently). Inheriting phases
would force them into a no-op-or-fork choice. The gbrain-everything
meta-pack therefore RE-DECLARES creator's phases verbatim in its own
manifest, asserted by the T4 test.

Stub phase modules ship in this commit:
  src/core/cycle/extract-atoms.ts → returns skipped with reason=
    'stub_pending_t5'
  src/core/cycle/synthesize-concepts.ts → returns skipped with reason=
    'stub_pending_t6'

T5/T6 replace the stub bodies with real LLM-driven phases. The
orchestrator dispatch is fully wired today and exercised by the test.

Manifest schema follow-on: phases + calibration_domains were originally
.default([]) but the type narrowing broke v0.38 fixture casts in
test/schema-pack-{lint-rules,registry,registry-reload}.test.ts.
Reverted to .optional(); consumers apply `?? []` at the read site.
Same pattern as IngestionSource.mode in T2. Updated T3 + T4 tests
to use `!` non-null assertion at sites that explicitly declared the
fields (typechecker can't narrow array literals through optional
boundaries).

Tests:
  test/cycle-pack-gating.test.ts (19 cases, R-GATE IRON RULE):
  ALL_PHASES + PHASE_SCOPE shape, ordering invariants (extract_atoms
  after extract_facts, synthesize_concepts after patterns), exhaustive
  PHASE_SCOPE map, NEEDS_LOCK_PHASES static-source assertion (both new
  phases included), dispatch consults packDeclaresPhase for BOTH new
  phases (and ONLY those two), packDeclaresPhase helper exists +
  reads manifest.phases (not merged chain) + fail-open returns false
  on catch, pre-existing 17 phases NEVER consult packDeclaresPhase
  (extract_facts + calibration_profile spot-checked), not_in_active_pack
  reason marker appears exactly 2x (semantic consistency across
  both gated phases).

  Adjacent test fixes: T3 + T4 tests updated for optional-field
  semantics. T2 dispatch type narrowed to DispatchOutcome shape from
  daemon.ts ({kind: 'queued'} for success path).

89/89 across T1+T2+T3+T4+T9 tests pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Task T9 of 13. Unblocks: T5 (extract-atoms.ts body replaces stub),
T6 (synthesize-concepts.ts body replaces stub).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(calibration): domain_scorecards widening + 4 aggregators (v0.41 T10)

Replaces the v0.36.1.0 placeholder `JSON.stringify({})` in
calibration-profile.ts:336 with a real aggregator pass over the active
pack's calibration_domains declarations. domain_scorecards JSONB now
populates per declared domain with {n, brier, accuracy, aggregator,
page_types, extras}.

New module: src/core/calibration/domain-aggregators.ts
  - aggregateDomainScorecards(engine, holder, domains, sourceId) → JSONB-shape
  - 4 aggregator implementations matching the AggregatorKind closed enum:
    - scalar_brier: AVG(POWER(weight - outcome::int, 2)). The default for
      most predictive domains. Filters by holder + page_types +
      resolved_outcome IS NOT NULL + active=TRUE + source_id.
    - weighted_brier: Brier weighted by ABS(weight - 0.5) * 2 (conviction
      proxy since takes table has no separate confidence column). A
      0.95-conviction miss weights 9x more than a 0.55-conviction one.
      Matches the investor pack's market_call semantics.
    - count_based: simple SUM(hit)/COUNT(*) accuracy without Brier.
      For domains where probability isn't natural.
    - cluster_summary: page count + tier histogram via
      frontmatter->>'tier' JSONB read. For concept_themes where there's
      no binary outcome to score. Returns {n, tier_counts: {T1, T2,
      T3, T4}}.
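
The two Brier variants can be illustrated in memory, without SQL. The formulas follow the descriptions above; normalising the weighted variant by the sum of conviction weights is an assumption, and `ResolvedTake` and the function names are hypothetical.

```typescript
interface ResolvedTake {
  weight: number;   // stated probability in [0, 1]
  outcome: boolean; // resolved outcome
}

// scalar_brier: mean of (weight - outcome)^2, or null when there are no takes.
function scalarBrier(takes: ResolvedTake[]): number | null {
  if (takes.length === 0) return null;
  const total = takes.reduce((acc, t) => acc + (t.weight - Number(t.outcome)) ** 2, 0);
  return total / takes.length;
}

// Conviction proxy: distance from 0.5, scaled to [0, 1].
function conviction(weight: number): number {
  return Math.abs(weight - 0.5) * 2;
}

// weighted_brier: squared errors weighted by conviction.
// Assumption: normalised by the sum of conviction weights.
function weightedBrier(takes: ResolvedTake[]): number | null {
  let numerator = 0;
  let denominator = 0;
  for (const t of takes) {
    const c = conviction(t.weight);
    numerator += c * (t.weight - Number(t.outcome)) ** 2;
    denominator += c;
  }
  return denominator === 0 ? null : numerator / denominator;
}

// Accuracy: a take is a hit when (weight >= 0.5) agrees with the outcome.
function accuracy(takes: ResolvedTake[]): number | null {
  if (takes.length === 0) return null;
  const hits = takes.filter((t) => (t.weight >= 0.5) === t.outcome).length;
  return hits / takes.length;
}
```

For the conviction proxy, a 0.95 take gets weight 0.9 and a 0.55 take gets weight 0.1, which is where the 9x ratio comes from.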

Wiring in src/core/cycle/calibration-profile.ts:
  Try/catch wraps the loadActivePack → aggregator chain. Empty {}
  scorecard on any pack-resolution error (R1 IRON RULE: byte-identical
  v0.36.1.0 baseline when no active pack declares domains). Warning
  appended to result.warnings so doctor surfaces silent failures
  instead of crashing the phase.

Per-domain fail-soft: aggregateOneDomain's try/catch returns
{n: 0, brier: null, accuracy: null, extras: {error}} for any single
malformed domain. The other domains still aggregate. Phase keeps
running.

Tests (test/domain-aggregators.test.ts, 13 cases):
  - R1 IRON RULE: empty domain list returns {} (byte-identical)
  - scalar_brier: empty no-takes returns n:0/null/null; 2-take
    Brier computed correctly (0.5 over (0, 1) sq_errs); accuracy
    matches weight>=0.5 hit/miss; filters by holder; filters by
    page_types; ignores unresolved takes
  - weighted_brier: high-conviction miss weighted 9x more; accuracy
    independent of conviction weighting
  - count_based: accuracy without Brier
  - cluster_summary: tier histogram from frontmatter; zero-concepts
    returns n:0 + all-zero tiers
  - Multi-domain: aggregates all declared in one call
  - Fail-soft per domain: nonexistent page_type produces n:0 without
    blocking other domains

89/89 across T1+T2+T3+T4+T9+T10 tests; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Task T10 of 13. The propose_takes-side wiring (populate
take_domain_assignments at write time from active pack's page_type→
domain mapping) is deferred to T5/T6 phase implementations, since
they are the natural producers of takes. Manual propose_takes via
fence write covers the operator path. v0.42+ adds a takes-fence
parser extension to read domain[] from fence rows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ingestion): gstack-learnings bridge source (v0.41 T8)

Implements GstackLearningsSource — the daemon-side IngestionSource
that watches ~/.gstack/projects/{repo}/learnings.jsonl and emits
each new line as a `learning`-typed IngestionEvent.

Closes the v0.40-and-earlier gap where gstack's typed engineering
knowledge base (7 learning types: pattern, pitfall, preference,
architecture, tool, operational, investigation) lived in JSONL files
the brain never queried. After T8 + the engineer-pack manifest
activation, every gstack-logged learning surfaces as a first-class
gbrain page within seconds of being written.

Lifecycle:
  - constructor: discovers JSONL files via ~/.gstack/projects/*&#47;
    learnings.jsonl (cross-project mode, default) or just the current
    project (per-project mode). Test seam: _readFile/_existsSync/_skipWatch.
  - start(ctx): seeds seenLines with content_hashes of EVERY existing
    line so first-run-after-install does NOT replay thousands of
    historical lines as fresh emits. Then installs fs.watch handlers
    (one per discovered file) that fire rescanFile on 'change'.
  - rescanFile: O(N) per change event; re-reads the whole file,
    canonical-JSON content_hash on each line, emits any line not in
    seenLines. Malformed JSONL lines skip+warn.
  - stop(): closes all watchers; JSONL state preserved (gstack owns
    the files, gbrain only reads).
  - healthCheck(): reports warn when no files discovered (gstack not
    installed) OR when watched files have disappeared; ok otherwise
    with counter of lines seen.

mode: 'trickle' (the v0.41 T2 default). Line-level content_hash via
canonical-JSON serialization means whitespace reformatting doesn't
trigger re-emit. Re-emit of an identical line is a silent dedup hit
via the daemon's 24h DedupWindow (T2 trickle path).
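
A minimal sketch of the line-level dedup described above, assuming a
sorted-key serialization as the canonical form. The helper names
(canonicalJson, newLines) are hypothetical, and the canonical string
stands in for the real content_hash; the shipped source may hash and
log differently.

```typescript
// Hypothetical helpers for illustration; not the actual GstackLearningsSource code.
type Json = null | boolean | number | string | Json[] | { [k: string]: Json };

// Serialize with object keys sorted, so key order and whitespace do not matter.
function canonicalJson(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map(canonicalJson).join(",") + "]";
  const keys = Object.keys(value).sort();
  return (
    "{" +
    keys.map((k) => JSON.stringify(k) + ":" + canonicalJson(value[k] as Json)).join(",") +
    "}"
  );
}

// Returns the parsed lines not seen before and records them in seenLines.
// Blank and malformed lines are skipped instead of throwing.
function newLines(fileText: string, seenLines: Set<string>): Json[] {
  const fresh: Json[] = [];
  for (const raw of fileText.split("\n")) {
    if (raw.trim() === "") continue;
    let parsed: Json;
    try {
      parsed = JSON.parse(raw) as Json;
    } catch {
      continue; // a real source would also emit a warning here
    }
    const key = canonicalJson(parsed); // assumption: stand-in for content_hash
    if (seenLines.has(key)) continue;
    seenLines.add(key);
    fresh.push(parsed);
  }
  return fresh;
}
```

Calling newLines once over the existing file at start() and discarding
the result would seed seenLines, so only later appends are emitted.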

Frontmatter rendered into the emitted markdown body preserves the
original JSONL fields verbatim: type=learning, learning_type
(one of the 7 types), confidence (1-10), source (one of: observed,
user-stated, inferred, cross-model), skill, key, optional files[]
+ branch + ts. Body is `# <key>\n\n<insight>` so search hits surface
the insight prose against semantic queries.

Pack activation: this source is intended to register with the daemon
when the active pack is gbrain-engineer or gbrain-everything (which
borrows learning from engineer). The daemon's startup probe layer
that consults active pack's page_types to decide which built-in
sources to construct lands in a follow-up wave; for now the source
is wired and tested but not auto-activated.

Tests (test/ingestion/gstack-learnings.test.ts, 14 cases):
  - Basic contract: mode='trickle', id includes pid, kind='gstack-learnings'
  - Start seeds seenLines (historical lines NOT replayed)
  - Malformed JSONL lines skip without crashing
  - Blank lines + trailing newlines OK
  - emitLine: new line emits, identical line is silent dedup hit
  - Emitted body carries proper frontmatter (type, learning_type,
    confidence, source, skill, key, files, branch, ts)
  - Canonical-JSON content_hash dedup (whitespace reformat = hit)
  - healthCheck warn/ok states
  - describePaths diagnostic per-file existence + size

All 14 pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Task T8 of 13.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ingestion): wintermute-greenfield migration-mode importer (v0.41 T7)

Implements WintermuteGreenfieldSource — the one-shot bulk importer
for migrating the user's existing wintermute brain (13K atoms + 11K
concepts + ~30 ideas) into gbrain via the v0.41 lens packs.

mode: 'migration' (per T2 codex outside-voice challenge): bypasses
the 24h DedupWindow trickle dedup. Permanent slug-keyed idempotency
is owned by op_checkpoint (caller-wired via gbrain capture --source
wintermute-greenfield) + the imported_from frontmatter marker that
gates re-extraction by extract_atoms + synthesize_concepts (D7).

@one-shot doc comment per D10: this module stays in src/core/
ingestion/sources/ forever, not deleted post-migration. Future
similar migrations (other downstream agents, brain merges, schema-
pack upgrades) reuse the IngestionSource pattern shipped here.
Deleting the working example is short-sighted.

Walk:
  - ~/git/brain/atoms/{YYYY-MM-DD}/*.md (atoms, date-bucketed)
  - ~/git/brain/concepts/*.md (concepts, flat)
  - ~/git/brain/ideas/*.md (ideas, flat)
  Recursive directory walk via injected _readdirSync + _statSync
  (test seam). Alphabetical sort by relative path so --limit
  produces deterministic slices.
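
  To illustrate why the sort matters for --limit, here is a small
  sketch. selectFiles is a hypothetical name, and plain code-unit
  ordering is an assumption about what "alphabetical" means here.

```typescript
// Sort relative paths first, so --limit N always selects the same N files
// no matter what order the directory walk returned them in.
function selectFiles(relativePaths: string[], limit?: number): string[] {
  const sorted = [...relativePaths].sort(); // default sort: UTF-16 code unit order
  return limit === undefined ? sorted : sorted.slice(0, limit);
}
```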

Per file:
  1. Read content; gray-matter parses frontmatter + body
  2. Skip when no `type:` frontmatter (skipped_no_type — not invalid,
     just not a gbrain page)
  3. Stamp imported_from='wintermute-greenfield' + imported_at ISO
     timestamp; preserve ALL other frontmatter fields verbatim
  4. Re-stringify via matter.stringify
  5. Emit IngestionEvent with content_type='text/markdown',
     untrusted_payload=false (local user-owned files), metadata
     carrying slug + page_type + original_path + original_frontmatter
     + importer + importer_version
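
Steps 2 and 3 can be sketched on an already-parsed frontmatter object.
This leaves out the gray-matter parse and stringify steps, and
stampImported is a hypothetical name, not the importer's actual
function.

```typescript
type Frontmatter = Record<string, unknown>;

interface StampResult {
  status: "stamped" | "skipped_no_type";
  frontmatter?: Frontmatter;
}

// Skip files without a `type:` field; otherwise add the import markers
// while keeping every original field untouched.
function stampImported(original: Frontmatter, now: Date): StampResult {
  const pageType = original["type"];
  if (typeof pageType !== "string" || pageType === "") {
    return { status: "skipped_no_type" }; // benign skip, not a validation failure
  }
  return {
    status: "stamped",
    frontmatter: {
      ...original,
      imported_from: "wintermute-greenfield",
      imported_at: now.toISOString(),
    },
  };
}
```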

Per-row validation failure → JSONL audit at
~/.gbrain/audit/wintermute-greenfield-failures-YYYY-Www.jsonl per
D12. Failed-file processing continues (don't fail-fast on one bad
row). Audit dir created lazily via mkdirSync recursive on first
write.

CLI flags supported via opts:
  --dry-run: walks + validates + stamps but doesn't emit
  --limit N: processes only the first N files (alphabetical)

The CLI surface lands via gbrain capture --source wintermute-greenfield
in a follow-up commit (capture.ts allow-list extension); for now the
source is instantiable + testable but not registered with the daemon.

Tests (test/ingestion/wintermute-greenfield.test.ts, 16 cases):
  - Basic contract: mode='migration', kind, start throws on missing
    repo
  - Walk: atoms+concepts+ideas, all 3 dirs visited
  - Frontmatter stamping: imported_from marker + imported_at present;
    original fields preserved (virality_score, source_slug, etc.)
  - Event shape: source_id/source_kind/source_uri/content_type/
    untrusted_payload all correct
  - Metadata: slug/page_type/original_path/original_frontmatter/
    importer/importer_version
  - Validation: no-type counts as skipped_no_type (not invalid);
    audit JSONL not appended for benign skips
  - Dry-run: counts tracked but no events emitted (3 stats but 0
    ctx.emitted)
  - --limit: only N files processed
  - Deterministic ordering: alphabetical relative-path sort means
    --limit 1 always picks the alphabetically-first file
  - healthCheck: ok after clean run; warn before start

All 16 pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Task T7 of 13.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cycle): extract_atoms + synthesize_concepts minimal-viable bodies (v0.41 T5+T6)

Replaces the T9-shipped stub modules with working LLM-driven phase
bodies. v0.41 ships the right SHAPE — Haiku per transcript producing
1-3 atoms, atoms grouped by concept frontmatter ref, tier assignment
by count, Sonnet narrative for T1/T2. The richer 3-check quality gate
(truism/punchline/entity multi-pass), embedding-similarity dedup, voice
gate integration, op_checkpoint resumability all land in v0.41.1+ —
filed as inline TODOs and plan follow-ups.

T5 extract_atoms (src/core/cycle/extract-atoms.ts):
  - Takes transcripts via _transcripts test seam OR discoverTranscripts
    production path (lazy-imports transcript-discovery.ts to avoid
    circular module loads through cycle.ts).
  - Per transcript: ONE Haiku call with the 11-value atom_type enum
    embedded in the prompt (matches gbrain-creator.yaml declaration;
    v0.42 reads from active pack manifest at runtime per D11).
  - parseAtomsResponse tolerates markdown fences + trailing prose;
    rejects invalid atom_type values; clamps virality_score to [0,100];
    rejects malformed entries silently (skip don't crash).
  - Per atom: putPage atom-typed page under atoms/{YYYY-MM-DD}/
    {slug-from-title}. Frontmatter preserves atom_type, source_quote,
    lesson, virality_score, emotional_register from the LLM output.
  - Budget cap $0.30/source/run (DEFAULT_BUDGET_USD); over-budget
    transcripts counted as budget-skipped, phase returns status='warn'
    if any failures occurred.
  - Source-scoped: opts.sourceId routes corpus dir + write target.
  - dry-run: counts but doesn't writePages.
  - Failures tracked per-transcript without halting the run.
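
A rough sketch of the tolerant parsing described for
parseAtomsResponse. The three atom_type values are placeholders (the
real enum has 11 values that are not listed here), the Atom shape is
trimmed to three fields, and parseAtoms is a hypothetical name. It
assumes the model reply contains a single JSON array.

```typescript
// Placeholder subset; the real closed enum has 11 values.
const ATOM_TYPES = ["insight", "story", "quote"] as const;
type AtomType = (typeof ATOM_TYPES)[number];

interface Atom {
  title: string;
  atom_type: AtomType;
  virality_score: number;
}

function parseAtoms(response: string): Atom[] {
  // Take the outermost [...] span; this drops code fences and surrounding prose.
  // Limitation: prose after the array that itself contains "]" would break this.
  const start = response.indexOf("[");
  const end = response.lastIndexOf("]");
  if (start === -1 || end <= start) return [];
  let raw: unknown;
  try {
    raw = JSON.parse(response.slice(start, end + 1));
  } catch {
    return []; // garbage in, empty list out
  }
  if (!Array.isArray(raw)) return [];
  const atoms: Atom[] = [];
  for (const entry of raw) {
    if (typeof entry !== "object" || entry === null) continue;
    const e = entry as Record<string, unknown>;
    const title = e["title"];
    const atomType = e["atom_type"];
    const score = e["virality_score"];
    if (typeof title !== "string" || typeof score !== "number") continue;
    if (!ATOM_TYPES.includes(atomType as AtomType)) continue; // invalid enum value
    atoms.push({
      title,
      atom_type: atomType as AtomType,
      virality_score: Math.min(100, Math.max(0, score)), // clamp to [0, 100]
    });
  }
  return atoms;
}
```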

T6 synthesize_concepts (src/core/cycle/synthesize-concepts.ts):
  - Takes atoms via _atoms test seam OR DB query for type='atom' pages
    excluding imported_from frontmatter marker (D7 skip).
  - Groups atoms by frontmatter `concepts:` array ref.
  - Tier by count: T1 >=10, T2 >=5, T3 >=2, T4 deferred (no <2 groups).
  - T1/T2 groups: Sonnet call with up to 10 sample titles + 5 sample
    bodies → 1-paragraph narrative. Budget cap $1.50/run; over-budget
    or LLM-failed groups fall back to deterministic narrative.
  - T3 groups: deterministic narrative (no LLM call).
  - Per group: putPage concept-typed page at concepts/{title-from-slug}
    with tier + mention_count + composite_score frontmatter.
  - dry-run + yieldDuringPhase honored.
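
The grouping and tier thresholds above can be sketched as follows.
tierForCount and groupByConcept are hypothetical names, and the atom
shape is reduced to the two fields the grouping needs.

```typescript
type Tier = "T1" | "T2" | "T3";

// Thresholds from the commit text: T1 >= 10, T2 >= 5, T3 >= 2.
function tierForCount(count: number): Tier | null {
  if (count >= 10) return "T1";
  if (count >= 5) return "T2";
  if (count >= 2) return "T3";
  return null; // below the T3 threshold: no concept page is written
}

// Group atom titles by each entry of the frontmatter `concepts:` array.
// Atoms without concept refs fall out naturally.
function groupByConcept(
  atoms: { title: string; concepts?: string[] }[],
): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const atom of atoms) {
    for (const ref of atom.concepts ?? []) {
      const titles = groups.get(ref) ?? [];
      titles.push(atom.title);
      groups.set(ref, titles);
    }
  }
  return groups;
}
```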

Tests (test/cycle/extract-atoms-synthesize-concepts.test.ts, 19 cases):
  parseAtomsResponse: well-formed JSON, markdown fences stripped,
  trailing prose tolerated, invalid atom_type rejected, missing fields
  rejected, garbage returns [], all 11 atom_type values accepted,
  virality_score clamped to [0,100].

  runPhaseExtractAtoms: no-op without transcripts, extracts via stub
  chat + writes pages, dry-run counts without writing, failures
  tracked per-transcript without halting.

  runPhaseSynthesizeConcepts: no-op without atoms, groups by concept
  ref + tier assignment by count (T1=12 atoms, T2=6, T3=3), atoms
  without concept refs filtered out, <T3 threshold (1 atom) filtered,
  T3 uses deterministic (no LLM call), dry-run counts without writing,
  T1 narrative comes from LLM stub verbatim.

All 19 pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Tasks T5 + T6 of 13. v0.41.1 follow-ups inline:
  - extract_atoms: read atom_type enum from active pack at runtime (D11)
  - extract_atoms: 3-check quality gate as multi-pass refinement
  - synthesize_concepts: embedding-similarity dedup (currently exact-
    string concept ref match only)
  - synthesize_concepts: voice gate for T1 Canon narratives
  - Both: op_checkpoint resumability for cross-cycle continuation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(v0.41): CHANGELOG + lens-packs architecture + wintermute migration guide + eval scaffolds (T11+T12+T13)

Closes out the v0.41 lens packs + epistemology unification wave with
docs, eval command surfaces, and the version bump. Three tasks batched
because each is small standalone:

T11 — 3 eval command scaffolds:
  src/commands/eval-extract-atoms.ts
  src/commands/eval-synthesize-concepts.ts
  src/commands/eval-wintermute-greenfield.ts

  Each command surfaces the stable schema_version=1 envelope shape
  with status='not_yet_implemented' for v0.41. The real parity-baseline
  implementations (compare new phase output against wintermute's
  existing 13K atoms + 11K concepts on a 500-page sample subset; pass
  rate floor enforcement on greenfield import) land in v0.41.1. The
  scaffolds let users discover the commands AND give the v0.41.1 work
  a clear extension point. Pinned by 7 scaffold tests.

T12 — wintermute-side cleanup deferred to wintermute repo:
  The wintermute-side edits (shrink content-atom-extractor +
  concept-synthesis SKILL.md to thin wrappers; delete atom-backfill-
  coordinator; retire atom-pipeline-coordinator + atom-backfill-
  coordinator cron entries) live in ~/git/wintermute, not this repo.
  The migration guide (docs/migrations/v0.41-wintermute-greenfield.md
  below) documents the cleanup steps. Operator runs them after
  verifying the greenfield import.

T13 — Documentation:
  CHANGELOG.md: full v0.41.0.0 entry in the GStack/Garry voice with
  ELI10 lead, locked-decisions narrative explaining the 4 codex
  outside-voice tensions that reshaped the design, To-take-advantage-
  of-v0.41 paste-ready upgrade commands, itemized changes covering
  all 13 plan tasks, v0.41.1 follow-ups list.

  docs/architecture/lens-packs.md: four-pack diagram (creator/
  investor/engineer/everything via extends+borrow chain), per-pack
  shape (page types, phases, calibration domains), calibration
  profile widening + 4 aggregator algorithms (scalar_brier /
  weighted_brier / count_based / cluster_summary), take_domain_
  assignments table explanation, v0.41.1 follow-ups.

  docs/migrations/v0.41-wintermute-greenfield.md: operator guide
  for the bulk 24K-page migration. Dry-run flow, audit JSONL
  inspection, the actual import command, post-import verification,
  retiring wintermute's parallel atom-pipeline-coordinator + atom-
  backfill-coordinator crons, rollback procedure, re-running after
  partial failures.

Version bump: VERSION + package.json → 0.41.0.0.

All 158 tests across 10 v0.41 test files pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Final tasks T11 + T12 + T13 of 13. Wave shipped end-to-end across
11 commits on this branch:
  9e17d00  T1: migration v93 take_domain_assignments
  f4b2648  T2+T3: IngestionSource.mode + manifest schema extensions
  cefaad3  T4: 4 bundled lens pack manifests
  1850613  T9: cycle.ts orchestrator-level pack gate
  c6f3349  T10: calibration_profile widening + 4 aggregators
  d1964ef  T8: gstack-learnings bridge source
  adcaf4a  T7: wintermute-greenfield migration-mode importer
  0318229  T5+T6: extract_atoms + synthesize_concepts bodies
  (this)    T11+T12+T13: eval scaffolds + docs + version bump

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tests): bump phase-count assertions from 17→19 (v0.41 follow-on)

v0.41 added extract_atoms + synthesize_concepts to ALL_PHASES.
Three existing test files pinned the count at 17 via load-bearing
regression assertions:

  test/phase-scope-coverage.test.ts:48-49
    expect(ALL_PHASES.length).toBe(17)
    expect(Object.keys(PHASE_SCOPE).length).toBe(17)

  test/core/cycle.serial.test.ts:393
    expect(hookCalls).toBe(17)  // yieldBetweenPhases hook fires per phase

  test/core/cycle.serial.test.ts:406
    expect(report.phases.length).toBe(17)

  test/e2e/cycle.test.ts:110
    expect(report.phases.length).toBe(17)

Bumping the pins is the correct fix: the assertions exist precisely
to catch this case (a PR that adds a phase without updating
downstream consumers). The wave's v0.41 commit (T9) updated
ALL_PHASES but missed these test files. This commit updates the
assertions to 19, with comment breadcrumbs preserving the version
history (v0.26.5 → 9, v0.29 → 10, v0.31 → 11, v0.32.2 → 12,
v0.33.3 → 13, v0.36.1.0 → 16, v0.39.0.0 → 17, v0.41.0.0 → 19).

Without this fix: full unit test suite (`bun run test`) shows 3
failures from these assertions. Underlying v0.41 logic was already
green; this is pure pin-bumping.

After fix: 9059 unit tests pass. 0 actual test failures. (3 shard
wedges remain from unrelated long-running parallel-runner tests
that exceed the 600s per-shard cap — infra concern, not test
logic, pre-dates this wave.)

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Wave gate: all 13 plan tasks done; all v0.41 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(e2e): update EXPECTED_PHASES for v0.41 (extract_atoms + synthesize_concepts + schema-suggest)

E2E test/e2e/dream-cycle-phase-order-pglite.test.ts pinned the canonical
phase sequence at 16 entries. v0.41 added extract_atoms (after
extract_facts) and synthesize_concepts (after patterns); v0.39 had
already added schema-suggest between orphans and purge. EXPECTED_PHASES
was missing all three.

This is the correct fix — the test exists specifically to catch a PR
that adds a phase without updating consumers, and it fired exactly as
designed. Updating EXPECTED_PHASES to the v0.41 19-phase sequence with
comment breadcrumbs (v0.39.0.0 schema-suggest, v0.41.0.0 extract_atoms
+ synthesize_concepts).

Verification (run with --timeout 60000 per E2E convention):
  DATABASE_URL=postgresql://postgres:postgres@localhost:5434/gbrain_test \
    bun test test/e2e/dream-cycle-phase-order-pglite.test.ts --timeout 60000
  → 5 pass, 0 fail

Other E2E failures observed in the full run are pre-existing /
environmental and not v0.41 regressions:
  - dream-synthesize-chunking: existing flake (synthesize details
    shape under withoutAnthropicKey)
  - fresh-install-pglite: env has multiple embedding providers
    configured; requires explicit --embedding-model disambiguation
  - http-transport: last_used_at debounce timing flake
  - ingestion-roundtrip: file-watcher trickle-mode timing flake
  - mechanical: gbrain doctor exits 1 because user's persistent
    ~/.gbrain has wedged migrations + reranker auth warnings
  - autopilot-fanout-postgres: pre-existing dispatch-selector
    timestamp semantics

None of those 6 are touched by the v0.41 wave. Filing them as
unrelated maintenance items.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Wave gate: 13 plan tasks done; v0.41 unit tests green; v0.41 E2E
green; pre-existing E2E flakes unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(e2e): 4 root-cause fixes for pre-existing E2E flakes (master polish)

After merging origin/master (which landed v0.40.8.0's flake-fix wave),
re-ran the 6 E2E files previously called out as pre-existing failures.
v0.40.8.0 had already fixed 3; the remaining 3 had real root causes:

1. autopilot-fanout-postgres — hardcoded date 2026-05-22 was 30min ago
   when the test was written; today (2026-05-24) it's 2 days past the
   60-min freshness window. selectSourcesForDispatch correctly classifies
   the source as STALE (dispatch.length=1) instead of FRESH (length=0).
   Fix: replace literal date with Date.now() - 30 * 60 * 1000 so the
   timestamp stays relative-fresh forever.

2. ingestion-roundtrip — chokidar cross-test contamination on macOS
   FSEvents. Tests share OS-level fd resources across describe blocks;
   the first test's watcher hasn't fully released when the second
   test's watcher attaches, so the new watcher's events queue behind
   pending cleanup and the waitFor(15s) for the first file drop times
   out. Fixes:
     - Move fs.mkdirSync(inboxDir) BEFORE createInboxFolderSource +
       daemon.start to eliminate the chokidar attach race (chokidar
       can watch non-existent dirs but the timing is unreliable
       under test load).
     - Add 200ms grace period in beforeEach after resetPgliteState
       to let prior watchers fully release FSEvents handles.
     - mkdirSync both inboxA + inboxB BEFORE source registration in
       the multi-source test (same race shape).
     - Bump waitFor timeouts 6s → 15s for fs.watch flake tolerance.

3. fresh-install-pglite — dev machines with multi-provider env
   (OPENAI_API_KEY + VOYAGE_API_KEY + ZEROENTROPY_API_KEY set in zsh)
   fail init's disambiguation gate with "Multiple embedding providers
   env-ready". The test sets ZE_API_KEY but doesn't NEGATE the others.
   Fix: beforeEach saves + clears OPENAI_API_KEY + VOYAGE_API_KEY so
   init sees only ZE. afterEach restores. Hermetic per dev machine.

4. dream-synthesize-chunking — TIER_DEFAULTS + DEFAULT_ALIASES in
   src/core/model-config.ts had BARE Anthropic model ids (e.g.
   'claude-sonnet-4-6' instead of 'anthropic:claude-sonnet-4-6'). The
   v0.40.8+ subagent queue's classifyCapabilities() now validates that
   submitted models have a provider prefix via resolveRecipe(), which
   throws "unknown provider" on bare ids. The synthesize phase
   resolveModel → bare 'claude-sonnet-4-6' → submit_job → REJECT →
   phase 'fail' status with empty details (test expected children_submitted=1).
   Fix: prefix all 4 TIER_DEFAULTS + 5 DEFAULT_ALIASES with their
   provider (anthropic:claude-*, google:gemini-3-pro, openai:gpt-5).
   Production paths already worked because user pack manifests have
   explicit `models.tier.subagent = anthropic:...`; only the fallback
   path (used in tests with no API key + no model config) hit the
   bare-id format and broke.
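
   As an illustration of the failure mode in item 4, here is a sketch
   of a provider-prefix check. splitModelId and KNOWN_PROVIDERS are
   hypothetical; the real resolveRecipe() may validate differently.

```typescript
// Hypothetical provider list for illustration only.
const KNOWN_PROVIDERS = new Set(["anthropic", "google", "openai"]);

// Split "provider:model" and reject ids with a missing or unknown provider.
function splitModelId(id: string): { provider: string; model: string } {
  const idx = id.indexOf(":");
  const provider = idx === -1 ? "" : id.slice(0, idx);
  if (!KNOWN_PROVIDERS.has(provider)) {
    throw new Error(`unknown provider in model id "${id}"`);
  }
  return { provider, model: id.slice(idx + 1) };
}
```

   Under this check a bare 'claude-sonnet-4-6' throws, while
   'anthropic:claude-sonnet-4-6' resolves, which matches the reject
   path described above.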

Verification (all run against DATABASE_URL=...:5434/gbrain_test):
  test/e2e/autopilot-fanout-postgres.test.ts → 6/6 pass
  test/e2e/dream-cycle-phase-order-pglite.test.ts → 5/5 pass
  test/e2e/dream-synthesize-chunking.test.ts → 4/4 pass
  test/e2e/fresh-install-pglite.test.ts → 2/2 pass
  test/e2e/http-transport.test.ts → 8/8 pass
  test/e2e/ingestion-roundtrip.test.ts → 3/3 pass
  test/e2e/mechanical.test.ts → 78/78 pass
  Total: 106/106 pass, 0 fail.

Adjacent unit tests verified green:
  test/anthropic-model-ids.test.ts → 6/6 pass
  test/model-config.serial.test.ts → 19/19 pass

typecheck clean.

Plan: v0.41 wave (~/.claude/plans/system-instruction-you-are-working-toasty-milner.md).
Post-merge polish — every E2E failure surfaced in the v0.41 ship reports is now green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(v0.42.0.0): privacy sweep + queue rebump + 5 pre-existing test fixes

Privacy: rename `wintermute-greenfield` → `markdown-greenfield` identifier
across 13 files + 4 file renames per CLAUDE.md:550 (banned private-fork name
in public artifacts). Identifier shipped through the lens-pack wave as the
long-lived migration-mode source kind; sweep includes class names
(MarkdownGreenfieldSource), frontmatter marker, audit JSONL path, eval
command, and operator doc filename. Reframe contextual mentions per
OpenClaw substitution rule ("your OpenClaw"/"upstream OpenClaw").

Queue: rebump v0.41.0.0 → v0.42.0.0 (PR #1352 claims v0.41.0.0 in queue);
sweeps 38 v0.41 → v0.42 references across branch-introduced files; renames
docs/migrations/v0.41-markdown-greenfield.md → v0.42-markdown-greenfield.md,
test/schema-pack-manifest-v041.test.ts → -v042, test/eval-v041-scaffolds →
test/eval-v042-scaffolds. Pre-existing master files referencing v0.41 left
untouched (those describe master's own anticipated wave).

Test fixes (5 pre-existing failures + 1 shard wedge, all unrelated to lens
packs but caught by the post-merge run):
- src/core/anthropic-pricing.ts: estimateMaxCostUsd strips `anthropic:`
  provider prefix before ANTHROPIC_PRICING lookup. v0.31.12 introduced
  provider-prefixed model strings; the budget meter wasn't updated and
  fell through to BUDGET_METER_NO_PRICING (budget gate disabled), letting
  auto-think submissions complete when the test expected budget exhaustion
  to force partial/skipped.
- test/longmemeval-trajectory-routing.test.ts: perf-gate cap 10s → 30s.
  Test runs ~4s isolated; parallel-shard CPU contention pushes it to 16s.
  30s still catches genuine cold-path regressions.
- test/search/embedding-column.test.ts → .serial.test.ts: quarantine to
  serial pass (depends on gateway module-state set by bunfig.toml preload;
  other parallel tests' resetGateway() leaves stale state).
- scripts/run-unit-parallel.sh: SHARD_TIMEOUT 600s → 900s. Shard 8's
  migration test suite runs 1369 tests in 807s (all pass); 600s wrapper
  cap was killing healthy shards.
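
The anthropic-pricing fix in the first bullet can be sketched like
this. The PRICING table and its numbers are made up for illustration,
and estimateMaxCost is a simplified stand-in for estimateMaxCostUsd,
whose real signature is not shown in this commit.

```typescript
// Hypothetical per-million-token prices; not real pricing data.
const PRICING: Record<string, { inputPerMTok: number; outputPerMTok: number }> = {
  "claude-sonnet-4-6": { inputPerMTok: 3, outputPerMTok: 15 },
};

// Strip the provider prefix before the lookup, so prefixed ids still price.
function estimateMaxCost(
  modelId: string,
  inputTokens: number,
  maxOutputTokens: number,
): number | null {
  const prefix = "anthropic:";
  const bare = modelId.startsWith(prefix) ? modelId.slice(prefix.length) : modelId;
  const price = PRICING[bare];
  if (price === undefined) return null; // no pricing: the budget gate cannot apply
  return (inputTokens * price.inputPerMTok + maxOutputTokens * price.outputPerMTok) / 1_000_000;
}
```

Without the strip, a prefixed id misses the table and takes the
no-pricing path, which is the disabled budget gate described above.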

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: update project documentation for v0.42.0.0

Sweep v0.41 → v0.42.0.0 drift across the wave's release-summary and the
two new doc files. The wave shipped under its planning-time name (v0.41);
the queue rebump to v0.42.0.0 left a handful of factual references
pointing at the wrong version.

- CHANGELOG.md v0.42.0.0 entry: doc-ref filename, follow-up version
  label, and 4 in-prose v0.41 cites corrected to v0.42.0.0 / v0.42.0.1.
- docs/architecture/lens-packs.md: title + body + follow-up section
  corrected to v0.42.0.0 / v0.42.0.1.
- docs/migrations/v0.42-markdown-greenfield.md: title + upgrade
  command text corrected to v0.42.0.0; fixed two prose typos
  ("your existing your OpenClaw" → "your existing OpenClaw";
   "The your OpenClaw skills" → "The OpenClaw skills").

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: rebump v0.42.0.0 → v0.41.2.0 (per user; patch slot on v0.41 line)

PRs #1352 and #1367 both claim v0.41.0.0 in queue (the .0 slot is contested);
v0.41.2.0 is unclaimed and represents this wave as a PATCH on the v0.41 line
rather than a separate minor wave.

Sweeps v0.42.0.0 → v0.41.2.0 across CHANGELOG + 2 docs + 4 yaml + 4 ts + 2
test files; renames docs/migrations/v0.42-markdown-greenfield.md →
v0.41.2-markdown-greenfield.md and 2 test files (-v042 → -v041_2).

Wave-identity tags ("v0.41 T4" etc) in test/code comments correctly
preserved — this IS a v0.41 wave patch, not a new wave. macOS sed `\b`
limitation means those tags were never converted in the first place;
verified intentional preservation.

Forward references to v0.42 in TODOS.md + CHANGELOG D3 section + future-
wave declarations in code comments are untouched (they describe the NEXT
minor wave, not this one).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(audit-writer): route log() to event-ts ISO-week file, not wall-clock now

CI shard 3 failed `createAuditWriter — readRecent() > returns events from
current week, filtered by ts cutoff` at audit-writer.test.ts:229 with
`Expected: 2, Received: 0`.

Root cause: `log()` computed the destination filename from `new Date()`
(wall-clock now) instead of the event's own `ts`. Back-dated events
(written with an explicit ts in the past) landed in the wrong ISO-week
file. `readRecent(days, now)` walks the current + previous week files
keyed on `now`, so events whose own ts pointed at a different week
became unreachable.

The test passes ts=2026-05-21/16/14 and now=2026-05-22 (week 21 + 20).
CI runs on wall-clock 2026-05-25 (week 22). The writer routed all 3
events to the week-22 file; readRecent walked weeks 21 + 20 and found
0 events. Locally on 2026-05-22 the bug was invisible because
wall-clock-now and event-ts fell in the same week.

Fix in src/core/audit/audit-writer.ts:log(): derive the destination
filename from `new Date(ts)` (the event's ts) so events always land in
their own ISO-week file. NaN-guard falls back to wall-clock-now on
unparseable ts.
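
A sketch of the corrected routing, assuming UTC-based ISO 8601 week
numbering. The "audit-" file prefix and both function names are
hypothetical; only the event-ts derivation and the NaN fallback come
from the description above.

```typescript
// ISO 8601 week label (e.g. "2026-W21"), computed in UTC.
function isoWeekLabel(d: Date): string {
  const t = new Date(Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()));
  const day = t.getUTCDay() || 7; // Monday = 1 ... Sunday = 7
  t.setUTCDate(t.getUTCDate() + 4 - day); // move to the Thursday of this ISO week
  const yearStart = Date.UTC(t.getUTCFullYear(), 0, 1);
  const week = Math.ceil(((t.getTime() - yearStart) / 86_400_000 + 1) / 7);
  return `${t.getUTCFullYear()}-W${String(week).padStart(2, "0")}`;
}

// Route by the event's own ts; fall back to wall-clock now when ts is unparseable.
function auditFileFor(ts: string, now: Date): string {
  const parsed = new Date(ts);
  const when = Number.isNaN(parsed.getTime()) ? now : parsed;
  return `audit-${isoWeekLabel(when)}.jsonl`;
}
```

With now set to 2026-05-25 (week 22), an event stamped 2026-05-21
still lands in the week-21 file, which is where readRecent looks for
it.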

Test update at test/audit/audit-writer.test.ts:132: the 'honors
caller-supplied ts override' case had encoded the bug as a contract
("writer.log writes to current-week file regardless of event ts").
Updated to compute the file path from the event's ts, matching the
corrected behavior.

All 22 audit-writer tests pass. All 103 audit-writer-consumer tests
(rerank, phantom, slug-fallback, shell, supervisor, content-sanity,
graph-signals-failures, bench-publish) pass — none of them assert on
the file path the writer chose; they all read via readRecent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan-agents pushed a commit to garrytan-agents/gbrain that referenced this pull request Jun 13, 2026
…pts as first-class units, calibration profile widening, gstack-learnings bridge (garrytan#1364)

* feat(schema): migration v93 take_domain_assignments (v0.41 T1)

Adds the JOIN table backing per-pack calibration domain aggregation
in the v0.41 lens-packs wave. Replaces the originally-planned scalar
`takes.domain` column after codex outside-voice review caught that
one take can legitimately belong to multiple domains (a take about
"Sequoia's investment in Anthropic" lands in deal_success AND
market_call), and that scalar attribution bakes today's pack→domain
mapping into permanent fact.

Schema: composite PK (take_id, domain) for idempotent re-assignment,
FK CASCADE so deleting a take cascades assignments, confidence CHECK
in [0,1], idx_take_domain_assignments_domain for the aggregator JOIN
direction. RLS guard matches takes/synthesis_evidence pattern (enable
when running as BYPASSRLS role). PGLite parity via sqlFor.pglite.

Backward-compat: pre-existing takes carry no assignments; aggregator
LEFT JOIN skips them gracefully. No backfill required at migration
time — propose_takes (T10) populates new rows; greenfield assignment
of historical takes is a v0.42 follow-up.

R-MIG IRON-RULE regression at test/migrations-v93.test.ts pins 12
contracts: existence/name, LATEST_VERSION advance, table queryable
after initSchema, column shape, composite PK rejects duplicate
(take_id, domain), multi-domain assignment permitted, FK ON DELETE
CASCADE, CHECK rejects out-of-range confidence, index presence,
aggregator JOIN direction returns per-domain counts, sql/sqlFor.pglite
parity grep, backward-compat LEFT JOIN handles unassigned takes.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
First of 13 sequencing tasks in v0.41 lens packs + epistemology
unification wave (decisions D9-B → T1-B per codex challenge).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(contracts): IngestionSource.mode + pack manifest phases/calibration_domains (v0.41 T2+T3)

Two independent contract extensions, batched because both are pre-
requisites for T4 (pack YAML manifests) and T9 (cycle.ts orchestrator
gate). Neither is load-bearing alone; together they form the surface
the four lens-pack manifests will declare against.

T2 — IngestionSource.mode discriminator (codex outside-voice fix):
  src/core/ingestion/types.ts grows an optional `mode: 'trickle' |
  'migration'` field on IngestionSource. Defaults to 'trickle' when
  unset — v0.38 sources unchanged. New IngestionSourceMode export.
  src/core/ingestion/daemon.ts handleEmit() branches on the mode:
  trickle keeps the 24h DedupWindow.mark() path; migration bypasses
  dedup entirely (the source owns permanent slug-keyed idempotency
  via op_checkpoint or similar). Validation, rate limit, and dispatch
  apply uniformly to both modes.

  Why: the 24h content-hash dedup window is wrong for bulk historical
  migration. 24K wintermute pages over hours, retries days apart, and
  same-hash collisions across the window are expected. Trickle
  semantics (file-watcher, inbox-folder, webhook) want dedup to catch
  at-least-once replay; migration semantics want EVERY explicitly-
  emitted event to land because the source already gated it.
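
  A minimal sketch of the mode branch, under the assumption that
  DedupWindow.mark() returns whether the hash is new within the
  window. The class body, IngestionEvent shape, and shouldDispatch
  are illustrative stand-ins for the daemon's handleEmit() logic.

```typescript
type IngestionSourceMode = "trickle" | "migration";

interface IngestionEvent {
  content_hash: string;
}

// Assumed semantics: mark() records the hash and returns true only when
// it was not seen within the last windowMs milliseconds.
class DedupWindow {
  private seen = new Map<string, number>();
  private windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  mark(hash: string, nowMs: number): boolean {
    const last = this.seen.get(hash);
    if (last !== undefined && nowMs - last < this.windowMs) return false;
    this.seen.set(hash, nowMs);
    return true;
  }
}

// Trickle (the default when mode is unset) goes through the dedup window;
// migration bypasses it because the source owns its own idempotency.
function shouldDispatch(
  event: IngestionEvent,
  mode: IngestionSourceMode | undefined,
  dedup: DedupWindow,
  nowMs: number,
): boolean {
  if ((mode ?? "trickle") === "migration") return true;
  return dedup.mark(event.content_hash, nowMs);
}
```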

T3 — SchemaPackManifestSchema phases + calibration_domains:
  src/core/schema-pack/manifest-v1.ts grows two optional fields. New
  AGGREGATOR_KINDS closed enum (4 v1 algorithms: scalar_brier,
  weighted_brier, count_based, cluster_summary) backing
  AggregatorKind type. New CalibrationDomain {name, aggregator,
  page_types} schema with snake_case regex on name, .strict on extra
  fields, page_types.min(1).

  `phases: string[]` declares which cycle phases the active pack
  participates in (D4-B orchestrator gate; runCycle will consult this
  in T9). Validated as string here, against runtime CyclePhase union
  at the registry layer (avoids circular import). `borrow_from` does
  NOT borrow phases — each pack declares explicitly.

  `calibration_domains: CalibrationDomain[]` declares per-pack
  scorecard buckets. Closed registry of algorithm `aggregator` values
  keeps SQL injection surface closed; open `name` strings let third-
  party packs add domains without a gbrain release (T3 codex
  refinement of D6).

  Backward compat: both fields default to []. Existing v0.38 manifests
  parse unchanged (pinned by 2 regression cases).
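
  The CalibrationDomain constraints can be sketched with a hand-rolled
  validator. This is not the manifest-v1.ts schema itself: the exact
  snake_case regex is an assumption, and parseCalibrationDomain is a
  hypothetical name.

```typescript
const AGGREGATOR_KINDS = [
  "scalar_brier",
  "weighted_brier",
  "count_based",
  "cluster_summary",
] as const;
type AggregatorKind = (typeof AGGREGATOR_KINDS)[number];

interface CalibrationDomain {
  name: string;
  aggregator: AggregatorKind;
  page_types: string[];
}

// Assumed pattern: lowercase start, then lowercase letters, digits, underscores.
const SNAKE_CASE = /^[a-z][a-z0-9_]*$/;

function parseCalibrationDomain(input: unknown): CalibrationDomain {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    throw new Error("expected an object");
  }
  const obj = input as Record<string, unknown>;
  const allowed = ["name", "aggregator", "page_types"];
  for (const key of Object.keys(obj)) {
    if (!allowed.includes(key)) throw new Error(`unknown field: ${key}`); // strict
  }
  const name = obj["name"];
  if (typeof name !== "string" || !SNAKE_CASE.test(name)) {
    throw new Error("name must be snake_case");
  }
  const aggregator = obj["aggregator"];
  if (
    typeof aggregator !== "string" ||
    !(AGGREGATOR_KINDS as readonly string[]).includes(aggregator)
  ) {
    throw new Error("unknown aggregator"); // closed enum
  }
  const pageTypes = obj["page_types"];
  if (
    !Array.isArray(pageTypes) ||
    pageTypes.length < 1 ||
    !pageTypes.every((p) => typeof p === "string")
  ) {
    throw new Error("page_types needs at least one string");
  }
  return { name, aggregator: aggregator as AggregatorKind, page_types: pageTypes as string[] };
}
```

  Keeping aggregator closed while leaving name open is what lets a
  third-party pack add a domain without any new aggregation code.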

Tests:
  test/ingestion/migration-mode.test.ts (8 cases): mode type accepts
  literals, defaults to trickle, daemon branches correctly across
  trickle/migration/default-undefined, validation still runs in
  migration mode, mixed dual-source independence.

  test/schema-pack-manifest-v041.test.ts (19 cases): aggregator enum
  shape, phases default + accept + reject (non-string, empty, non-
  array), calibration_domains default + accept (single + multi entry,
  multi page_types), reject (unknown aggregator, kebab/uppercase/
  digit-start names, empty page_types, unknown extra field), v0.38
  back-compat regressions.

  All 27 cases pass first-green after API surface alignment.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Tasks T2 + T3 of 13 in v0.41 lens packs + epistemology unification wave.
Unblocks: T4 (pack manifests reference both fields), T9 (cycle.ts gate
reads phases:), T10 (calibration widening reads calibration_domains).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(packs): 4 bundled lens pack manifests + registry wiring (v0.41 T4)

Authors gbrain-creator + gbrain-investor + gbrain-engineer +
gbrain-everything as bundled YAML manifests in
src/core/schema-pack/base/, registers them in the BUNDLED array in
load-active.ts, exports AGGREGATOR_KINDS + AggregatorKind +
CalibrationDomain types through the schema-pack barrel.

gbrain-creator: atom (NEW page type) + concept (reuse from base).
  phases: [extract_atoms, synthesize_concepts]. One calibration
  domain: concept_themes / cluster_summary / [concept]. Retires
  wintermute's atom-pipeline-coordinator cron (T12 follow-up).

gbrain-investor: thesis + bet_resolution_log (NEW). Borrows
  deal/person/company/yc from base. No new cycle phases (consumes
  existing extract_facts/propose_takes/grade_takes pipeline). Three
  calibration domains: deal_success/scalar_brier/[deal],
  founder_evaluation/scalar_brier/[person], market_call/weighted_brier
  /[thesis]. Filing rules mirror wintermute's existing investing/deals
  + investing/theses + investing/bets layout.

gbrain-engineer: bridge-only per D8-C. ONLY declares `learning`
  page type (primitive: annotation); borrows code+project from base.
  No new cycle phases (gstack-learnings IngestionSource is daemon-
  side per T8). Three calibration domains: architecture_calls/
  scalar_brier/[code, learning], effort_estimates/weighted_brier/
  [project], risk_assessment/scalar_brier/[project].

gbrain-everything: meta-pack extending gbrain-investor + borrowing
  atom (from creator) + learning (from engineer). Codex outside-voice
  T4 resolution to the multi-lens problem: composes via the v0.38-
  shipped extends + borrow_from chain instead of inventing an
  active-multi-pack architecture. Single-active-pack constraint
  preserved. Explicitly re-declares phases + calibration_domains
  (borrow_from borrows types/link_types only — phases must be
  declared per pack per D4-B).

Frontmatter validators (atom_type closed 11-value enum, virality_
score range, etc.) are NOT declared in these manifests — that
contract surface (per-page-type frontmatter_validators on
PageTypeSchema) is a v0.42 follow-up filed in plan TODOs. For
v0.41, extract_atoms hardcodes the enum with a TODO comment
pointing at the eventual manifest read path (D11).

YAML parser caveat: src/core/schema-pack/loader.ts uses a hand-
rolled parseYamlMini (per loader.ts:86 explicit non-support of `|`
block scalars). Initial descriptions used `|` blocks and broke
parsing silently (description was 'literal "|"', everything after
collapsed). Reauthored to single-line "..." strings. Pinned by
the manifest-load tests asserting page_types/phases/calibration_
domains all resolve.

Tests:
  test/lens-pack-manifests.test.ts (31 cases): one file covers all
  4 packs to avoid 4x boilerplate. Pins parse cleanly, registry
  inclusion, per-pack page_types/phases/calibration_domains/filing_
  rules shape, every aggregator value falls in AGGREGATOR_KINDS,
  meta-pack unions correctly (7 calibration domains across all
  three lens packs).

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Task T4 of 13. Unblocks T5/T6 (phases now declared; phases read
from active pack at runtime), T7 (importer writes atom-typed
pages against creator manifest), T8 (gstack-learnings emits
learning-typed pages against engineer manifest), T9 (orchestrator
gate reads phases: declaration), T10 (calibration_profile walks
calibration_domains).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cycle): orchestrator-level pack gate for lens-pack phases (v0.41 T9)

Wires extract_atoms + synthesize_concepts into runCycle with the D4-B
orchestrator-level pack gate. Five surgical edits to src/core/cycle.ts:

  1. CyclePhase union grows by 2 names.
  2. ALL_PHASES inserts extract_atoms after extract_facts (Haiku 3-check
     has fresh fact context, BEFORE resolve_symbol_edges to avoid
     interrupting the symbol resolution sweep mid-flight) and
     synthesize_concepts after patterns (cluster pass sees fresh
     cross-session themes).
  3. PHASE_SCOPE entries: extract_atoms='source' (per-source transcript
     walk), synthesize_concepts='global' (concept clusters cross sources
     by nature).
  4. NEEDS_LOCK_PHASES adds both (put_page writes mutate DB).
  5. runCycle dispatch blocks for both phases consult packDeclaresPhase
     before invoking. When the active pack doesn't declare the phase,
     the phase is skipped with a reason='not_in_active_pack' marker.
     When it does, runCycle lazy-imports extract-atoms.ts /
     synthesize-concepts.ts and runs the phase.

The packDeclaresPhase helper is new at module-private scope. Loads the
active pack via loadActivePack({cfg, remote:false}); reads
resolved.manifest.phases (local only — D4-B). Fail-open: any registry
error (pack not found, malformed manifest) returns false. Skipping >
crashing for an orchestrator gate.
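
The gate can be sketched roughly as follows. This is an illustration under assumptions, not the code in cycle.ts: the loader is injected and synchronous here to keep the example self-contained (the real loadActivePack takes `{cfg, remote:false}` and may well be async), and `dispatchGatedPhase` is a hypothetical name.

```typescript
interface PackManifest {
  phases?: string[]; // optional; readers apply `?? []`
}
interface ResolvedPack {
  manifest: PackManifest;
}
type LoadActivePack = () => ResolvedPack;

// Returns true only when the active pack's OWN manifest lists the phase.
// Any loader error (pack not found, malformed manifest) yields false.
function packDeclaresPhase(loadActivePack: LoadActivePack, phase: string): boolean {
  try {
    const resolved = loadActivePack();
    return (resolved.manifest.phases ?? []).includes(phase);
  } catch {
    return false;
  }
}

type PhaseResult = { status: "ok" | "skipped"; reason?: string };

// Hypothetical dispatch wrapper: consult the gate, then run or skip.
function dispatchGatedPhase(
  loadActivePack: LoadActivePack,
  phase: string,
  run: () => PhaseResult,
): PhaseResult {
  if (!packDeclaresPhase(loadActivePack, phase)) {
    return { status: "skipped", reason: "not_in_active_pack" };
  }
  return run();
}
```

Because the gate reads only the local `phases` list, a pack that extends another one runs a phase only if it re-declares it.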

Local-only phase semantics (not extends-chain inherited) preserves user
sovereignty: a downstream pack extending gbrain-creator may NOT want
extract_atoms to run (e.g. derives atoms differently). Inheriting phases
would force them into a no-op-or-fork choice. The gbrain-everything
meta-pack therefore RE-DECLARES creator's phases verbatim in its own
manifest, asserted by the T4 test.

Stub phase modules ship in this commit:
  src/core/cycle/extract-atoms.ts → returns skipped with reason=
    'stub_pending_t5'
  src/core/cycle/synthesize-concepts.ts → returns skipped with reason=
    'stub_pending_t6'

T5/T6 replace the stub bodies with real LLM-driven phases. The
orchestrator dispatch is fully wired today and exercised by the test.

Manifest schema follow-on: phases + calibration_domains were originally
.default([]) but the type narrowing broke v0.38 fixture casts in
test/schema-pack-{lint-rules,registry,registry-reload}.test.ts.
Reverted to .optional(); consumers apply `?? []` at the read site.
Same pattern as IngestionSource.mode in T2. Updated T3 + T4 tests
to use `!` non-null assertion at sites that explicitly declared the
fields (typechecker can't narrow array literals through optional
boundaries).

Tests:
  test/cycle-pack-gating.test.ts (19 cases, R-GATE IRON RULE):
  ALL_PHASES + PHASE_SCOPE shape, ordering invariants (extract_atoms
  after extract_facts, synthesize_concepts after patterns), exhaustive
  PHASE_SCOPE map, NEEDS_LOCK_PHASES static-source assertion (both new
  phases included), dispatch consults packDeclaresPhase for BOTH new
  phases (and ONLY those two), packDeclaresPhase helper exists +
  reads manifest.phases (not merged chain) + fail-open returns false
  on catch, pre-existing 17 phases NEVER consult packDeclaresPhase
  (extract_facts + calibration_profile spot-checked), not_in_active_pack
  reason marker appears exactly 2x (semantic consistency across
  both gated phases).

  Adjacent test fixes: T3 + T4 tests updated for optional-field
  semantics. T2 dispatch type narrowed to DispatchOutcome shape from
  daemon.ts ({kind: 'queued'} for success path).

89/89 across T1+T2+T3+T4+T9 tests pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Task T9 of 13. Unblocks: T5 (extract-atoms.ts body replaces stub),
T6 (synthesize-concepts.ts body replaces stub).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(calibration): domain_scorecards widening + 4 aggregators (v0.41 T10)

Replaces the v0.36.1.0 placeholder `JSON.stringify({})` in
calibration-profile.ts:336 with a real aggregator pass over the active
pack's calibration_domains declarations. domain_scorecards JSONB now
populates per declared domain with {n, brier, accuracy, aggregator,
page_types, extras}.

New module: src/core/calibration/domain-aggregators.ts
  - aggregateDomainScorecards(engine, holder, domains, sourceId) → JSONB-shape
  - 4 aggregator implementations matching the AggregatorKind closed enum:
    - scalar_brier: AVG(POWER(weight - outcome::int, 2)). The default for
      most predictive domains. Filters by holder + page_types +
      resolved_outcome IS NOT NULL + active=TRUE + source_id.
    - weighted_brier: Brier weighted by ABS(weight - 0.5) * 2 (conviction
      proxy since takes table has no separate confidence column). A
      0.95-conviction miss weights 9x more than a 0.55-conviction one.
      Matches the investor pack's market_call semantics.
    - count_based: simple SUM(hit)/COUNT(*) accuracy without Brier.
      For domains where probability isn't natural.
    - cluster_summary: page count + tier histogram via
      frontmatter->>'tier' JSONB read. For concept_themes where there's
      no binary outcome to score. Returns {n, tier_counts: {T1, T2,
      T3, T4}}.
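
For intuition, the two Brier variants can be worked through in memory. This is a sketch, not the SQL in domain-aggregators.ts: the `Take` shape is hypothetical, and normalizing weighted_brier by the sum of conviction weights is an assumption about how the weighting is applied.

```typescript
interface Take {
  weight: number; // stated probability in [0, 1]
  outcome: boolean; // resolved outcome
}

// Mean squared error between stated probability and outcome (0 or 1).
function scalarBrier(takes: Take[]): number | null {
  if (takes.length === 0) return null;
  const total = takes.reduce((sum, t) => sum + (t.weight - Number(t.outcome)) ** 2, 0);
  return total / takes.length;
}

// Conviction proxy from the commit text: ABS(weight - 0.5) * 2.
function conviction(weight: number): number {
  return Math.abs(weight - 0.5) * 2;
}

// Squared errors weighted by conviction (assumed: divided by total conviction).
function weightedBrier(takes: Take[]): number | null {
  let numerator = 0;
  let denominator = 0;
  for (const t of takes) {
    const c = conviction(t.weight);
    numerator += c * (t.weight - Number(t.outcome)) ** 2;
    denominator += c;
  }
  return denominator === 0 ? null : numerator / denominator;
}

// A take is a hit when its side of 0.5 matches the outcome.
function accuracy(takes: Take[]): number | null {
  if (takes.length === 0) return null;
  const hits = takes.filter((t) => (t.weight >= 0.5) === t.outcome).length;
  return hits / takes.length;
}
```

With these definitions a 0.95 take carries conviction 0.9 and a 0.55 take carries 0.1, which is where the 9x ratio comes from.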

Wiring in src/core/cycle/calibration-profile.ts:
  Try/catch wraps the loadActivePack → aggregator chain. Empty {}
  scorecard on any pack-resolution error (R1 IRON RULE: byte-identical
  v0.36.1.0 baseline when no active pack declares domains). Warning
  appended to result.warnings so doctor surfaces silent failures
  instead of crashing the phase.

Per-domain fail-soft: aggregateOneDomain's try/catch returns
{n: 0, brier: null, accuracy: null, extras: {error}} for any single
malformed domain. The other domains still aggregate. Phase keeps
running.
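
A rough sketch of the per-domain fail-soft shape, assuming the aggregation for one domain is passed in as a callback. The real aggregateOneDomain catches internally; the wrapper name `aggregateAll` and the callback signature here are hypothetical.

```typescript
interface DomainScorecard {
  n: number;
  brier: number | null;
  accuracy: number | null;
  extras: Record<string, unknown>;
}

type AggregateOne = (domain: string) => DomainScorecard;

// One bad domain yields an error scorecard; the others still aggregate.
function aggregateAll(
  domains: string[],
  aggregateOne: AggregateOne,
): Record<string, DomainScorecard> {
  const out: Record<string, DomainScorecard> = {};
  for (const domain of domains) {
    try {
      out[domain] = aggregateOne(domain);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      out[domain] = { n: 0, brier: null, accuracy: null, extras: { error: message } };
    }
  }
  return out;
}
```

An empty domain list falls straight through the loop and serializes to `{}`, matching the baseline the R1 rule protects.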

Tests (test/domain-aggregators.test.ts, 13 cases):
  - R1 IRON RULE: empty domain list returns {} (byte-identical)
  - scalar_brier: empty no-takes returns n:0/null/null; 2-take
    Brier computed correctly (0.5 over (0, 1) sq_errs); accuracy
    matches weight>=0.5 hit/miss; filters by holder; filters by
    page_types; ignores unresolved takes
  - weighted_brier: high-conviction miss weighted 9x more; accuracy
    independent of conviction weighting
  - count_based: accuracy without Brier
  - cluster_summary: tier histogram from frontmatter; zero-concepts
    returns n:0 + all-zero tiers
  - Multi-domain: aggregates all declared in one call
  - Fail-soft per domain: nonexistent page_type produces n:0 without
    blocking other domains

89/89 across T1+T2+T3+T4+T9+T10 tests; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Task T10 of 13. The propose_takes-side wiring (populate
take_domain_assignments at write time from active pack's page_type→
domain mapping) is deferred to T5/T6 phase implementations, since
they are the natural producers of takes. Manual propose_takes via
fence write covers the operator path. v0.42+ adds a takes-fence
parser extension to read domain[] from fence rows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ingestion): gstack-learnings bridge source (v0.41 T8)

Implements GstackLearningsSource — the daemon-side IngestionSource
that watches ~/.gstack/projects/{repo}/learnings.jsonl and emits
each new line as a `learning`-typed IngestionEvent.

Closes the v0.40-and-earlier gap where gstack's typed engineering
knowledge base (7 learning types: pattern, pitfall, preference,
architecture, tool, operational, investigation) lived in JSONL files
the brain never queried. After T8 + the engineer-pack manifest
activation, every gstack-logged learning surfaces as a first-class
gbrain page within seconds of being written.

Lifecycle:
  - constructor: discovers JSONL files via ~/.gstack/projects/*/
    learnings.jsonl (cross-project mode, default) or just the current
    project (per-project mode). Test seam: _readFile/_existsSync/_skipWatch.
  - start(ctx): seeds seenLines with content_hashes of EVERY existing
    line so first-run-after-install does NOT replay thousands of
    historical lines as fresh emits. Then installs fs.watch handlers
    (one per discovered file) that fire rescanFile on 'change'.
  - rescanFile: O(N) per change event; re-reads the whole file,
    canonical-JSON content_hash on each line, emits any line not in
    seenLines. Malformed JSONL lines skip+warn.
  - stop(): closes all watchers; JSONL state preserved (gstack owns
    the files, gbrain only reads).
  - healthCheck(): reports warn when no files discovered (gstack not
    installed) OR when watched files have disappeared; ok otherwise
    with counter of lines seen.

mode: 'trickle' (the v0.41 T2 default). Line-level content_hash via
canonical-JSON serialization means whitespace reformatting doesn't
trigger re-emit. Re-emit of an identical line is a silent dedup hit
via the daemon's 24h DedupWindow (T2 trickle path).
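
The seed-then-rescan behaviour can be illustrated with a small in-memory sketch. It is not the GstackLearningsSource code: the class name `LineDeduper` is hypothetical, and the canonical JSON string itself is used as the dedup key here, where the real source hashes it into a content_hash.

```typescript
// Serialize with object keys sorted so key order and whitespace don't matter.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalJson).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    return `{${keys.map((k) => `${JSON.stringify(k)}:${canonicalJson(obj[k])}`).join(",")}}`;
  }
  return JSON.stringify(value);
}

// Parse a JSONL file body; blank and malformed lines are skipped.
function parseLines(content: string): { key: string; value: unknown }[] {
  const out: { key: string; value: unknown }[] = [];
  for (const line of content.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const value: unknown = JSON.parse(line);
      out.push({ key: canonicalJson(value), value });
    } catch {
      // Malformed line: skip (the real source also logs a warning).
    }
  }
  return out;
}

class LineDeduper {
  private readonly seen = new Set<string>();

  // Called once at start: mark every existing line as already seen.
  seed(content: string): void {
    for (const { key } of parseLines(content)) this.seen.add(key);
  }

  // Called on each change event: return only lines not seen before.
  rescan(content: string): unknown[] {
    const fresh: unknown[] = [];
    for (const { key, value } of parseLines(content)) {
      if (this.seen.has(key)) continue;
      this.seen.add(key);
      fresh.push(value);
    }
    return fresh;
  }
}
```

Seeding before installing watchers is what keeps a first run from replaying the whole history as new events.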

Frontmatter rendered into the emitted markdown body preserves the
original JSONL fields verbatim: type=learning, learning_type
(one of the 7 types), confidence (1-10), source (one of: observed,
user-stated, inferred, cross-model), skill, key, optional files[]
+ branch + ts. Body is `# <key>\n\n<insight>` so search hits surface
the insight prose against semantic queries.

Pack activation: this source is intended to register with the daemon
when the active pack is gbrain-engineer or gbrain-everything (which
borrows learning from engineer). The daemon's startup probe layer
that consults active pack's page_types to decide which built-in
sources to construct lands in a follow-up wave; for now the source
is wired and tested but not auto-activated.

Tests (test/ingestion/gstack-learnings.test.ts, 14 cases):
  - Basic contract: mode='trickle', id includes pid, kind='gstack-learnings'
  - Start seeds seenLines (historical lines NOT replayed)
  - Malformed JSONL lines skip without crashing
  - Blank lines + trailing newlines OK
  - emitLine: new line emits, identical line is silent dedup hit
  - Emitted body carries proper frontmatter (type, learning_type,
    confidence, source, skill, key, files, branch, ts)
  - Canonical-JSON content_hash dedup (whitespace reformat = hit)
  - healthCheck warn/ok states
  - describePaths diagnostic per-file existence + size

All 14 pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Task T8 of 13.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ingestion): wintermute-greenfield migration-mode importer (v0.41 T7)

Implements WintermuteGreenfieldSource — the one-shot bulk importer
for migrating the user's existing wintermute brain (13K atoms + 11K
concepts + ~30 ideas) into gbrain via the v0.41 lens packs.

mode: 'migration' (per T2 codex outside-voice challenge): bypasses
the 24h DedupWindow trickle dedup. Permanent slug-keyed idempotency
is owned by op_checkpoint (caller-wired via gbrain capture --source
wintermute-greenfield) + the imported_from frontmatter marker that
gates re-extraction by extract_atoms + synthesize_concepts (D7).

@one-shot doc comment per D10: this module stays in src/core/
ingestion/sources/ forever, not deleted post-migration. Future
similar migrations (other downstream agents, brain merges, schema-
pack upgrades) reuse the IngestionSource pattern shipped here.
Deleting the working example is short-sighted.

Walk:
  - ~/git/brain/atoms/{YYYY-MM-DD}/*.md (atoms, date-bucketed)
  - ~/git/brain/concepts/*.md (concepts, flat)
  - ~/git/brain/ideas/*.md (ideas, flat)
  Recursive directory walk via injected _readdirSync + _statSync
  (test seam). Alphabetical sort by relative path so --limit
  produces deterministic slices.

Per file:
  1. Read content; gray-matter parses frontmatter + body
  2. Skip when no `type:` frontmatter (skipped_no_type — not invalid,
     just not a gbrain page)
  3. Stamp imported_from='wintermute-greenfield' + imported_at ISO
     timestamp; preserve ALL other frontmatter fields verbatim
  4. Re-stringify via matter.stringify
  5. Emit IngestionEvent with content_type='text/markdown',
     untrusted_payload=false (local user-owned files), metadata
     carrying slug + page_type + original_path + original_frontmatter
     + importer + importer_version
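
Steps 2 and 3, plus the deterministic --limit slice, might look roughly like this on already-parsed frontmatter. The function names and the plain default sort comparator are assumptions; frontmatter parsing via gray-matter is left out to keep the sketch dependency-free.

```typescript
type Frontmatter = Record<string, unknown>;

type StampResult =
  | { status: "stamped"; frontmatter: Frontmatter }
  | { status: "skipped_no_type" };

// Skip files without a `type:` field; otherwise add the import markers
// while leaving every original field untouched.
function stampFrontmatter(fm: Frontmatter, now: Date): StampResult {
  if (typeof fm.type !== "string" || fm.type === "") {
    return { status: "skipped_no_type" };
  }
  return {
    status: "stamped",
    frontmatter: {
      ...fm,
      imported_from: "wintermute-greenfield",
      imported_at: now.toISOString(),
    },
  };
}

// Sort relative paths before slicing so --limit N always picks the same files.
function selectFiles(relativePaths: string[], limit?: number): string[] {
  const sorted = [...relativePaths].sort();
  return limit === undefined ? sorted : sorted.slice(0, limit);
}
```

Sorting a copy rather than the input keeps the walk order available to callers that want it.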

Per-row validation failure → JSONL audit at
~/.gbrain/audit/wintermute-greenfield-failures-YYYY-Www.jsonl per
D12. Failed-file processing continues (don't fail-fast on one bad
row). Audit dir created lazily via mkdirSync recursive on first
write.

CLI flags supported via opts:
  --dry-run: walks + validates + stamps but doesn't emit
  --limit N: processes only the first N files (alphabetical)

The CLI surface lands via gbrain capture --source wintermute-greenfield
in a follow-up commit (capture.ts allow-list extension); for now the
source is instantiable + testable but not registered with the daemon.

Tests (test/ingestion/wintermute-greenfield.test.ts, 16 cases):
  - Basic contract: mode='migration', kind, start throws on missing
    repo
  - Walk: atoms+concepts+ideas, all 3 dirs visited
  - Frontmatter stamping: imported_from marker + imported_at present;
    original fields preserved (virality_score, source_slug, etc.)
  - Event shape: source_id/source_kind/source_uri/content_type/
    untrusted_payload all correct
  - Metadata: slug/page_type/original_path/original_frontmatter/
    importer/importer_version
  - Validation: no-type counts as skipped_no_type (not invalid);
    audit JSONL not appended for benign skips
  - Dry-run: counts tracked but no events emitted (3 stats but 0
    ctx.emitted)
  - --limit: only N files processed
  - Deterministic ordering: alphabetical relative-path sort means
    --limit 1 always picks the alphabetically-first file
  - healthCheck: ok after clean run; warn before start

All 16 pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Task T7 of 13.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cycle): extract_atoms + synthesize_concepts minimal-viable bodies (v0.41 T5+T6)

Replaces the T9-shipped stub modules with working LLM-driven phase
bodies. v0.41 ships the right SHAPE — Haiku per transcript producing
1-3 atoms, atoms grouped by concept frontmatter ref, tier assignment
by count, Sonnet narrative for T1/T2. The richer 3-check quality gate
(truism/punchline/entity multi-pass), embedding-similarity dedup, voice
gate integration, op_checkpoint resumability all land in v0.41.1+ —
filed as inline TODOs and plan follow-ups.

T5 extract_atoms (src/core/cycle/extract-atoms.ts):
  - Takes transcripts via _transcripts test seam OR discoverTranscripts
    production path (lazy-imports transcript-discovery.ts to avoid
    circular module loads through cycle.ts).
  - Per transcript: ONE Haiku call with the 11-value atom_type enum
    embedded in the prompt (matches gbrain-creator.yaml declaration;
    v0.42 reads from active pack manifest at runtime per D11).
  - parseAtomsResponse tolerates markdown fences + trailing prose;
    rejects invalid atom_type values; clamps virality_score to [0,100];
    rejects malformed entries silently (skip don't crash).
  - Per atom: putPage atom-typed page under atoms/{YYYY-MM-DD}/
    {slug-from-title}. Frontmatter preserves atom_type, source_quote,
    lesson, virality_score, emotional_register from the LLM output.
  - Budget cap $0.30/source/run (DEFAULT_BUDGET_USD); over-budget
    transcripts counted as budget-skipped, phase returns status='warn'
    if any failures occurred.
  - Source-scoped: opts.sourceId routes corpus dir + write target.
  - dry-run: counts but doesn't write pages.
  - Failures tracked per-transcript without halting the run.
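
A tolerant parser along the lines described for parseAtomsResponse could be sketched as below. This is an assumption-heavy illustration: the real function's signature is not shown in this commit, the allowed atom_type set is passed in rather than hardcoded, and slicing from the first `[` to the last `]` is one simple way to shed fences and trailing prose (it would misfire if the prose itself contained a `]`).

```typescript
interface Atom {
  title: string;
  atom_type: string;
  virality_score: number;
}

function parseAtomsResponse(raw: string, allowedTypes: ReadonlySet<string>): Atom[] {
  // Keep only the outermost JSON array, dropping fences and surrounding prose.
  const start = raw.indexOf("[");
  const end = raw.lastIndexOf("]");
  if (start === -1 || end <= start) return [];

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.slice(start, end + 1));
  } catch {
    return []; // garbage in, empty list out
  }
  if (!Array.isArray(parsed)) return [];

  const atoms: Atom[] = [];
  for (const item of parsed) {
    if (item === null || typeof item !== "object") continue;
    const { title, atom_type, virality_score } = item as Record<string, unknown>;
    // Malformed entries are skipped silently rather than failing the batch.
    if (typeof title !== "string" || title === "") continue;
    if (typeof atom_type !== "string" || !allowedTypes.has(atom_type)) continue;
    if (typeof virality_score !== "number" || Number.isNaN(virality_score)) continue;
    atoms.push({
      title,
      atom_type,
      virality_score: Math.min(100, Math.max(0, virality_score)), // clamp to [0, 100]
    });
  }
  return atoms;
}
```

Passing the enum in as a set would also make the planned switch to a manifest-driven enum a call-site change only.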

T6 synthesize_concepts (src/core/cycle/synthesize-concepts.ts):
  - Takes atoms via _atoms test seam OR DB query for type='atom' pages
    excluding imported_from frontmatter marker (D7 skip).
  - Groups atoms by frontmatter `concepts:` array ref.
  - Tier by count: T1 >=10, T2 >=5, T3 >=2. T4 is deferred (groups
    with <2 atoms are filtered out).
  - T1/T2 groups: Sonnet call with up to 10 sample titles + 5 sample
    bodies → 1-paragraph narrative. Budget cap $1.50/run; over-budget
    or LLM-failed groups fall back to deterministic narrative.
  - T3 groups: deterministic narrative (no LLM call).
  - Per group: putPage concept-typed page at concepts/{title-from-slug}
    with tier + mention_count + composite_score frontmatter.
  - dry-run + yieldDuringPhase honored.
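
The grouping and tier thresholds above reduce to a few lines. A sketch only: the `AtomRef` shape and function names are hypothetical, and the narrative and putPage steps are omitted.

```typescript
type Tier = "T1" | "T2" | "T3";

// Thresholds from the commit text; groups below T3 get no tier.
function tierForCount(count: number): Tier | null {
  if (count >= 10) return "T1";
  if (count >= 5) return "T2";
  if (count >= 2) return "T3";
  return null;
}

interface AtomRef {
  title: string;
  concepts?: string[]; // frontmatter `concepts:` refs
}

interface ConceptGroup {
  concept: string;
  tier: Tier;
  mention_count: number;
}

// Group atoms by exact concept ref, then keep only groups that reach a tier.
function tierGroups(atoms: AtomRef[]): ConceptGroup[] {
  const counts = new Map<string, number>();
  for (const atom of atoms) {
    for (const concept of atom.concepts ?? []) {
      counts.set(concept, (counts.get(concept) ?? 0) + 1);
    }
  }
  const groups: ConceptGroup[] = [];
  counts.forEach((count, concept) => {
    const tier = tierForCount(count);
    if (tier !== null) groups.push({ concept, tier, mention_count: count });
  });
  return groups;
}
```

Atoms with no concept refs never enter the map, so they drop out without a special case.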

Tests (test/cycle/extract-atoms-synthesize-concepts.test.ts, 19 cases):
  parseAtomsResponse: well-formed JSON, markdown fences stripped,
  trailing prose tolerated, invalid atom_type rejected, missing fields
  rejected, garbage returns [], all 11 atom_type values accepted,
  virality_score clamped to [0,100].

  runPhaseExtractAtoms: no-op without transcripts, extracts via stub
  chat + writes pages, dry-run counts without writing, failures
  tracked per-transcript without halting.

  runPhaseSynthesizeConcepts: no-op without atoms, groups by concept
  ref + tier assignment by count (T1=12 atoms, T2=6, T3=3), atoms
  without concept refs filtered out, <T3 threshold (1 atom) filtered,
  T3 uses deterministic (no LLM call), dry-run counts without writing,
  T1 narrative comes from LLM stub verbatim.

All 19 pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Tasks T5 + T6 of 13. v0.41.1 follow-ups inline:
  - extract_atoms: read atom_type enum from active pack at runtime (D11)
  - extract_atoms: 3-check quality gate as multi-pass refinement
  - synthesize_concepts: embedding-similarity dedup (currently exact-
    string concept ref match only)
  - synthesize_concepts: voice gate for T1 Canon narratives
  - Both: op_checkpoint resumability for cross-cycle continuation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(v0.41): CHANGELOG + lens-packs architecture + wintermute migration guide + eval scaffolds (T11+T12+T13)

Closes out the v0.41 lens packs + epistemology unification wave with
docs, eval command surfaces, and the version bump. Three tasks batched
because each is small and standalone:

T11 — 3 eval command scaffolds:
  src/commands/eval-extract-atoms.ts
  src/commands/eval-synthesize-concepts.ts
  src/commands/eval-wintermute-greenfield.ts

  Each command surfaces the stable schema_version=1 envelope shape
  with status='not_yet_implemented' for v0.41. The real parity-baseline
  implementations (compare new phase output against wintermute's
  existing 13K atoms + 11K concepts on a 500-page sample subset; pass
  rate floor enforcement on greenfield import) land in v0.41.1. The
  scaffolds let users discover the commands AND give the v0.41.1 work
  a clear extension point. Pinned by 7 scaffold tests.

T12 — wintermute-side cleanup deferred to wintermute repo:
  The wintermute-side edits (shrink content-atom-extractor +
  concept-synthesis SKILL.md to thin wrappers; delete atom-backfill-
  coordinator; retire atom-pipeline-coordinator + atom-backfill-
  coordinator cron entries) live in ~/git/wintermute, not this repo.
  The migration guide (docs/migrations/v0.41-wintermute-greenfield.md
  below) documents the cleanup steps. Operator runs them after
  verifying the greenfield import.

T13 — Documentation:
  CHANGELOG.md: full v0.41.0.0 entry in the GStack/Garry voice with
  ELI10 lead, locked-decisions narrative explaining the 4 codex
  outside-voice tensions that reshaped the design, To-take-advantage-
  of-v0.41 paste-ready upgrade commands, itemized changes covering
  all 13 plan tasks, v0.41.1 follow-ups list.

  docs/architecture/lens-packs.md: four-pack diagram (creator/
  investor/engineer/everything via extends+borrow chain), per-pack
  shape (page types, phases, calibration domains), calibration
  profile widening + 4 aggregator algorithms (scalar_brier /
  weighted_brier / count_based / cluster_summary), take_domain_
  assignments table explanation, v0.41.1 follow-ups.

  docs/migrations/v0.41-wintermute-greenfield.md: operator guide
  for the bulk 24K-page migration. Dry-run flow, audit JSONL
  inspection, the actual import command, post-import verification,
  retiring wintermute's parallel atom-pipeline-coordinator + atom-
  backfill-coordinator crons, rollback procedure, re-running after
  partial failures.

Version bump: VERSION + package.json → 0.41.0.0.

All 158 tests across 10 v0.41 test files pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Final tasks T11 + T12 + T13 of 13. Wave shipped end-to-end across
11 commits on this branch:
  9e17d00  T1: migration v93 take_domain_assignments
  f4b2648  T2+T3: IngestionSource.mode + manifest schema extensions
  cefaad3  T4: 4 bundled lens pack manifests
  1850613  T9: cycle.ts orchestrator-level pack gate
  c6f3349  T10: calibration_profile widening + 4 aggregators
  d1964ef  T8: gstack-learnings bridge source
  adcaf4a  T7: wintermute-greenfield migration-mode importer
  0318229  T5+T6: extract_atoms + synthesize_concepts bodies
  (this)    T11+T12+T13: eval scaffolds + docs + version bump

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tests): bump phase-count assertions from 17→19 (v0.41 follow-on)

v0.41 added extract_atoms + synthesize_concepts to ALL_PHASES.
Three existing test files pinned the count at 17 via load-bearing
regression assertions:

  test/phase-scope-coverage.test.ts:48-49
    expect(ALL_PHASES.length).toBe(17)
    expect(Object.keys(PHASE_SCOPE).length).toBe(17)

  test/core/cycle.serial.test.ts:393
    expect(hookCalls).toBe(17)  // yieldBetweenPhases hook fires per phase

  test/core/cycle.serial.test.ts:406
    expect(report.phases.length).toBe(17)

  test/e2e/cycle.test.ts:110
    expect(report.phases.length).toBe(17)

Bumping them is the correct fix: the assertions exist precisely to
catch this case (a PR that adds a phase without updating downstream
consumers). The wave's v0.41 commit (T9) updated ALL_PHASES but
missed these three files. They are now updated to 19, with comment
breadcrumbs preserving the version history (v0.26.5 → 9,
v0.29 → 10, v0.31 → 11, v0.32.2 → 12, v0.33.3 → 13,
v0.36.1.0 → 16, v0.39.0.0 → 17, v0.41.0.0 → 19).

Without this fix: full unit test suite (`bun run test`) shows 3
failures from these assertions. Underlying v0.41 logic was already
green; this is pure pin-bumping.

After fix: 9059 unit tests pass. 0 actual test failures. (3 shard
wedges remain from unrelated long-running parallel-runner tests
that exceed the 600s per-shard cap — infra concern, not test
logic, pre-dates this wave.)

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Wave gate: all 13 plan tasks done; all v0.41 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(e2e): update EXPECTED_PHASES for v0.41 (extract_atoms + synthesize_concepts + schema-suggest)

E2E test/e2e/dream-cycle-phase-order-pglite.test.ts pinned the canonical
phase sequence at 16 entries. v0.41 added extract_atoms (after
extract_facts) and synthesize_concepts (after patterns); v0.39 had
already added schema-suggest between orphans and purge. EXPECTED_PHASES
was missing all three.

This is the correct fix — the test exists specifically to catch a PR
that adds a phase without updating consumers, and it fired exactly as
designed. EXPECTED_PHASES now holds the v0.41 19-phase sequence, with
comment breadcrumbs (v0.39.0.0 schema-suggest, v0.41.0.0 extract_atoms
+ synthesize_concepts).

Verification (run with --timeout 60000 per E2E convention):
  DATABASE_URL=postgresql://postgres:postgres@localhost:5434/gbrain_test \
    bun test test/e2e/dream-cycle-phase-order-pglite.test.ts --timeout 60000
  → 5 pass, 0 fail

Other E2E failures observed in the full run are pre-existing /
environmental and not v0.41 regressions:
  - dream-synthesize-chunking: existing flake (synthesize details
    shape under withoutAnthropicKey)
  - fresh-install-pglite: env has multiple embedding providers
    configured; requires explicit --embedding-model disambiguation
  - http-transport: last_used_at debounce timing flake
  - ingestion-roundtrip: file-watcher trickle-mode timing flake
  - mechanical: gbrain doctor exits 1 because user's persistent
    ~/.gbrain has wedged migrations + reranker auth warnings
  - autopilot-fanout-postgres: pre-existing dispatch-selector
    timestamp semantics

None of those 6 are touched by the v0.41 wave. Filing them as
unrelated maintenance items.

Plan: ~/.claude/plans/system-instruction-you-are-working-toasty-milner.md
Wave gate: 13 plan tasks done; v0.41 unit tests green; v0.41 E2E
green; pre-existing E2E flakes unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(e2e): 4 root-cause fixes for pre-existing E2E flakes (master polish)

After merging origin/master (which landed v0.40.8.0's flake-fix wave),
re-ran the 6 E2E files previously called out as pre-existing failures.
v0.40.8.0 had already fixed 2; the remaining 4 had real root causes:

1. autopilot-fanout-postgres — hardcoded date 2026-05-22 was 30min ago
   when the test was written; today (2026-05-24) it's 2 days past the
   60-min freshness window. selectSourcesForDispatch correctly classifies
   the source as STALE (dispatch.length=1) instead of FRESH (length=0).
   Fix: replace literal date with Date.now() - 30 * 60 * 1000 so the
   timestamp stays relative-fresh forever.

2. ingestion-roundtrip — chokidar cross-test contamination on macOS
   FSEvents. Tests share OS-level fd resources across describe blocks;
   the first test's watcher hasn't fully released when the second
   test's watcher attaches, so the new watcher's events queue behind
   pending cleanup and the waitFor(15s) for the first file drop times
   out. Fixes:
     - Move fs.mkdirSync(inboxDir) BEFORE createInboxFolderSource +
       daemon.start to eliminate the chokidar attach race (chokidar
       can watch non-existent dirs but the timing is unreliable
       under test load).
     - Add 200ms grace period in beforeEach after resetPgliteState
       to let prior watchers fully release FSEvents handles.
     - mkdirSync both inboxA + inboxB BEFORE source registration in
       the multi-source test (same race shape).
     - Bump waitFor timeouts 6s → 15s for fs.watch flake tolerance.

3. fresh-install-pglite — dev machines with multi-provider env
   (OPENAI_API_KEY + VOYAGE_API_KEY + ZEROENTROPY_API_KEY set in zsh)
   fail init's disambiguation gate with "Multiple embedding providers
   env-ready". The test sets ZE_API_KEY but doesn't NEGATE the others.
   Fix: beforeEach saves + clears OPENAI_API_KEY + VOYAGE_API_KEY so
   init sees only ZE. afterEach restores. Hermetic per dev machine.

4. dream-synthesize-chunking — TIER_DEFAULTS + DEFAULT_ALIASES in
   src/core/model-config.ts had BARE Anthropic model ids (e.g.
   'claude-sonnet-4-6' instead of 'anthropic:claude-sonnet-4-6'). The
   v0.40.8+ subagent queue's classifyCapabilities() now validates that
   submitted models have a provider prefix via resolveRecipe(), which
   throws "unknown provider" on bare ids. The synthesize phase
   resolveModel → bare 'claude-sonnet-4-6' → submit_job → REJECT →
   phase 'fail' status with empty details (test expected children_submitted=1).
   Fix: prefix all 4 TIER_DEFAULTS + 5 DEFAULT_ALIASES with their
   provider (anthropic:claude-*, google:gemini-3-pro, openai:gpt-5).
   Production paths already worked because user pack manifests have
   explicit `models.tier.subagent = anthropic:...`; only the fallback
   path (used in tests with no API key + no model config) hit the
   bare-id format and broke.
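
Fix 1 is the classic hardcoded-date trap. A small sketch, assuming a freshness predicate of this shape (`isFresh` is hypothetical and stands in for the relevant check inside selectSourcesForDispatch):

```typescript
const FRESHNESS_WINDOW_MS = 60 * 60 * 1000; // 60-minute window from the commit text

// Hypothetical predicate: a source is fresh if it synced within the window.
function isFresh(lastSyncMs: number, nowMs: number): boolean {
  return nowMs - lastSyncMs <= FRESHNESS_WINDOW_MS;
}

// Brittle fixture: correct only around the day the test was written.
function hardcodedFixture(): number {
  return Date.UTC(2026, 4, 22, 12, 0, 0); // 2026-05-22 (months are 0-based)
}

// Robust fixture: always 30 minutes before "now".
function relativeFixture(nowMs: number): number {
  return nowMs - 30 * 60 * 1000;
}
```

Taking `nowMs` as a parameter rather than calling Date.now() inside the fixture keeps the example deterministic; the actual fix uses Date.now() directly.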

Verification (all run against DATABASE_URL=...:5434/gbrain_test):
  test/e2e/autopilot-fanout-postgres.test.ts → 6/6 pass
  test/e2e/dream-cycle-phase-order-pglite.test.ts → 5/5 pass
  test/e2e/dream-synthesize-chunking.test.ts → 4/4 pass
  test/e2e/fresh-install-pglite.test.ts → 2/2 pass
  test/e2e/http-transport.test.ts → 8/8 pass
  test/e2e/ingestion-roundtrip.test.ts → 3/3 pass
  test/e2e/mechanical.test.ts → 78/78 pass
  Total: 106/106 pass, 0 fail.

Adjacent unit tests verified green:
  test/anthropic-model-ids.test.ts → 6/6 pass
  test/model-config.serial.test.ts → 19/19 pass

typecheck clean.

Plan: v0.41 wave (~/.claude/plans/system-instruction-you-are-working-toasty-milner.md).
Post-merge polish — every E2E failure surfaced in the v0.41 ship reports is now green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(v0.42.0.0): privacy sweep + queue rebump + 5 pre-existing test fixes

Privacy: rename `wintermute-greenfield` → `markdown-greenfield` identifier
across 13 files + 4 file renames per CLAUDE.md:550 (banned private-fork name
in public artifacts). Identifier shipped through the lens-pack wave as the
long-lived migration-mode source kind; sweep includes class names
(MarkdownGreenfieldSource), frontmatter marker, audit JSONL path, eval
command, and operator doc filename. Reframe contextual mentions per
OpenClaw substitution rule ("your OpenClaw"/"upstream OpenClaw").

Queue: rebump v0.41.0.0 → v0.42.0.0 (PR garrytan#1352 claims v0.41.0.0 in queue);
sweeps 38 v0.41 → v0.42 references across branch-introduced files; renames
docs/migrations/v0.41-markdown-greenfield.md → v0.42-markdown-greenfield.md,
test/schema-pack-manifest-v041.test.ts → -v042, test/eval-v041-scaffolds →
test/eval-v042-scaffolds. Pre-existing master files referencing v0.41 left
untouched (those describe master's own anticipated wave).

Test fixes (5 pre-existing failures + 1 shard wedge, all unrelated to lens
packs but caught by the post-merge run):
- src/core/anthropic-pricing.ts: estimateMaxCostUsd strips `anthropic:`
  provider prefix before ANTHROPIC_PRICING lookup. v0.31.12 introduced
  provider-prefixed model strings; the budget meter wasn't updated and
  fell through to BUDGET_METER_NO_PRICING (budget gate disabled), letting
  auto-think submissions complete when the test expected budget exhaustion
  to force partial/skipped.
- test/longmemeval-trajectory-routing.test.ts: perf-gate cap 10s → 30s.
  Test runs ~4s isolated; parallel-shard CPU contention pushes it to 16s.
  30s still catches genuine cold-path regressions.
- test/search/embedding-column.test.ts → .serial.test.ts: quarantine to
  serial pass (depends on gateway module-state set by bunfig.toml preload;
  other parallel tests' resetGateway() leaves stale state).
- scripts/run-unit-parallel.sh: SHARD_TIMEOUT 600s → 900s. Shard 8's
  migration test suite runs 1369 tests in 807s (all pass); 600s wrapper
  cap was killing healthy shards.
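
The budget-meter fix in the first bullet above can be sketched roughly as follows. This is an illustration under assumptions: `PRICING_SKETCH`, `stripProviderPrefix`, and `estimateMaxCostSketch` are hypothetical names, the rates are placeholders rather than real prices, and returning `null` stands in for the BUDGET_METER_NO_PRICING fall-through.

```typescript
type PricingSketch = { inputPerMTok: number; outputPerMTok: number };

// Placeholder rates for illustration only; these are NOT real prices.
const PRICING_SKETCH = new Map<string, PricingSketch>([
  ["claude-sonnet-4-6", { inputPerMTok: 1, outputPerMTok: 2 }],
]);

// The pricing table is keyed by bare model ids, so drop the provider prefix first.
function stripProviderPrefix(modelId: string): string {
  const prefix = "anthropic:";
  return modelId.startsWith(prefix) ? modelId.slice(prefix.length) : modelId;
}

function estimateMaxCostSketch(
  modelId: string,
  maxInputTokens: number,
  maxOutputTokens: number,
): number | null {
  const pricing = PRICING_SKETCH.get(stripProviderPrefix(modelId));
  // null models the "no pricing" case, where the budget gate is effectively disabled.
  if (pricing === undefined) return null;
  return (
    (maxInputTokens * pricing.inputPerMTok + maxOutputTokens * pricing.outputPerMTok) /
    1000000
  );
}
```

Without the prefix strip, a lookup for `anthropic:claude-sonnet-4-6` would miss the table and take the no-pricing path, which is the failure the bullet describes.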

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: update project documentation for v0.42.0.0

Sweep v0.41 → v0.42.0.0 drift across the wave's release-summary and the
two new doc files. The wave shipped under its planning-time name (v0.41);
the queue rebump to v0.42.0.0 left a handful of factual references
pointing at the wrong version.

- CHANGELOG.md v0.42.0.0 entry: doc-ref filename, follow-up version
  label, and 4 in-prose v0.41 cites corrected to v0.42.0.0 / v0.42.0.1.
- docs/architecture/lens-packs.md: title + body + follow-up section
  corrected to v0.42.0.0 / v0.42.0.1.
- docs/migrations/v0.42-markdown-greenfield.md: title + upgrade
  command text corrected to v0.42.0.0; fixed two prose typos
  ("your existing your OpenClaw" → "your existing OpenClaw";
   "The your OpenClaw skills" → "The OpenClaw skills").

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: rebump v0.42.0.0 → v0.41.2.0 (per user; patch slot on v0.41 line)

PRs garrytan#1352 and garrytan#1367 both claim v0.41.0.0 in queue (the .0 slot is contested);
v0.41.2.0 is unclaimed and represents this wave as a PATCH on the v0.41 line
rather than a separate minor wave.

Sweeps v0.42.0.0 → v0.41.2.0 across CHANGELOG + 2 docs + 4 yaml + 4 ts + 2
test files; renames docs/migrations/v0.42-markdown-greenfield.md →
v0.41.2-markdown-greenfield.md and 2 test files (-v042 → -v041_2).

Wave-identity tags ("v0.41 T4" etc) in test/code comments correctly
preserved — this IS a v0.41 wave patch, not a new wave. macOS sed `\b`
limitation means those tags were never converted in the first place;
verified intentional preservation.

Forward references to v0.42 in TODOS.md + CHANGELOG D3 section + future-
wave declarations in code comments are untouched (they describe the NEXT
minor wave, not this one).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(audit-writer): route log() to event-ts ISO-week file, not wall-clock now

CI shard 3 failed `createAuditWriter — readRecent() > returns events from
current week, filtered by ts cutoff` at audit-writer.test.ts:229 with
`Expected: 2, Received: 0`.

Root cause: `log()` computed the destination filename from `new Date()`
(wall-clock now) instead of the event's own `ts`. Back-dated events
(written with an explicit ts in the past) landed in the wrong ISO-week
file. `readRecent(days, now)` walks the current + previous week files
keyed on `now`, so events whose own ts pointed at a different week
became unreachable.

The test passes ts=2026-05-21, 2026-05-16, and 2026-05-14 with now=2026-05-22 (ISO weeks 21 and 20).
CI runs on wall-clock 2026-05-25 (week 22). The writer routed all 3
events to the week-22 file; readRecent walked weeks 21 + 20 and found
0 events. Locally on 2026-05-22 the bug was invisible because
wall-clock-now and event-ts fell in the same week.

Fix in src/core/audit/audit-writer.ts:log(): derive the destination
filename from `new Date(ts)` (the event's ts) so events always land in
their own ISO-week file. NaN-guard falls back to wall-clock-now on
unparseable ts.
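
A rough sketch of the routing rule described above, assuming UTC-based ISO-8601 week numbering. The helper names and the `audit-YYYY-Www.jsonl` filename pattern are hypothetical; the actual naming in audit-writer.ts may differ.

```typescript
// ISO-8601 week: the week's year and number are those of its Thursday.
function isoWeekOf(date: Date): { year: number; week: number } {
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  const dayOfWeek = d.getUTCDay() || 7; // Monday = 1 ... Sunday = 7
  d.setUTCDate(d.getUTCDate() + 4 - dayOfWeek); // move to this week's Thursday
  const yearStart = Date.UTC(d.getUTCFullYear(), 0, 1);
  const week = Math.ceil(((d.getTime() - yearStart) / 86400000 + 1) / 7);
  return { year: d.getUTCFullYear(), week };
}

// Hypothetical filename scheme; routes by the event's own ts, not wall-clock now.
function auditFileNameFor(ts: string, now: Date = new Date()): string {
  const parsed = new Date(ts);
  // NaN guard: an unparseable ts falls back to wall-clock now.
  const when = Number.isNaN(parsed.getTime()) ? now : parsed;
  const { year, week } = isoWeekOf(when);
  const ww = week < 10 ? "0" + week : String(week);
  return `audit-${year}-W${ww}.jsonl`;
}
```

With this rule, an event stamped 2026-05-21 lands in the week-21 file even when the writer runs on 2026-05-25 (week 22), so a reader walking weeks 21 and 20 still finds it.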

Test update at test/audit/audit-writer.test.ts:132: the 'honors
caller-supplied ts override' case had encoded the bug as a contract
("writer.log writes to current-week file regardless of event ts").
Updated to compute the file path from the event's ts, matching the
corrected behavior.

All 22 audit-writer tests pass. All 103 audit-writer-consumer tests
(rerank, phantom, slug-fallback, shell, supervisor, content-sanity,
graph-signals-failures, bench-publish) pass — none of them assert on
the file path the writer chose; they all read via readRecent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>