v0.23.2 fix(dream): orchestrator-stamped self-consumption marker + verdict model tests by garrytan · Pull Request #527 · garrytan/gbrain

garrytan · 2026-04-30T09:08:26Z

Summary

Replaces v0.23.1's content-prefix self-consumption guard with an orchestrator-stamped frontmatter marker. Codex review of the v0.23.1 plan caught two flaws: real serialized brain pages don't always contain their own slug in the body (so the prefix guard could miss real dream output), and real conversation transcripts often DO mention brain slugs (so the prefix guard silently dropped legitimate transcripts). v0.23.2 swaps content inference for explicit identity stamped at render time. In addition, the v0.23.1 verdict-model config now has unit-test coverage for the advertised `gbrain config set dream.synthesize.verdict_model ...` contract.

Self-consumption guard, marker-based

synthesize.ts renderPageToMarkdown and writeSummaryPage now stamp dream_generated: true + dream_cycle_date into frontmatter at render time. Both functions exported for testing. The DB-stored frontmatter persists the marker across re-renders.
transcript-discovery.ts replaces DREAM_OUTPUT_SLUGS with DREAM_OUTPUT_MARKER_RE (anchored at frontmatter open with optional BOM, CRLF tolerant, scans first 2000 chars for dream_generated: true with word boundary). Stderr log fires when the guard skips a file — no more silent skips.
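
A minimal sketch of what a marker check with these properties could look like. The regex body and the `SCAN_LIMIT` name are assumptions made for illustration; the real `DREAM_OUTPUT_MARKER_RE` in transcript-discovery.ts may be written differently. Unlike a stricter version, this sketch does not stop at the closing `---` fence.

```typescript
// Hypothetical reconstruction, not the shipped regex.
const SCAN_LIMIT = 2000; // perf bound: only the head of the file is inspected

// No `m` flag, so `^` anchors at offset 0 and a mid-content `---` never counts.
// \uFEFF? tolerates a BOM, \r?\n tolerates CRLF, `i` tolerates case variants.
// (?:[^\r\n]*\r?\n)*? lazily consumes whole lines, so the key must start a line.
const DREAM_OUTPUT_MARKER_RE =
  /^\uFEFF?---\r?\n(?:[^\r\n]*\r?\n)*?[ \t]*dream_generated[ \t]*:[ \t]*true\b/i;

function isDreamOutput(content: string): boolean {
  // Slice first so a marker buried past the scan limit cannot match.
  return DREAM_OUTPUT_MARKER_RE.test(content.slice(0, SCAN_LIMIT));
}
```

Because the key must be followed by a colon, `dream_generatedfoo: true` cannot match, and the trailing `\b` rejects values such as `trueish`.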

Explicit bypass flag (no implicit footgun)

New gbrain dream --unsafe-bypass-dream-guard CLI flag. Plumbed through runCycle.synthBypassDreamGuard → SynthesizePhaseOpts.bypassDreamGuard → discoverTranscripts({bypassGuard}) and readSingleTranscript({bypassGuard}). Loud stderr warning at phase entry when set. Never auto-applied for --input.
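
The plumbing chain can be sketched as below. The option names (`synthBypassDreamGuard`, `bypassDreamGuard`, `bypassGuard`) and function names come from this PR; the function bodies, the in-memory `TranscriptFile` type, the `looksDreamGenerated` helper, and the log wording are hypothetical stand-ins for the real file-system code.

```typescript
interface TranscriptFile { path: string; content: string }
interface DiscoverOpts { bypassGuard?: boolean }
interface SynthesizePhaseOpts { bypassDreamGuard?: boolean }
interface CycleOpts { synthBypassDreamGuard?: boolean }

// Simplified stand-in for the real marker check (assumption).
const looksDreamGenerated = (content: string): boolean =>
  /^\uFEFF?---\r?\n(?:[^\r\n]*\r?\n)*?dream_generated:[ \t]*true\b/i.test(
    content.slice(0, 2000),
  );

function discoverTranscripts(files: TranscriptFile[], opts: DiscoverOpts = {}): TranscriptFile[] {
  return files.filter((f) => {
    if (opts.bypassGuard || !looksDreamGenerated(f.content)) return true;
    // Skips are logged to stderr rather than dropped silently.
    console.error(`[dream] skipping ${f.path}: dream_generated marker found`);
    return false;
  });
}

function runPhaseSynthesize(files: TranscriptFile[], opts: SynthesizePhaseOpts = {}): TranscriptFile[] {
  if (opts.bypassDreamGuard) {
    // Loud warning at phase entry so the cost of the flag is visible.
    console.error("[dream] WARNING: self-consumption guard bypassed");
  }
  return discoverTranscripts(files, { bypassGuard: opts.bypassDreamGuard });
}

function runCycle(files: TranscriptFile[], opts: CycleOpts = {}): TranscriptFile[] {
  return runPhaseSynthesize(files, { bypassDreamGuard: opts.synthBypassDreamGuard });
}
```

The bypass only ever arrives as an explicit option at the top of the chain, which is the property the flag is meant to guarantee: nothing in the lower layers turns it on implicitly.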

Verdict model test coverage

judgeSignificance and JudgeClient now exported. New tests assert client.create({ model }) receives the configured override and defaults correctly when omitted.
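
A sketch of how such a test can observe the model argument with a recording fake. The `JudgeClient` shape, the request fields, the JSON verdict format, and the `makeRecordingClient` helper are assumptions for illustration; only the `client.create({ model })` call and the `claude-haiku-4-5` default are taken from the PR text.

```typescript
interface JudgeRequest { model: string; prompt: string }
interface JudgeClient {
  create(req: JudgeRequest): Promise<{ text: string }>;
}

const DEFAULT_VERDICT_MODEL = "claude-haiku-4-5";

async function judgeSignificance(
  client: JudgeClient,
  transcript: string,
  verdictModel?: string,
): Promise<boolean> {
  const res = await client.create({
    model: verdictModel ?? DEFAULT_VERDICT_MODEL,
    prompt: transcript,
  });
  try {
    const parsed = JSON.parse(res.text) as { significant?: unknown };
    return parsed.significant === true;
  } catch {
    // Assumption: an unparseable response is treated as "not significant".
    return false;
  }
}

// Hypothetical test double: records every model it was asked for.
function makeRecordingClient(reply: string): { client: JudgeClient; models: string[] } {
  const models: string[] = [];
  const client: JudgeClient = {
    create: (req) => {
      models.push(req.model);
      return Promise.resolve({ text: reply });
    },
  };
  return { client, models };
}
```

A test would call `judgeSignificance` once with an override and once without, then assert on the recorded `models` array.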

Test Coverage

Twelve new test cases in test/cycle-synthesize.test.ts:

Guard regression built from real Page → renderPageToMarkdown → isDreamOutput round-trip. Synthetic strings replaced with what synthesize.ts actually produces.
False positive prevention — user transcript citing wiki/personal/reflections/identity-foo is NOT skipped.
CRLF + BOM frontmatter tolerated.
Whitespace and case variants of dream_generated: true all match.
dream_generated: false and dream_generatedfoo: true (no word boundary) do NOT match.
Marker buried past 2000 chars does NOT trigger (perf bound).
bypassGuard=true overrides; discoverTranscripts respects it.
DREAM_OUTPUT_MARKER_RE is anchored at byte 0 (mid-content ---\n does not count).
judgeSignificance: verdict_model override threads to client.create, default fallback, unparseable response handling.
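
The round-trip idea behind the regression fixture can be illustrated as follows. The `Page` shape, the `renderPageToMarkdown` signature, and the naive `key: value` serialization are simplifying assumptions; the real renderer in synthesize.ts handles more than this sketch does.

```typescript
interface Page {
  slug: string;
  title: string;
  body: string;
  frontmatter?: Record<string, string | boolean>;
}

// Hypothetical renderer: stamps the marker at render time, after any stored
// frontmatter, so a stale `dream_generated: false` cannot survive.
function renderPageToMarkdown(page: Page, cycleDate: string): string {
  const fm: Record<string, string | boolean> = {
    ...page.frontmatter,
    title: page.title,
    dream_generated: true,
    dream_cycle_date: cycleDate,
  };
  const lines = Object.entries(fm).map(([k, v]) => `${k}: ${String(v)}`);
  return `---\n${lines.join("\n")}\n---\n\n${page.body}\n`;
}

// Simplified stand-in for the guard (assumption).
function isDreamOutput(content: string): boolean {
  return /^\uFEFF?---\r?\n(?:[^\r\n]*\r?\n)*?[ \t]*dream_generated[ \t]*:[ \t]*true\b/i.test(
    content.slice(0, 2000),
  );
}

const page: Page = {
  slug: "wiki/personal/reflections/identity-foo",
  title: "Identity",
  body: "Some reflection text.",
};
const rendered = renderPageToMarkdown(page, "2026-04-30");
const userTranscript =
  "User: earlier I wrote about wiki/personal/reflections/identity-foo";
```

Here `rendered` never contains the page slug, which is the case the old prefix guard could miss, while `userTranscript` cites the slug and carries no marker, which is the case the old guard wrongly dropped.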

test/cycle-synthesize.test.ts: 32/32 pass. All five dream-related test files (cycle-synthesize, cycle-patterns, dream-synthesize-pglite, dream-patterns-pglite, dream-allow-list-pglite): 64/64 pass in 12.5s.

Tests: 23 → 32 (+9 marker-based) plus 3 new judgeSignificance cases.

Pre-Landing Review

/plan-eng-review ran during plan mode. Three issues identified, all bundled into this PR. Codex outside-voice review on the v0.23.2 plan caught five additional findings, ALL adopted into the corrected D4 architecture (orchestrator-stamped marker replaces content-pattern guard).

Cross-Model Consensus

| Topic | Initial plan | Codex challenge | Resolution |
| --- | --- | --- | --- |
| D1 heuristic shape | frontmatter+slug heuristic | misses real output (no slug in body) | Adopted: explicit identity marker |
| D2 source of truth | derive from `_brain-filing-rules.json` globs | wrong surface: authorization vs identity | Adopted: identity intrinsic to the page |
| `--input` bypass | implicit on `--input` | footgun, cached verdict re-opens loop | Adopted: explicit `--unsafe-bypass-dream-guard` |
| Failure mode | JSON load at phase entry | no fail-closed contract | Adopted: guard self-contained, no external load |
| Tests | synthetic strings + alignment test | don't prove regression | Adopted: real Page → markdown round-trip |

Plan Completion

All four decisions (D1+D2+D3+D4) bundled into v0.23.2 before merge — no deferred items.

Documentation

CLAUDE.md — Key files entries updated for the v0.23.2 marker-based self-consumption guard:
- src/core/cycle/synthesize.ts: v0.23.2 renderPageToMarkdown exporting + dream_generated/dream_cycle_date frontmatter stamping, summary-page stamping, exported judgeSignificance/JudgeClient, verdictModel parameter wiring.
- src/core/cycle/transcript-discovery.ts: v0.23.2 DREAM_OUTPUT_MARKER_RE + isDreamOutput() guard, bypassGuard option, stderr skip log, replaces v0.23.1's DREAM_OUTPUT_SLUGS content-prefix list.
- src/commands/dream.ts: v0.23.2 --unsafe-bypass-dream-guard flag + plumbing through runCycle.synthBypassDreamGuard.
CHANGELOG.md v0.23.2 entry, voice rewritten for the marker-based architecture.
TODOS.md, README.md, AGENTS.md, docs/: no drift, preserved as-is.

Note on version slot

Master shipped its own v0.23.1 (PR #528, local CI gate / 4-tier wall-time optimization) while this PR was in flight. Master's v0.23.1 is unrelated to dream-cycle work. This PR claims v0.23.2.

Test plan

bun test test/cycle-synthesize.test.ts — 32/32 pass
All dream-related tests — 64/64 pass across 5 files in 12.5s
bun run build — clean compile after master merge
CHANGELOG voice matches project style; v0.23.2 entry stacks above master's v0.23.1
Codex outside voice + Claude /plan-eng-review both ran; all findings adopted
Full bun run test:e2e against Postgres test DB (deferred — touched-file E2E suite green; full E2E covered by master's new bun run ci:local)

🤖 Generated with Claude Code

Built-in isDreamOutput() guard in transcript-discovery.ts auto-skips any transcript whose first 2000 chars contain dream output slug prefixes (wiki/personal/reflections/, wiki/originals/ideas/, wiki/personal/patterns/, dream-cycle-summaries/). Prevents infinite recursion if dream output is ever fed back into the corpus. judgeSignificance() now accepts a verdictModel parameter, loaded from dream.synthesize.verdict_model config key. Default: claude-haiku-4-5. 3 new test cases covering the guard.

…arker

The v0.23.1 prefix-string guard had two flaws caught by codex review. serializeMarkdown does not embed the page slug into body content, so the heuristic could miss real dream output. And real conversation transcripts often cite brain slugs ("earlier I wrote about wiki/personal/reflections/identity..."), so the heuristic dropped legitimate transcripts silently.

Swap content inference for explicit identity. renderPageToMarkdown and writeSummaryPage now stamp `dream_generated: true` + `dream_cycle_date` into frontmatter at render time. Guard checks for the marker via DREAM_OUTPUT_MARKER_RE (anchored at frontmatter open, BOM/CRLF tolerant, scans first 2000 chars, word boundary on `true`). Cannot drift, cannot false-positive on user text, cannot miss real output.

Tests built from a real Page → renderPageToMarkdown → isDreamOutput round-trip (codex finding #5: synthetic strings don't prove the guard catches what synthesize actually produces).

Coverage: regression fixture, false-positive prevention on user transcripts citing slugs, CRLF+BOM, whitespace/case variants, anchor-at-byte-0, perf bound, bypass plumbing, dream_generatedfoo word-boundary check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Explicit opt-in to disable the synthesize self-consumption guard. The flag is intentionally NOT tied to --input. Codex review caught that implicit bypass is a footgun: any caller could synthesize a dream-generated page directly via --input, get a cached positive verdict, and silently re-trigger the loop bug.

Plumbing: dream.ts CLI parses the flag → DreamArgs.bypassDreamGuard → runCycle({ synthBypassDreamGuard }) → SynthesizePhaseOpts.bypassDreamGuard → discoverTranscripts({ bypassGuard }) and readSingleTranscript. Loud stderr warning at phase entry when set so the cost is visible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces the v0.23.1 release notes with the v0.23.2 voice describing the orchestrator-stamped marker approach and the --unsafe-bypass-dream-guard flag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Conflicts: # CHANGELOG.md # VERSION # package.json

Update CLAUDE.md Key Files entries for src/core/cycle/synthesize.ts, src/core/cycle/transcript-discovery.ts, and src/commands/dream.ts to reflect the v0.23.2 dream_generated frontmatter marker that replaces the v0.23.1 content-prefix self-consumption guard, plus the new --unsafe-bypass-dream-guard CLI flag. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

CI's `build-llms generator > committed match generator output` guard caught drift after the v0.23.2 doc-sync (commit 507edb1) updated three Key Files entries in CLAUDE.md without re-running `bun run build:llms`. The llms.txt index didn't drift (no new doc URLs); only the inlined llms-full.txt bundle needed refreshing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three new PGLite E2E cases exercise the actual production loop scenario end-to-end. Unit tests covered the bug class at the function-pair level (renderPageToMarkdown → readSingleTranscript). These cover it at the phase level: runPhaseSynthesize with a real engine, real putPage, real renderPageToMarkdown, real corpus-dir discovery.

1. Leaked dream output is skipped on next synthesize run. The reflection page is inserted, reverse-rendered (which stamps `dream_generated: true`), dropped into the corpus dir as .txt, and the next phase run reports "no transcripts to process" with a stderr skip log. Verdict cache stays untouched so a future legit edit isn't shadowed by a stale cached "false".
2. bypassDreamGuard=true at phase entry re-enables ingestion. Same marked file gets discovered through the loud-warning path. Proves --unsafe-bypass-dream-guard plumbing reaches discoverTranscripts at phase scope.
3. Mixed corpus (leaked dream output + real conversation transcript) discovers exactly the real one. Pins codex finding #1's headline false-positive case: a transcript citing wiki/personal/reflections/ in body must NOT be skipped.

Stderr capture via process.stderr.write spy with try/finally restore.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
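
The stderr spy described in this commit might look roughly like the helper below. `captureStderr` is a hypothetical name, and this sketch is synchronous; a test around an async phase function would need a variant that awaits the callback inside the `try` block before restoring.

```typescript
// Hypothetical helper: swaps process.stderr.write for a recorder and always
// restores the original, even when the callback throws.
function captureStderr<T>(fn: () => T): { result: T; output: string } {
  const original = process.stderr.write;
  let output = "";
  process.stderr.write = ((chunk: string | Uint8Array): boolean => {
    output += typeof chunk === "string" ? chunk : new TextDecoder().decode(chunk);
    return true;
  }) as typeof process.stderr.write;
  try {
    const result = fn();
    return { result, output };
  } finally {
    // Restore runs on both the success and the failure path.
    process.stderr.write = original;
  }
}
```

Restoring in `finally` keeps one failing assertion from silencing stderr for every test that runs afterwards.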

CI typecheck caught three TS2322 violations in the round-trip E2E fixtures: 'reflection' is not a member of PageType. Reflections are filed as 'note' in production (renderPageToMarkdown falls back to 'note' for unknown types). No behavior change — the guard test still exercises the same serializeMarkdown → discoverTranscripts loop. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The pre-ship section listed `bun test` as the unit-test path but didn't flag the trap: `bun test` (the bun runner) does NOT run TypeScript type checking. Only `bun run test` (the npm script) does, because it chains `bun run typecheck` + the four shell pre-checks before the runner.

CI on PR #527 caught a `'reflection'` literal that `PageType` doesn't admit (PageType is a closed union). The runtime E2E and `bun test` both passed locally because the runner doesn't gate on TS. The separate typecheck stage in CI rejected it.

New rule: run `bun run typecheck` (or `bun run test`, which wraps it, or `bun run ci:local` for the full gate) before pushing. The runner-alone path is for hot-loop test iteration only.

Also regenerated llms-full.txt for the CLAUDE.md update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Master moved from v0.23.1 → v0.23.2 (PR #527: orchestrator-stamped self-consumption marker for the dream cycle + verdict-model unit tests).

Conflicts resolved:
- VERSION: kept this branch's 0.24.0 per CLAUDE.md branch-scoped rule.
- package.json version: kept 0.24.0.
- CHANGELOG.md: my v0.24.0 stays on top; master's new v0.23.2 entry spliced between v0.24.0 and v0.23.1. Sequence above v0.21.0 monotonic.

Verification:
- bun install: 0 new packages
- All 5 CI guards green: privacy + jsonb + progress + trailing-newline + wasm
- bun run typecheck: clean
- This branch's tests: 103/103 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Wintermute and others added 6 commits April 30, 2026 09:07

Merge remote-tracking branch 'origin/master' into garrytan/v0.23.2

2883d0f

# Conflicts: # CHANGELOG.md # VERSION # package.json

garrytan changed the title ~~v0.23.1 fix: dream self-consumption guard + configurable verdict model~~ v0.23.2 fix(dream): orchestrator-stamped self-consumption marker + verdict model tests May 1, 2026

garrytan and others added 4 commits April 30, 2026 22:52

garrytan merged commit 579722d into master May 1, 2026
7 checks passed

garrytan merged 10 commits into master from fix/dream-self-consumption-guard
