v0.23.2 fix(dream): orchestrator-stamped self-consumption marker + verdict model tests#527

Merged
garrytan merged 10 commits into master from fix/dream-self-consumption-guard
May 1, 2026

Conversation

garrytan (Owner) commented Apr 30, 2026

Summary

Replaces v0.23.1's content-prefix self-consumption guard with an orchestrator-stamped frontmatter marker. Codex review of the v0.23.1 plan caught two flaws: real serialized brain pages don't always contain their own slug in the body (so the prefix guard could miss real dream output), and real conversation transcripts often DO mention brain slugs (so the prefix guard silently dropped legitimate transcripts). v0.23.2 swaps content inference for explicit identity stamped at render time. In addition, the v0.23.1 verdict-model config now has unit-test coverage for the advertised `gbrain config set dream.synthesize.verdict_model ...` contract.

Self-consumption guard, marker-based

  • synthesize.ts renderPageToMarkdown and writeSummaryPage now stamp dream_generated: true + dream_cycle_date into frontmatter at render time. Both functions exported for testing. The DB-stored frontmatter persists the marker across re-renders.
  • transcript-discovery.ts replaces DREAM_OUTPUT_SLUGS with DREAM_OUTPUT_MARKER_RE (anchored at frontmatter open with optional BOM, CRLF tolerant, scans first 2000 chars for dream_generated: true with word boundary). Stderr log fires when the guard skips a file — no more silent skips.
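
As a rough illustration of the guard properties listed above, the sketch below shows one way such a marker check could be written. The regex and the names `MARKER_RE_SKETCH`, `SCAN_LIMIT`, and `isDreamOutputSketch` are assumptions made for this example; the actual `DREAM_OUTPUT_MARKER_RE` in transcript-discovery.ts may be written differently.

```typescript
// Hypothetical sketch; NOT the actual DREAM_OUTPUT_MARKER_RE from transcript-discovery.ts.
const SCAN_LIMIT = 2000; // only the head of the file is inspected (perf bound)

// ^                      no `m` flag, so this matches only at the very start of the input
// \uFEFF?                optional BOM
// ---[ \t]*\r?\n         frontmatter opening fence, LF or CRLF
// (?:(?!---)[^\n]*\n)*?  skip earlier frontmatter lines without crossing the closing fence
// true\b                 word boundary after `true`
const MARKER_RE_SKETCH =
  /^\uFEFF?---[ \t]*\r?\n(?:(?!---)[^\n]*\n)*?dream_generated[ \t]*:[ \t]*true\b/i;

function isDreamOutputSketch(content: string): boolean {
  return MARKER_RE_SKETCH.test(content.slice(0, SCAN_LIMIT));
}

const stamped =
  "---\ntitle: Reflection\ndream_generated: true\ndream_cycle_date: 2026-04-30\n---\nBody text.\n";
const userTranscript =
  "Earlier I wrote about wiki/personal/reflections/identity-foo.\n";

console.log(isDreamOutputSketch(stamped));        // true
console.log(isDreamOutputSketch(userTranscript)); // false
```

Because the check only looks at frontmatter at the start of the file, a transcript that merely cites a brain slug in its body is not flagged, which is the false-positive case the prefix guard got wrong.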

Explicit bypass flag (no implicit footgun)

  • New gbrain dream --unsafe-bypass-dream-guard CLI flag. Plumbed through runCycle.synthBypassDreamGuard → SynthesizePhaseOpts.bypassDreamGuard → discoverTranscripts({bypassGuard}) and readSingleTranscript({bypassGuard}). Loud stderr warning at phase entry when set. Never auto-applied for --input.
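
To make the plumbing concrete, here is a minimal sketch of how an explicit flag could be threaded from argv down to discovery. The field names (`bypassDreamGuard`, `synthBypassDreamGuard`, `bypassGuard`) come from the description above; the function bodies, the injected `log` callback, the log wording, and the simplified marker check are all assumptions for illustration.

```typescript
// Hypothetical shapes: field names follow the PR description, everything else is assumed.
interface DreamArgs { bypassDreamGuard: boolean }
interface CycleOpts { synthBypassDreamGuard: boolean }
interface SynthesizePhaseOpts { bypassDreamGuard: boolean }
interface DiscoverOpts { bypassGuard: boolean }
interface Transcript { path: string; content: string }
type Log = (message: string) => void;

function parseDreamArgs(argv: string[]): DreamArgs {
  // Only the explicit flag enables the bypass; --input never implies it.
  return { bypassDreamGuard: argv.indexOf("--unsafe-bypass-dream-guard") !== -1 };
}

// Simplified stand-in for the real marker check.
function hasDreamMarker(content: string): boolean {
  return /^\uFEFF?---\r?\n(?:[^\n]*\n)*?dream_generated:[ \t]*true\b/i.test(content.slice(0, 2000));
}

function discoverTranscriptsSketch(files: Transcript[], opts: DiscoverOpts, log: Log): Transcript[] {
  return files.filter((file) => {
    if (opts.bypassGuard || !hasDreamMarker(file.content)) return true;
    log("skipping dream-generated file: " + file.path); // illustrative wording
    return false;
  });
}

function runPhaseSynthesizeSketch(files: Transcript[], opts: SynthesizePhaseOpts, log: Log): Transcript[] {
  if (opts.bypassDreamGuard) {
    log("WARNING: dream guard bypassed; dream output may be re-ingested"); // illustrative wording
  }
  return discoverTranscriptsSketch(files, { bypassGuard: opts.bypassDreamGuard }, log);
}

function runCycleSketch(files: Transcript[], opts: CycleOpts, log: Log): Transcript[] {
  return runPhaseSynthesizeSketch(files, { bypassDreamGuard: opts.synthBypassDreamGuard }, log);
}

const corpus: Transcript[] = [
  { path: "leak.txt", content: "---\ndream_generated: true\n---\nreflection body\n" },
  { path: "chat.txt", content: "A real conversation transcript.\n" },
];
const guardedLogs: string[] = [];
const guardedArgs = parseDreamArgs(["--input", "leak.txt"]);
const guarded = runCycleSketch(
  corpus,
  { synthBypassDreamGuard: guardedArgs.bypassDreamGuard },
  (m) => guardedLogs.push(m),
);
console.log(guarded.map((t) => t.path)); // [ 'chat.txt' ]
```

Keeping the bypass a separate boolean at every layer means no caller can enable it as a side effect of another option.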

Verdict model test coverage

  • judgeSignificance and JudgeClient now exported. New tests assert client.create({ model }) receives the configured override and defaults correctly when omitted.
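
The kind of assertion described above can be sketched with a recording fake. Everything here is hypothetical: the real `JudgeClient` and `judgeSignificance` signatures are not shown in this PR text, the sketch is synchronous for brevity (a real client is presumably asynchronous), and treating an unparseable response as "not significant" is an assumption. The default model name is the one given in the v0.23.1 commit message.

```typescript
// Hypothetical shapes; the real JudgeClient / judgeSignificance signatures may differ.
interface JudgeRequestSketch { model: string; prompt: string }
interface JudgeClientSketch { create(req: JudgeRequestSketch): string }

const DEFAULT_VERDICT_MODEL = "claude-haiku-4-5"; // default named in the v0.23.1 commit message

function judgeSignificanceSketch(
  client: JudgeClientSketch,
  transcript: string,
  verdictModel?: string,
): { significant: boolean } {
  const raw = client.create({
    model: verdictModel ?? DEFAULT_VERDICT_MODEL, // configured override wins, else the default
    prompt: transcript,
  });
  try {
    const parsed: unknown = JSON.parse(raw);
    const significant =
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { significant?: unknown }).significant === true;
    return { significant };
  } catch {
    // Assumed behavior: an unparseable response is treated as "not significant".
    return { significant: false };
  }
}

// Recording fake: remembers every model it was asked to use.
function recordingClient(response: string): { client: JudgeClientSketch; seen: string[] } {
  const seen: string[] = [];
  const client: JudgeClientSketch = {
    create(req) {
      seen.push(req.model);
      return response;
    },
  };
  return { client, seen };
}

const withOverride = recordingClient('{"significant": true}');
judgeSignificanceSketch(withOverride.client, "transcript text", "my-verdict-model");
console.log(withOverride.seen); // [ 'my-verdict-model' ]
```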

Test Coverage

Twelve new test cases in test/cycle-synthesize.test.ts:

  • Guard regression built from real Page → renderPageToMarkdown → isDreamOutput round-trip. Synthetic strings replaced with what synthesize.ts actually produces.
  • False positive prevention — user transcript citing wiki/personal/reflections/identity-foo is NOT skipped.
  • CRLF + BOM frontmatter tolerated.
  • Whitespace and case variants of dream_generated: true all match.
  • dream_generated: false and dream_generatedfoo: true (no word boundary) do NOT match.
  • Marker buried past 2000 chars does NOT trigger (perf bound).
  • bypassGuard=true overrides; discoverTranscripts respects it.
  • DREAM_OUTPUT_MARKER_RE is anchored at byte 0 (mid-content ---\n does not count).
  • judgeSignificance: verdict_model override threads to client.create, default fallback, unparseable response handling.
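
The round-trip idea in the first two bullets can be sketched as follows. `PageSketch`, `renderPageSketch`, and `hasMarkerSketch` are hypothetical stand-ins for the real `Page`, `renderPageToMarkdown`, and `isDreamOutput`; the frontmatter layout is an assumption apart from the two stamped keys named in this PR.

```typescript
// Hypothetical stand-ins for Page, renderPageToMarkdown, and isDreamOutput.
interface PageSketch { slug: string; title: string; body: string }

function renderPageSketch(page: PageSketch, cycleDate: string): string {
  const frontmatter = [
    "---",
    "title: " + JSON.stringify(page.title),
    "dream_generated: true",           // identity marker stamped at render time
    "dream_cycle_date: " + cycleDate,
    "---",
  ].join("\n");
  // Note: the slug is deliberately not written into the output.
  return frontmatter + "\n\n" + page.body + "\n";
}

function hasMarkerSketch(content: string): boolean {
  const head = content.slice(0, 2000);
  return /^\uFEFF?---[ \t]*\r?\n(?:(?!---)[^\n]*\n)*?dream_generated[ \t]*:[ \t]*true\b/i.test(head);
}

const page: PageSketch = {
  slug: "wiki/personal/reflections/identity-foo",
  title: "Identity",
  body: "A reflection produced by the dream cycle.",
};
const rendered = renderPageSketch(page, "2026-04-30");
const citingTranscript =
  "User: earlier I wrote about wiki/personal/reflections/identity-foo, remember?\n";

console.log(rendered.indexOf(page.slug) === -1);   // true: a slug-prefix check would miss this file
console.log(hasMarkerSketch(rendered));            // true: the marker catches it
console.log(hasMarkerSketch(citingTranscript));    // false: citing a slug does not trigger a skip
```

Building the fixture from the renderer's own output, rather than a hand-written string, is what ties the test to what the guard will actually see.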

test/cycle-synthesize.test.ts: 32/32 pass. All five dream-related test files (cycle-synthesize, cycle-patterns, dream-synthesize-pglite, dream-patterns-pglite, dream-allow-list-pglite): 64/64 pass in 12.5s.

Tests: 23 → 32 (+9 marker-based) plus 3 new judgeSignificance cases.

Pre-Landing Review

/plan-eng-review ran during plan mode. Three issues identified, all bundled into this PR. Codex outside-voice review on the v0.23.2 plan caught five additional findings, ALL adopted into the corrected D4 architecture (orchestrator-stamped marker replaces content-pattern guard).

Cross-Model Consensus

| Topic | Initial plan | Codex challenge | Resolution |
| --- | --- | --- | --- |
| D1 heuristic shape | frontmatter+slug heuristic | misses real output (no slug in body) | Adopted: explicit identity marker |
| D2 source of truth | derive from _brain-filing-rules.json globs | wrong surface: authorization vs identity | Adopted: identity intrinsic to the page |
| --input bypass | implicit on --input | footgun, cached verdict re-opens loop | Adopted: explicit --unsafe-bypass-dream-guard |
| Failure mode | JSON load at phase entry | no fail-closed contract | Adopted: guard self-contained, no external load |
| Tests | synthetic strings + alignment test | don't prove regression | Adopted: real Page → markdown round-trip |

Plan Completion

All four decisions (D1+D2+D3+D4) bundled into v0.23.2 before merge — no deferred items.

Documentation

  • CLAUDE.md — Key files entries updated for the v0.23.2 marker-based self-consumption guard:
    • src/core/cycle/synthesize.ts: v0.23.2 renderPageToMarkdown exporting + dream_generated/dream_cycle_date frontmatter stamping, summary-page stamping, exported judgeSignificance/JudgeClient, verdictModel parameter wiring.
    • src/core/cycle/transcript-discovery.ts: v0.23.2 DREAM_OUTPUT_MARKER_RE + isDreamOutput() guard, bypassGuard option, stderr skip log, replaces v0.23.1's DREAM_OUTPUT_SLUGS content-prefix list.
    • src/commands/dream.ts: v0.23.2 --unsafe-bypass-dream-guard flag + plumbing through runCycle.synthBypassDreamGuard.
  • CHANGELOG.md v0.23.2 entry, voice rewritten for the marker-based architecture.
  • TODOS.md, README.md, AGENTS.md, docs/: no drift, preserved as-is.

Note on version slot

Master shipped its own v0.23.1 (PR #528, local CI gate / 4-tier wall-time optimization) while this PR was in flight. Master's v0.23.1 is unrelated to dream-cycle work. This PR claims v0.23.2.

Test plan

  • bun test test/cycle-synthesize.test.ts — 32/32 pass
  • All dream-related tests — 64/64 pass across 5 files in 12.5s
  • bun run build — clean compile after master merge
  • CHANGELOG voice matches project style; v0.23.2 entry stacks above master's v0.23.1
  • Codex outside voice + Claude /plan-eng-review both ran; all findings adopted
  • Full bun run test:e2e against Postgres test DB (deferred — touched-file E2E suite green; full E2E covered by master's new bun run ci:local)

🤖 Generated with Claude Code

Wintermute and others added 6 commits April 30, 2026 09:07
Built-in isDreamOutput() guard in transcript-discovery.ts auto-skips
any transcript whose first 2000 chars contain dream output slug prefixes
(wiki/personal/reflections/, wiki/originals/ideas/, wiki/personal/patterns/,
dream-cycle-summaries/). Prevents infinite recursion if dream output is
ever fed back into the corpus.

judgeSignificance() now accepts a verdictModel parameter, loaded from
dream.synthesize.verdict_model config key. Default: claude-haiku-4-5.

3 new test cases covering the guard.
…arker

The v0.23.1 prefix-string guard had two flaws caught by codex review.
serializeMarkdown does not embed the page slug into body content, so
the heuristic could miss real dream output. And real conversation
transcripts often cite brain slugs ("earlier I wrote about
wiki/personal/reflections/identity..."), so the heuristic dropped
legitimate transcripts silently.

Swap content inference for explicit identity. renderPageToMarkdown and
writeSummaryPage now stamp `dream_generated: true` + `dream_cycle_date`
into frontmatter at render time. Guard checks for the marker via
DREAM_OUTPUT_MARKER_RE (anchored at frontmatter open, BOM/CRLF
tolerant, scans first 2000 chars, word boundary on `true`). Cannot
drift, cannot false-positive on user text, cannot miss real output.

Tests built from a real Page → renderPageToMarkdown → isDreamOutput
round-trip (codex finding #5 — synthetic strings don't prove the
guard catches what synthesize actually produces). Coverage: regression
fixture, false-positive prevention on user transcripts citing slugs,
CRLF+BOM, whitespace/case variants, anchor-at-byte-0, perf bound,
bypass plumbing, dream_generatedfoo word-boundary check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Explicit opt-in to disable the synthesize self-consumption guard. The
flag is intentionally NOT tied to --input — codex review caught that
implicit bypass is a footgun: any caller could synthesize a dream-
generated page directly via --input, get a cached positive verdict,
and silently re-trigger the loop bug.

Plumbing: dream.ts CLI parses the flag → DreamArgs.bypassDreamGuard →
runCycle({ synthBypassDreamGuard }) → SynthesizePhaseOpts.bypassDreamGuard
→ discoverTranscripts({ bypassGuard }) and readSingleTranscript.
Loud stderr warning at phase entry when set so the cost is visible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the v0.23.1 release notes with the v0.23.2 voice describing
the orchestrator-stamped marker approach and the --unsafe-bypass-dream-guard
flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	CHANGELOG.md
#	VERSION
#	package.json
Update CLAUDE.md Key Files entries for src/core/cycle/synthesize.ts,
src/core/cycle/transcript-discovery.ts, and src/commands/dream.ts to
reflect the v0.23.2 dream_generated frontmatter marker that replaces the
v0.23.1 content-prefix self-consumption guard, plus the new
--unsafe-bypass-dream-guard CLI flag.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
garrytan changed the title from "v0.23.1 fix: dream self-consumption guard + configurable verdict model" to "v0.23.2 fix(dream): orchestrator-stamped self-consumption marker + verdict model tests" on May 1, 2026
garrytan and others added 4 commits April 30, 2026 22:52
CI's `build-llms generator > committed match generator output` guard
caught drift after the v0.23.2 doc-sync (commit 507edb1) updated three
Key Files entries in CLAUDE.md without re-running `bun run build:llms`.

The llms.txt index didn't drift (no new doc URLs); only the inlined
llms-full.txt bundle needed refreshing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three new PGLite E2E cases exercise the actual production loop scenario
end-to-end. Unit tests covered the bug class at the function-pair level
(renderPageToMarkdown → readSingleTranscript). These cover it at the
phase level: runPhaseSynthesize with a real engine, real putPage, real
renderPageToMarkdown, real corpus-dir discovery.

1. Leaked dream output is skipped on next synthesize run. The reflection
   page is inserted, reverse-rendered (which stamps `dream_generated:
   true`), dropped into the corpus dir as .txt, and the next phase run
   reports "no transcripts to process" with a stderr skip log. Verdict
   cache stays untouched so a future legit edit isn't shadowed by a
   stale cached "false".

2. bypassDreamGuard=true at phase entry re-enables ingestion. Same
   marked file gets discovered through the loud-warning path. Proves
   --unsafe-bypass-dream-guard plumbing reaches discoverTranscripts at
   phase scope.

3. Mixed corpus (leaked dream output + real conversation transcript)
   discovers exactly the real one. Pins codex finding #1's headline
   false-positive case: a transcript citing wiki/personal/reflections/
   in body must NOT be skipped.

Stderr capture via process.stderr.write spy with try/finally restore.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
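
The stderr-capture technique mentioned in the commit above (a `process.stderr.write` spy restored in `try/finally`) can be sketched like this. `captureStderr` is a hypothetical helper name, not taken from the test file, and the sketch assumes a Node-compatible runtime where `process.stderr.write` is assignable.

```typescript
// Hypothetical helper; the real test's spy may be structured differently.
function captureStderr(fn: () => void): string {
  const original = process.stderr.write;
  let captured = "";
  process.stderr.write = ((chunk: string | Uint8Array): boolean => {
    // Accumulate instead of writing to the real stream.
    captured += typeof chunk === "string" ? chunk : new TextDecoder().decode(chunk);
    return true;
  }) as typeof process.stderr.write;
  try {
    fn();
  } finally {
    // Restore even if fn throws, so a failing test cannot leave stderr muted.
    process.stderr.write = original;
  }
  return captured;
}

const skipLog = captureStderr(() => {
  process.stderr.write("skipping dream-generated file: leak.txt\n"); // illustrative message
});
console.log(JSON.stringify(skipLog));
```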
CI typecheck caught three TS2322 violations in the round-trip E2E
fixtures: 'reflection' is not a member of PageType. Reflections are
filed as 'note' in production (renderPageToMarkdown falls back to 'note'
for unknown types).

No behavior change — the guard test still exercises the same
serializeMarkdown → discoverTranscripts loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The pre-ship section listed `bun test` as the unit-test path but didn't
flag the trap: `bun test` (the bun runner) does NOT run TypeScript type
checking. Only `bun run test` (the npm script) does, because it chains
`bun run typecheck` + the four shell pre-checks before the runner.

CI on PR #527 caught a `'reflection'` literal that `PageType` doesn't
admit (PageType is a closed union). The runtime E2E and `bun test`
both passed locally because the runner doesn't gate on TS. The
separate typecheck stage in CI rejected it.

New rule: run `bun run typecheck` (or `bun run test`, which wraps it,
or `bun run ci:local` for the full gate) before pushing. The runner-
alone path is for hot-loop test iteration only.

Also regenerated llms-full.txt for the CLAUDE.md update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
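
The typecheck trap described in the commit above can be shown with a small closed union. The union members other than `'note'` are hypothetical (the PR only confirms that `'note'` is valid and `'reflection'` is not), and `toPageTypeSketch` is an illustrative stand-in for the fallback behavior attributed to renderPageToMarkdown.

```typescript
// Hypothetical members; only 'note' (valid) and 'reflection' (invalid) come from the PR text.
const PAGE_TYPES_SKETCH = ["note", "person", "project"] as const;
type PageTypeSketch = (typeof PAGE_TYPES_SKETCH)[number];

// Runtime fallback: unknown type strings are filed as 'note'.
function toPageTypeSketch(raw: string): PageTypeSketch {
  const known = PAGE_TYPES_SKETCH.some((t) => t === raw);
  return known ? (raw as PageTypeSketch) : "note";
}

// Uncommenting the next line makes the type checker fail with TS2322, because
// 'reflection' is not a member of the closed union. A runner that only strips
// types would still execute the file without complaint.
// const bad: PageTypeSketch = "reflection";

console.log(toPageTypeSketch("reflection")); // "note"
console.log(toPageTypeSketch("person"));     // "person"
```

This is why a fixture can pass at runtime yet fail the separate typecheck stage: the invalid literal only exists as a compile-time error.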
@garrytan garrytan merged commit 579722d into master May 1, 2026
7 checks passed
garrytan pushed a commit that referenced this pull request May 1, 2026
Master moved from v0.23.1 → v0.23.2 (PR #527: orchestrator-stamped
self-consumption marker for the dream cycle + verdict-model unit tests).

Conflicts resolved:
- VERSION: kept this branch's 0.24.0 per CLAUDE.md branch-scoped rule.
- package.json version: kept 0.24.0.
- CHANGELOG.md: my v0.24.0 stays on top; master's new v0.23.2 entry
  spliced between v0.24.0 and v0.23.1. Sequence above v0.21.0 monotonic.

Verification:
- bun install — 0 new packages
- All 5 CI guards green: privacy + jsonb + progress + trailing-newline + wasm
- bun run typecheck — clean
- This branch's tests: 103/103 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>