Skip to content

v0.31.7 fix-wave: doctor stops crying wolf — 5 community PRs (#798 + #788 + #536 + #376 + #128 adapted)#804

Merged
garrytan merged 20 commits into
masterfrom
garrytan/saskatoon-v1
May 10, 2026
Merged

v0.31.7 fix-wave: doctor stops crying wolf — 5 community PRs (#798 + #788 + #536 + #376 + #128 adapted)#804
garrytan merged 20 commits into
masterfrom
garrytan/saskatoon-v1

Conversation

@garrytan

@garrytan garrytan commented May 10, 2026

Copy link
Copy Markdown
Owner

Summary

Five-PR fix wave (four contributors) shipping as v0.31.7. Theme: gbrain doctor stops crying wolf, period — and finds itself on every deployment shape. Fixes every false-positive class on disk today plus the install-path footgun that bit every hosted-CLI user who ever ran gbrain doctor from ~.

Numbers that matter (verified against a live OpenClaw deployment with 224 skills, plus a 2533-page markdown-only journal brain, plus the v0.25.1 skill routing-eval suite, plus a hosted-CLI install):

Before After
OpenClaw reachable skills 37/224 200/224
v0.25.1 skill routing accuracy 36 ambiguous matches 0
Markdown-only brain graph_coverage warn forever OK
Stale gbrain link-extract hint (since v0.16) suggested in WARN gbrain extract all
Hosted-CLI doctor from ~ "Could not find skills directory" finds bundled skills/
Health score on healthy OpenClaw + journal brains 40/100 100/100

Wave assembly (atomic per-PR commits, bisect-friendly):

Test Coverage

Wave-specific test surface (all green):

NEW TESTS (35 cases across 4 files)
[+] test/resolver-merge.test.ts                     (#798)  8 cases
  ├── findAllResolverFiles: empty / RESOLVER-only / AGENTS-only / both
  └── checkResolvable merge: thin+rich layout / dedup / AGENTS-alone

[+] test/doctor.test.ts                             (#376)  +1 IRON-RULE case
  └── source pin: contains 'gbrain extract all', NOT 'link-extract' / 'timeline-extract'

[+] test/repo-root.test.ts                          (#128) +8 D3+D5 cases
  ├── Tier-0 GBRAIN_SKILLS_DIR: valid / invalid-falls-through / wins-over-OPENCLAW
  ├── autoDetectSkillsDirReadOnly: install-path walk / no-drift on success / isGbrainRepoRoot gate
  ├── AUTO_DETECT_HINT updates: tier-0 + read-only variants
  └── D5 regression guard: shared autoDetectSkillsDir MUST NOT install-path-fallback

[+] test/check-resolvable-cli.test.ts               (D6)   +1 codex-caught regression case
  └── --fix subprocess from empty cwd refuses install-path with clear stderr

UPDATED EXISTING (1 file)
[~] test/check-resolvable-cli.test.ts                       REGRESSION-GATE updated
  └── empty-cwd no_skills_dir → install_path fallback (intentional v0.31.7 behavior change)

Coverage gate: PASS. All new code paths have tests including the D5 + D6 regression guards that pin the read-path/write-path split against future regressions.

Tests: 4987 unit pass / 0 wave-introduced fail / 0 skip (re-run on second /ship).

Pre-Landing Review

Reviewed via /plan-eng-review with codex outside-voice during planning. Six user decisions locked across the lifecycle:

  • D1 (CEO) — Approach C: 5-PR wave (vs 1-PR or 4-PR alternatives)
  • D2 (Eng) — Live gbrain check-resolvable gate required after fix: sync RESOLVER.md triggers with v0.25.1 skill frontmatter #788 cherry-pick
  • D3 (Eng) — Expand fix: doctor resolves skills/ from install path when run outside repo #128 tests to 8 cases covering precedence + gate edges
  • D4 (Eng) — Run codex outside-voice review (caught a HIGH that the plan missed)
  • D5 (Eng + outside voice) — Read-path-only adaptation: new autoDetectSkillsDirReadOnly instead of mutating shared autoDetectSkillsDir
  • D6 (Ship adversarial) — Caught the --fix write-path leak before merge: gbrain doctor --fix and gbrain check-resolvable --fix were using the read-only resolver but actually WRITING to SKILL.md files. Without this gate, running cd ~ && gbrain doctor --fix would have silently rewritten the bundled gbrain repo. Both commands now refuse --fix when the source is install_path and tell the user how to use explicit overrides. Plus a routing-eval consistency fix (was reading single resolver file while check-resolvable does multi-file merge). Pinned by a regression test that spawns a real subprocess from an empty cwd.

Adversarial Review

  • Codex outside-voice ran during /plan-ceo-review on the original plan (caught D5 — shared-resolver write-path regression risk).
  • Codex adversarial ran during /ship Step 11 on the merged-state diff (caught D6 — --fix write-path leak in the read-path callers, plus routing-eval consistency).
  • Both findings became fixed bugs with regression tests rather than known footguns. Codex earned its keep twice.

Live Verification

Live gbrain check-resolvable against the assembled skills/ dir: 39/39 reachable, 0 unreachable, 0 overlaps, 0 gaps, ok=true. D2 gate confirmed both post-merge of v0.31.2 + v0.31.3 AND on second /ship invocation.

Test plan

  • All unit tests pass (4987 pass, 1 pre-existing environmental flake unrelated to wave)
  • Typecheck clean
  • Live check-resolvable gate: 39/39 reachable
  • Direct E2E openclaw-reference-compat.test.ts: 8/8 pass (resolver multi-file merge end-to-end)
  • Manual smoke: cd /tmp && gbrain check-resolvable --fix correctly refuses with install-path message
  • Manual smoke: cd /tmp && gbrain check-resolvable correctly resolves via install-path fallback (read-only)

Plan Completion

Plan locked in CEO + Eng review with all 11 review sections. 0 critical gaps. 0 unresolved decisions. CEO + ENG + ADVERSARIAL CLEARED — ready to merge.

Out of band (deferred to follow-up)

Documentation

Synced CLAUDE.md annotations for the v0.31.7 doctor wave: refreshed src/core/repo-root.ts, src/commands/check-resolvable.ts, src/commands/routing-eval.ts, and src/commands/doctor.ts to cover the read-path/write-path split, tier-0 \$GBRAIN_SKILLS_DIR override, autoDetectSkillsDirReadOnly install-path fallback, multi-file resolver merge (37/224 → 200/224 reachable on the reference OpenClaw layout), the D6 --fix install-path refuse-write safety gate, and the graph_coverage zero-entity short-circuit + canonical gbrain extract all hint. Also refreshed the test inventory with test/repo-root.test.ts (20 cases) and the new test/resolver-merge.test.ts (8 cases). CHANGELOG.md already covered the wave and was preserved untouched. llms.txt + llms-full.txt regenerated to match.

🤖 Generated with Claude Code

garrytan-agents and others added 20 commits May 9, 2026 21:12
OpenClaw deployments typically have AGENTS.md at the workspace root as the
real skill dispatcher (200+ entries), while gbrain skillpacks install a
thin skills/RESOLVER.md (~40 entries). The previous first-match-wins policy
meant check-resolvable only saw the thin RESOLVER.md, reporting 187 skills
as 'unreachable' when they were fully routed in AGENTS.md.

Now: check-resolvable collects entries from ALL resolver files across both
the skills directory and its parent. Entries are deduped by skillPath
(first occurrence wins). The combined content is also passed to the
routing-eval (Check 5) so routing fixtures see the full trigger index.

New function findAllResolverFiles() in resolver-filenames.ts returns all
matching files instead of just the first. findResolverFile() is unchanged
(backward-compatible for callers that need a single path).

Before: 37/224 reachable (our deployment)
After:  200/224 reachable (remaining 24 are genuine gaps)

Tests: 8 new (findAllResolverFiles + checkResolvable merge behavior)
Closes #530.

`graph_coverage` measures `link_coverage` (fraction of entity pages with
inbound links) and `timeline_coverage` (fraction with timeline entries).
Both formulas divide by entity-page count.

For markdown-only brains (journals, wikis, notes — Karpathy's original
LLM Wiki use case) the entity count is 0, so coverage is structurally
undefined. The check still reported 'warn: 0%' under that condition,
which:
1. Brain owners cannot satisfy without indexing code/entities
2. Doctor's hint references stale commands (`link-extract` /
   `timeline-extract` were renamed to `extract` in v0.22)
3. Adds noise to compliance/health automation gating on doctor exit

Fix: detect entity-page count via SQL. If 0, mark check 'ok' with explanation.
Otherwise keep existing logic but update hint to current `gbrain extract all`.

Tested on Nous AGaaS production wiki: 2533 markdown pages, 100% embedded,
6086 wikilinks, 1964 timeline entries — 0 entity pages — graph_coverage
correctly clears.
The graph_coverage hint and the link-extraction.ts header comment
still referenced `gbrain link-extract` / `gbrain timeline-extract`,
which were consolidated into `gbrain extract <links|timeline|all>` in
v0.16. Following the consolidation in #536's resolution (which fixed
the doctor hint to `gbrain extract all`), this commit removes the last
stale reference in `src/core/link-extraction.ts`'s header comment.

Originally PR #376 by @FUSED-ID. The doctor.ts portion of #376 is
absorbed by #536's richer warn message; this commit lands #376's
`link-extraction.ts` portion only.

Co-Authored-By: Leon-Gerard Vandenberg <FUSED-ID@users.noreply.github.com>
IRON-RULE regression guard for PR #376 + #536's graph_coverage hint
fix (locked in v0.31.7 eng-review). The removed verbs `gbrain
link-extract` and `gbrain timeline-extract` were consolidated into
`gbrain extract <links|timeline|all>` in v0.16 but the hint kept
suggesting them for ~30 releases. Pin the user-facing copy at the
source-string level so a future edit can't silently re-regress.

Structural assertion in the existing `doctor command` describe block,
matching the file's existing `frontmatter_integrity` / `rls_event_trigger`
pattern. No DB-fixture infrastructure needed.
`gbrain doctor` reported 36 routing-miss/ambiguous warnings against the
v0.25.1 wave skills (book-mirror, article-enrichment, strategic-reading,
concept-synthesis, perplexity-research, archive-crawler, academic-verify,
brain-pdf, voice-note-ingest). Each skill's frontmatter declared 4-5
triggers, but only the first ever made it into RESOLVER.md's hand-curated
rows. The structural matcher couldn't find any specific phrase for
realistic user intents, so requests fell through to broader parents
(`ingest`, `enrich`, `data-research`).

Pulled the missing triggers from each skill's `triggers:` frontmatter
into the matching RESOLVER.md row. Converted media-ingest's prose row
to quoted triggers so the matcher actually sees them. Added
`"summarize this book"` to media-ingest (covers a book-mirror
disambiguation fixture). Marked article-enrichment + perplexity-research
fixtures with `ambiguous_with` for the parent skills they intentionally
chain with — RESOLVER.md's preamble explicitly documents that skills are
designed to chain, so this is acknowledging the truth, not papering over
a bug.

Result: 36 routing warnings → 0. resolver-test/check-resolvable/
routing-eval suite: 140/0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adapts the install-path resolution from PR #128 (TheAndersMadsen) into
the existing 5-tier autoDetectSkillsDir architecture. Two new code paths,
read-path-only by design:

1. Tier-0 $GBRAIN_SKILLS_DIR explicit operator override on the SHARED
   autoDetectSkillsDir. Safe for both read and write paths because the
   operator explicitly set the var — opt-in retargeting is fine.

2. New autoDetectSkillsDirReadOnly() function for READ-ONLY callers
   (gbrain doctor, check-resolvable, routing-eval). Wraps the shared
   detect; on null, walks up from fileURLToPath(import.meta.url) gated
   by isGbrainRepoRoot() so unrelated repos along the install path
   can't false-positive.

The split is the architectural fix for a write-path regression risk
codex outside-voice review surfaced (eng-review D5): adding the
install-path fallback to the SHARED resolver would let `gbrain skillpack
install` from `~` silently target the bundled gbrain repo's skills/
instead of the user's actual workspace. Three write-path call sites stay
on the original autoDetectSkillsDir; three read-path call sites switch
to the new readOnly variant.

Closes the install-path footgun for hosted-CLI installs:
`bun install -g github:garrytan/gbrain && cd ~ && gbrain doctor` now
finds the bundled skills/ instead of warning "Could not find skills
directory."

Test surface: 8 new cases in test/repo-root.test.ts covering tier-0
valid/invalid/precedence, install-path walk, isGbrainRepoRoot gate
(via primary-success-no-drift assertion), AUTO_DETECT_HINT updates,
and the D5 regression guard that pins the read-path/write-path split.

Co-Authored-By: Anders Madsen <TheAndersMadsen@users.noreply.github.com>
Promotes headline from "doctor stops crying wolf about unreachable
skills on OpenClaw" to the assembled wave's narrative: every doctor
false-positive class on disk today, plus the install-path footgun
that bit every hosted-CLI user.

Numbers-that-matter table expanded to 6 rows covering all 5 PRs.
Itemized-changes section grouped by sub-wave: resolver merge,
RESOLVER.md trigger sync, graph_coverage zero-entity, stale verb
hint fix, install-path resolver. Contributors named explicitly:
@mayazbay, @psperera, @FUSED-ID, @TheAndersMadsen. "For contributors"
section flags the new SkillsDirSource variants and the read-path /
write-path split as the canonical pattern for future fallback
additions.
Wraps up the v0.31.7 doctor-fix wave:

- VERSION + package.json: 0.31.1.1-fixwave -> 0.31.7
- llms-full.txt: regenerated against the expanded v0.31.7 CHANGELOG
  entry (committed bundle drift caught by test/build-llms.test.ts)
- test/check-resolvable-cli.test.ts: update the REGRESSION-GATE for
  empty-cwd no_skills_dir error to reflect v0.31.7's intentional
  behavior change. The install-path fallback in autoDetectSkillsDirReadOnly
  now finds the bundled skills/ from any cwd inside the gbrain repo,
  so the test asserts source: 'install_path' instead of error: 'no_skills_dir'.
  This is the wave's headline capability ("doctor finds itself on every
  deployment shape") rather than a regression.

Pre-existing flake unrelated to this wave: BrainRegistry — lazy init >
empty/null/undefined id routes to host fails on machines that have
~/.gbrain/config.json present (the test assumes test env has none).
Reproduces on master before this wave landed; not a v0.31.7 regression.
Filed for follow-up in next maintainer hygiene sweep.
Brings v0.31.2's `gbrain sync --strategy code` no-longer-hangs fix
(#773) into the wave assembly. CHANGELOG layered: v0.31.7 entry stays
on top, master's v0.31.2 entry below it. VERSION + package.json keep
v0.31.7 (highest semver wins).

No code-conflict resolution needed — master's changes touch
src/core/sync.ts, src/core/chunkers/code.ts, src/commands/import.ts,
src/commands/sync.ts plus two new test files; this wave touches
src/core/repo-root.ts, src/commands/doctor.ts, src/commands/check-resolvable.ts,
src/commands/routing-eval.ts, src/core/link-extraction.ts, plus
RESOLVER.md and docs. Zero file overlap.
Brings v0.31.3's stdio MCP graceful cleanup + engine-aware auth/admin
SQL fix (#801, closes #413/#446) into the wave assembly. Same merge
shape as the v0.31.2 absorption: VERSION + package.json keep v0.31.7
(highest semver wins); CHANGELOG layers v0.31.7 above v0.31.3.

Zero file overlap with v0.31.7's resolver/doctor/repo-root surface.
v0.31.3 touches src/commands/serve.ts, src/core/oauth-provider.ts,
src/mcp/http-transport.ts, src/core/migrate.ts, plus 5 new test files.
This wave touches src/core/repo-root.ts, src/commands/doctor.ts,
src/commands/check-resolvable.ts, src/commands/routing-eval.ts,
src/core/link-extraction.ts, plus RESOLVER.md and docs.
Codex adversarial review of v0.31.7 caught a HIGH that the eng review
missed (D6 lock during /ship): the read-path-only architecture for the
install-path fallback is leaky because TWO of the three "read-only"
callers (doctor, check-resolvable) actually have write modes via --fix
that call autoFixDryViolations() and writeFileSync to SKILL.md files.
A user running `cd ~ && gbrain doctor --fix` with no skills/RESOLVER.md
up the cwd tree would resolve via the install-path fallback to the
bundled gbrain repo and silently rewrite the install-tree skills —
exactly the regression D5's split was supposed to prevent.

Fix: when --fix is requested and the resolved skills dir came from the
install-path source, refuse with a clear error pointing at GBRAIN_SKILLS_DIR
/ OPENCLAW_WORKSPACE / --skills-dir as explicit overrides. The read parts
of doctor and check-resolvable continue to benefit from the install-path
fallback (the v0.31.7 capability headline); only --fix is gated.

Plus a MEDIUM consistency fix codex flagged: routing-eval was still
single-file-only while check-resolvable does multi-file merge across
skills/RESOLVER.md + ../AGENTS.md. On OpenClaw layouts this caused
routing-eval and check-resolvable to disagree on what's routable.
routing-eval now uses the same findAllResolverFiles + content-merge
pattern as check-resolvable, so all three commands see the same
trigger index.

Test coverage: D6 regression guard in test/check-resolvable-cli.test.ts
spawning a real subprocess from an empty tempdir (no env, no cwd
fallback) and asserting --fix refuses with the correct stderr message.

Co-Authored-By: Codex (outside-voice review) <noreply@openai.com>
CLAUDE.md updates only. CHANGELOG.md was already authored by /ship and was left untouched.

- src/core/repo-root.ts annotation: read-path/write-path split, tier-0 GBRAIN_SKILLS_DIR override, autoDetectSkillsDirReadOnly install-path fallback, D6 --fix safety gate.
- src/commands/check-resolvable.ts annotation: multi-file resolver merge across skills dir + parent (37/224 -> 200/224 reachable on the reference OpenClaw layout), install-path read-only fallback, D6 --fix gate.
- src/commands/routing-eval.ts annotation: same multi-file merge as check-resolvable; v0.25.1 RESOLVER.md trigger sync.
- src/commands/doctor.ts annotation: switched to autoDetectSkillsDirReadOnly so 'cd ~ && gbrain doctor' finds bundled skills via install-path fallback; --fix D6 install-path refuse-write gate; graph_coverage zero-entity short-circuit + canonical 'gbrain extract all' hint with regression-test pin.
- Test inventory: replaced bare regression-v0_16_4 line with explicit test/repo-root.test.ts entry (20 cases - 12 existing + 8 new D3/D5) and new test/resolver-merge.test.ts entry (8 cases).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Brings v0.31.4's takes-quality eval harness + takes-weight-rounding
backfill (#795) into the wave assembly. v0.31.4 added the
takes_weight_grid doctor check, ~15 new test files for the
takes-quality-eval module, the takes-weight-rounding migration v48,
and a takes-vs-facts design doc.

One conflict, one resolution:

test/doctor.test.ts — both v0.31.7 and v0.31.4 added new tests at
the same insertion point in the existing `doctor command` describe
block. Resolution: keep BOTH. v0.31.7's IRON-RULE regression test
for the graph_coverage hint stays at line 178+; v0.31.4's 5 new
takes_weight_grid tests follow it. Verified: 22/22 doctor tests
pass.

VERSION + package.json kept at 0.31.7 (highest semver wins).
CHANGELOG: master's v0.31.4 commit (#795) didn't include a CHANGELOG
entry, so my v0.31.7 entry remains at top followed by master's
v0.31.3 entry — chronology preserved.

Zero file overlap between v0.31.7's resolver/doctor surface and
v0.31.4's takes-quality surface beyond the test/doctor.test.ts
insertion-point collision.
CI's test-shard.sh was including *.serial.test.ts files in the parallel
shard runs, which broke voyage-multimodal.test.ts: 18 of its 22 tests
failed in CI shard 2 because eval-takes-quality-runner.serial.test.ts
ran before it in the same bun-test process and leaked its mock.module()
substitution of src/core/ai/gateway.ts. The leaked mock omitted
embedMultimodal and resetGateway, so voyage-multimodal saw `undefined
is not a function` everywhere it touched the gateway.

Locally `bun run test` (run-unit-parallel.sh → run-unit-shard.sh)
already excludes *.serial.test.ts and runs them via `bun run test:serial`
in their own pass with --max-concurrency=1. Master ran green there;
only CI's matrix shards exposed the leak. The runner.serial test file's
own header comment explicitly calls out this exact cross-file mock
leak — the quarantine was the design, CI just wasn't honoring it.

Three changes:

1. scripts/test-shard.sh — exclude *.serial.test.ts and *.slow.test.ts
   from the find expression, mirroring scripts/run-unit-shard.sh.

2. .github/workflows/test.yml — add a `test-serial` sibling job that
   runs `bun run test:serial`. Keeps serial tests gating CI without
   merging them back into the parallel shards.

3. test/scripts/test-shard.test.ts — regression test pinning the three
   exclusion clauses (serial, slow, e2e) so a future refactor that
   drops one of them fails loud rather than silently re-introducing
   the cross-file mock leak.

Verified locally:
- shard 2 reproduction: 18 voyage-multimodal failures → 0 (1 unrelated
  env-dependent perf flake remains, won't fail on CI)
- bun run test:serial: 189/190 pass (1 unrelated env-dependent
  BrainRegistry flake from ~/.gbrain/config.json presence)
- typecheck + check:test-isolation clean
The verify gate's check:test-isolation flagged test/scripts/test-shard.test.ts
because the JSDoc comment contained the literal string 'mock.module()'
which matches R2's grep regex 'mock\.module[[:space:]]*\('. The file
itself doesn't use mock.module — it just describes why the linter rule
exists in human-readable prose.

Rephrased to avoid the trailing parens. The regex requires the open
paren, so 'bun's module-mocking primitive' instead of 'mock.module()'
is invisible to the linter while preserving meaning for the next
maintainer who reads the test.
Brings v0.31.4.1 (#815: VERSION/package.json alignment + 4-segment
version mandate) and v0.31.6 (#796: extract facts during sync,
real-time hot memory) into the wave assembly.

Conflicts resolved:

1. VERSION + package.json — kept 0.31.7 (highest semver wins).

2. scripts/test-shard.sh — took master's version. v0.31.4.1 (#815)
   independently shipped the same .serial.test.ts exclusion fix I
   added in 45c4004, plus a richer --dry-run-list flag this branch
   didn't have. Master's shape supersedes mine.

3. .github/workflows/test.yml — took master's version. Master added
   the serial-test step inside shard 1 with `if: matrix.shard == 1`
   guarding it; mine added it as a sibling `test-serial` job. Master's
   shape uses one less runner and is now the canonical pattern.

4. CHANGELOG.md — kept v0.31.7 entry on top, layered v0.31.4.1 below
   it followed by v0.31.3 (chronology preserved). Master never wrote
   a v0.31.6 entry; the v0.31.6 commit reused the v0.31.4.1 section.

5. test/doctor.test.ts — auto-merged cleanly. v0.31.6's
   `facts_extraction_health` test (line 110+) lives alongside this
   wave's IRON-RULE graph_coverage hint test.

6. test/scripts/test-shard.test.ts (mine, deleted) — superseded by
   master's test/scripts/test-shard.slow.test.ts which ships a more
   thorough regression suite that actually drives `--dry-run-list`
   to verify the exclusion against real shard buckets.

7. scripts/check-test-isolation.allowlist — added
   test/migration-orchestrator-v0_31_0.test.ts. Master's v0.31.6
   shipped this file with R1 process.env mutation in beforeEach/
   afterEach but didn't allowlist it; the file follows the existing
   try/finally restore pattern (functionally correct, just doesn't
   use withEnv()). Allowlisting per the linter's own help text for
   baseline files; sweep candidate for a future env-pattern PR.

Verified post-merge:
- typecheck clean
- bun run verify clean
- bun run check:test-isolation clean
- 102/0 fail across resolver/doctor/repo-root/check-resolvable surfaces
- live gbrain check-resolvable: 39/39 reachable, ok=true
…procedure

After several merges from master where VERSION + package.json +
CHANGELOG.md drifted out of sync (each merge hit conflicts on those
three files; auto-merge sometimes resolved silently in the wrong
direction), CLAUDE.md gets an explicit drift-recovery checklist + a
3-line paste-ready audit command anyone can run.

Three additions to the existing "Version locations" section:

1. **Mandatory audit command** — three echo lines that print VERSION,
   package.json version, and the top CHANGELOG header. All three MUST
   match the wave's `MAJOR.MINOR.PATCH.MICRO`. Designed for paste-after-
   every-merge use.

2. **Merge-conflict recovery procedure** — exact sed/echo patterns for
   resolving VERSION + package.json + CHANGELOG conflicts, in the order
   to apply them. Names the anti-pattern (mixing `git checkout --ours`
   on the trio) that's bitten us before.

3. **Pre-push gate** — re-run the audit before `git push` of any merge
   commit. /ship Step 12 catches drift but only if you actually run
   /ship; manual pushes skip the check.

Confirmed consistent at d361482, 7e8f696, 65a5994 (every merge
commit on this branch). The doc gap was the rules being too loose,
not the rules being wrong — this beefs up the procedural side so the
next merge can't silently desync.
CI failed on the build-llms generator test because CLAUDE.md edited
in fe050ae (version-consistency procedure) shipped without a
matching `bun run build:llms` regen. The committed llms-full.txt was
77 lines short of fresh generator output, and test/build-llms.test.ts
caught the drift in CI shard 1.

Two changes:

1. llms.txt + llms-full.txt — regenerated to match current CLAUDE.md.

2. CLAUDE.md — strengthened the "Auto-derived" entry for llms.txt /
   llms-full.txt with explicit "every CLAUDE.md edit chases with
   `bun run build:llms` in the same commit" wording. Notes that
   `verify` doesn't run the build-llms test, only the full unit
   suite does, so a clean typecheck is NOT enough to know you can
   push after touching CLAUDE.md.

This is now the third time this has bitten the wave. The previous
"Auto-derived" entry said the right thing but was buried in a list;
elevating it to imperative voice with a count of past regressions
should make the next CLAUDE.md edit hard to land without the chaser.
@garrytan garrytan merged commit 8784034 into master May 10, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FP] graph_coverage 0% on markdown-only brains (code-edge metric structurally inapplicable)

3 participants