Skip to content

fix(build/orchestrator): test-fixer surfaces hygiene reason instead of "exited N" (Bug M)#110

Merged
anbangr merged 1 commit into
mainfrom
worktree-fix-bug-m-test-fixer-hygiene-attribution
May 27, 2026
Merged

fix(build/orchestrator): test-fixer surfaces hygiene reason instead of "exited N" (Bug M)#110
anbangr merged 1 commit into
mainfrom
worktree-fix-bug-m-test-fixer-hygiene-attribution

Conversation

@anbangr

@anbangr anbangr commented May 27, 2026

Copy link
Copy Markdown
Owner

Summary

Closes Bug MRUN_GEMINI_FIX failure handler at phase-runner.ts:825 was the lone holdout still calling renderRoleStepFailureMessage directly. Same legacy holdout that Bug I (PR #107) closed for RUN_GEMINI_TEST_SPEC; the test-fixer dispatcher was missed in that pass.

Plan: ~/.claude/plans/fixing-plan-bugs-k-through-n-post-pr-108.md

Canonical incident

agnt2-prototype-prodl2-f3-f4-soak-and-backup Phase 3.1 at 03:34 UTC. Codex test-fixer (gpt-5.5) ran for 101 seconds, exited 0 with no commit. Hygiene gate wrapped as exitCode=1 with body "test fixer did not create a new commit". Operator saw ✗ Phase 3.1 ... failed: test-fixer exited 1 — had to grep the hygiene log for the actual reason.

Fix

One-line swap: renderRoleStepFailureMessage("test-fixer", result)geminiExitError("test-fixer", result). geminiExitError extracts the first non-header line of the hygiene body when result.hygieneFailure === true and falls through to the legacy exited N; see <log> shape when no hygiene marker is present.

What changed

File Δ
build/orchestrator/phase-runner.ts +13/-1 (helper swap + 12-line Bug M comment)
build/orchestrator/__tests__/bug-m-test-fixer-hygiene-attribution.test.ts NEW: 5 tests (T-M1..T-M5), all pass

Architectural payoff

Bug Hygiene-aware attribution for Resolved by
I test-spec writer (RUN_GEMINI_TEST_SPEC) PR #107
M test-fixer (RUN_GEMINI_FIX) THIS PR

After Bug M lands, all role-step dispatchers that route through applyMutableAgentHygiene also route through geminiExitError. T-M4 pins the helper-adoption count at ≥4 (primary-impl + test-spec writer + test-fixer + dual-impl/judge) so a future regression shows up at review time.

Out of scope (separate concern)

The deeper architectural issue — Codex agents that report "Files changed:" but don't actually commit (a quality issue, not an orchestrator bug) — is deferred. A future fix would re-run tests after no-commit and detect "tests now green" as a success signal (the impl may already be correct). Bug M only addresses the operator-facing attribution.

Test plan

  • bun test build/orchestrator/__tests__/bug-m-test-fixer-hygiene-attribution.test.ts — 5/5 pass
  • bun test full suite — green
  • T-M5 negative guard ensures renderRoleStepFailureMessage("test-fixer", is gone from phase-runner.ts

Fork-versioning note

Per CLAUDE.md fork rule: fork-local skill work in build/orchestrator/ only. No VERSION/package.json/CHANGELOG.md bump.

🤖 Generated with Claude Code

…f "exited N" (Bug M)

Closes Bug M — `RUN_GEMINI_FIX` failure handler at phase-runner.ts:825
was the lone holdout still calling `renderRoleStepFailureMessage`
directly. Same legacy holdout that Bug I (PR #107) closed for
`RUN_GEMINI_TEST_SPEC`; the test-fixer dispatcher was missed in that
pass.

Canonical incident: agnt2-prototype-prodl2-f3-f4-soak-and-backup
Phase 3.1 (Extract tier resolution into public-rpc-proxy/tier) at
2026-05-27 03:34 UTC. Codex test-fixer (gpt-5.5) ran for 101 seconds,
called `git status --short` multiple times, emitted "Files changed:"
in its output, exited 0 with no commit. applyMutableAgentHygiene's
requireNewCommit gate wrapped the success as exitCode=1 with
hygieneFailure=true and body "test fixer did not create a new
commit". The pre-fix attribution surfaced as

  `✗ Phase 3.1 ... failed: test-fixer exited 1`

— the operator had to grep the hygiene log for the actual reason. Same
shape Bug I closed for test-spec writer; same fix applies.

Fix: swap `renderRoleStepFailureMessage("test-fixer", result)` →
`geminiExitError("test-fixer", result)` at phase-runner.ts:825. The
helper extracts the first non-header line of the hygiene body when
`result.hygieneFailure === true` and falls through to the legacy
`exited N; see <log>` shape when no hygiene marker is present.

Post-fix, four distinct dispatcher families in phase-runner.ts use
geminiExitError: primary-impl, test-spec writer (post-Bug-I),
test-fixer (this PR), and dual-impl/judge. Pinned via T-M4
(`geminiExitError` call count ≥ 4) so a future regression that swaps
any of them back to renderRoleStepFailureMessage shows up at review
time.

Tests (T-M1..T-M5 in
build/orchestrator/__tests__/bug-m-test-fixer-hygiene-attribution.test.ts):

- T-M1: hygieneFailure result → next.error names the hygiene reason
  ("did not create a new commit") and includes the agent log path
  for forensics. Legacy "test-fixer exited N" form MUST NOT appear.
- T-M2: vanilla nonzero exit (no hygiene marker) → legacy "exited N"
  shape preserved. Important: when codex genuinely crashed with
  exit 2, "exited 2" is the right signal — only hygiene-converted
  failures need the new attribution.
- T-M3: static-grep guard that RUN_GEMINI_FIX uses
  geminiExitError("test-fixer", result) literally.
- T-M4: static-grep guard that phase-runner.ts has at least 4
  geminiExitError callers (count pin so a future regression that
  swaps any of them back shows up at review time).
- T-M5: negative guard — `renderRoleStepFailureMessage("test-fixer",`
  must NOT appear anywhere in phase-runner.ts.

5/5 pass.

Architectural payoff:

| Bug | Hygiene-aware attribution for | Resolved by |
|---|---|---|
| I | test-spec writer (RUN_GEMINI_TEST_SPEC) | PR #107 |
| M | test-fixer (RUN_GEMINI_FIX) | THIS PR |

After Bug M lands, all role-step dispatchers that route through
applyMutableAgentHygiene also route through geminiExitError. The
"agent did nothing" symptom now carries a one-line reason naming the
actual gate it tripped, for every role.

Note on related architectural issue: the test-fixer hygiene gate firing
on no-commit-but-output-claimed-completion is a separate Codex quality
issue (the agent reported "Files changed:" but didn't actually commit
anything). The orchestrator fix here surfaces the right error
attribution; the agent-quality fix (re-run tests after no-commit and
detect "tests now green" as a success signal) is a deeper change
deferred to a follow-up. Bug M only addresses the operator-facing
attribution.

Plan reference: ~/.claude/plans/fixing-plan-bugs-k-through-n-post-pr-108.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@anbangr anbangr merged commit 3e4ab9e into main May 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant