fix(build/orchestrator): re-align state.phases when feature-review inserts mid-array (build skill v1.22.3)#43
fix(build/orchestrator): re-align state.phases when feature-review inserts mid-array (build skill v1.22.3)#43anbangr wants to merge 5 commits into
Conversation
…serts new phases When feature-review returns FEATURE_NEEDS_PHASES, plan-mutator.appendFeaturePhases splices new "### Phase N.review-K" headings under the named feature heading (not at end-of-file). The parser walks top-to-bottom assigning phaseIndex = phases.length, so every downstream PhaseState index shifted. cli.ts:9063 used reparsed.phases.slice( oldPhaseCount), which assumed append; it captured the wrong tail (the actually-last phase of the LAST feature) and pushed it as the "new" phase. state.phases[k] became stale for every k >= insertion point AND gained a duplicate of the last phase with a conflicting .index. The orchestrator then ran the new review against a state slot whose .number still described the OLD phase at that index, which is the "state.phases[3].number = 2.1 but gemini.outputPath = phase-1.review-1-*" symptom. Fix: - state.ts: new mergeReparsedPhases(state, reparsed) — diff-merges by phase.number, preserving PhaseState identity (status, gemini outputs, codexReview records) and rewriting per-phase .index to match parser order. Syncs featureState.phaseIndexes from the parser. Drops orphan state phases (no longer in plan) with a warning. - cli.ts (phases_added branch): replaced the slice() merge with mergeReparsedPhases. Removed the now-redundant featureDef rebinding gymnastics — featureState.phaseIndexes is synced inside the merge. - cli.ts (resume path): added repair guard after loadState + backfillKindFromPlan. When loaded state.phases disagrees with the freshly parsed plan (length OR per-index number mismatch), re-align via the same merge. Heals on-disk corruption left by the pre-fix code paths automatically — users hit by the bug recover by re-running. Regression test (build/orchestrator/__tests__/phases-added-merge.test.ts) pins: - Mid-array splice with downstream PhaseState identity preserved (gemini, committedAt, status all survive across the merge). - Append at last feature (the OLD math also handled this; the new math must too). - Multiple review phases inserted in one verdict. - Orphan detection + drop. Suite: 1297 pass / 2 skip / 1 timing-flake in sub-agents.test.ts (passes in isolation, unrelated to this change). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address informational findings from Testing and Maintainability specialists during /ship pre-landing review. All findings auto-fixed except the feature-array asymmetry (M3), which the user approved as Option B (extend report shape now rather than defer). state.ts (mergeReparsedPhases): - Extract arePhasesAligned() predicate so the resume-time trigger and the merge function share a single agreement contract (M1). - Backfill kind on existing PhaseStates symmetrically with the new-entry branch (default to "code" when parser supplies none) so post-merge invariant is uniform (M4). - Sync features by number, not array position, surfacing addedFeatures and orphanedFeatures in MergeReparsedPhasesReport. Production never adds or removes top-level features today, but the asymmetry was a latent gap and the report now reports both deltas (M3 / option B). - Document last-write-wins semantic on duplicate phase numbers and the hands-off contract on currentPhaseIndex / currentFeatureIndex. cli.ts: - Use arePhasesAligned() in the resume guard instead of an inline predicate. - Extract logMergeReport(report, context) helper so the resume-time repair and the phases_added path emit identical, greppable [plan] log shapes with a context tag distinguishing the call site (M2). tests: - Add makeSentinelInvocation() typed helper to replace `as unknown as` double-casts on SubAgentInvocation sentinels (M_test). - Pin duplicate-number collapse semantics: plant distinct sentinels on two same-numbered PhaseStates and assert the second one (last-write) wins (T1). - Assert mergeReparsedPhases does NOT mutate currentPhaseIndex / currentFeatureIndex (T3 — cursor preservation contract). - New test for feature-level orphan/added drift surfaced via the extended report shape. - Update existing assertions to verify addedFeatures and orphanedFeatures are empty in the happy-path cases. Suite: 1300 pass / 2 skip / 0 fail (4066 expects). The earlier sub-agents timing flake also cleared on this run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…number defense Address 3 HIGH findings from cross-model adversarial review (Claude subagent + Codex structured review). The biggest concern: the resume-time auto-repair I introduced in the previous commit preserves runtime artifacts (gemini.outputFilePath, codexReview, committedAt) under whatever `number` is stored in state.json. But state.json corrupted by the pre-fix /build has the artifacts persisted at slots whose `number` is wrong — by-number merge re-attributes those artifacts to the wrong phase. If a downstream --mark-phase-committed then marks the wrong phase done, the real phase silently skips its real work. Same silent-corruption shape as the bug being fixed. Resolution per user-approved option A: the resume guard becomes fail-closed. On detected disagreement, gstack-build now exits with code 2 and prints a clear remediation message (delete state, --no-resume, or manually realign). Users decide what to keep; no silent heal. The in-run phases_added merge stays auto: there, state.phases[i] genuinely holds the prior phase's runtime state because the splice hasn't been re-parsed yet. By-number merge is the correct semantic. state.ts: - arePhasesAligned() now also checks feature-level alignment (count + per-index number). Resume desync of features alone (user renamed a `## Feature N:` heading) no longer slips past the guard. - mergeReparsedPhases() throws on duplicate phase numbers from the parser, defending against aliasing where two same-numbered phases would share a single PhaseState object across two array slots. - Document the contract: cli.ts does NOT auto-invoke the merge on resume; the function is wired only on the in-run phases_added path. cli.ts: - Resume guard: on disagreement, print a remediation block and exit 2 via the existing setupFailed flow. No mergeReparsedPhases call on resume. The error message names all four counts (state vs parsed, for phases and features), names the root cause (pre-fix gstack or hand-edit), and lists three recovery paths. - statePath imported so the remediation message can name the on-disk state file explicitly. - logMergeReport() kept; doc updated to note "resume" context tag is reserved for future use (not currently wired now that resume is fail-closed). tests: - New describe block for arePhasesAligned covering phase-count drift, per-index phase-number drift, feature-count drift, and feature-number drift with phases preserved. - New test for duplicate-number rejection (asserts the thrown error). - Resume-path test renamed and rescoped: it now exercises the PURE function's behavior (the shape of the merge if called) without claiming this is the runtime recovery path. Comment makes the contract clear. Cross-model adversarial review findings addressed: - Claude F1 / Codex Med#2 (failedAtPhase desync): moot — fail-closed resume means the merge never runs on shifted on-disk state. - Codex High#1 (resume preserves shifted artifacts): fixed by fail-closed. - Codex High#2 (duplicate phase aliasing): fixed by parser-side duplicate detection. - Codex High#3 / Claude F11 (arePhasesAligned ignores features): fixed. - Codex Med#1 (corrupted .index survives valid JSON loads): moot on resume (fail-closed) and impossible on in-run merge (.index is rewritten to i unconditionally). - Codex Med#2 / Claude F5 (last-write-wins duplicate collapse drops recoverable state): moot — resume is fail-closed, so any disk duplicates surface as a fail-closed error rather than silent collapse. Suite: 1305 pass / 2 skip / 1 pre-existing flake in sub-agents.test.ts (timeout-fix integration block — same shape on main, unrelated to this diff; passes intermittently). Regression test now: 12 pass / 84 assertions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signal that the build skill's orchestrator runtime changed in this branch (FEATURE_NEEDS_PHASES merge fix + fail-closed resume guard). No top-level VERSION bump per Fork versioning rule — fork-local skill work bumps skill frontmatter only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cross-reference the FEATURE_NEEDS_PHASES mid-array merge fix in the two canonical docs that future debuggers / orchestrator readers go to first. docs/orchestrator-state-machine.md - Add Inv F (phases-array alignment with parsed plan), naming arePhasesAligned as the predicate and pointing at the two paths that can violate it (in-run splice, cross-run state.json desync). - Add §9: mergeReparsedPhases contract — strategy, last-write-wins on pre-existing duplicates, parser-side duplicate rejection, hands-off cursor, in-run-only wiring. Names the regression test file. - Add §10: resume-time alignment guard (fail-closed) — exit code 2 conditions, remediation paths printed to the user, and the silent-corruption rationale for refusing to auto-heal on resume. - Refresh the §11 glossary line numbers (logMergeReport pushed every cli.ts pointer down) and add arePhasesAligned, mergeReparsedPhases, and the resume-guard call site. build/README.md - Add a "Resume-time state/plan desync (exit 2)" subsection to Failure Handling describing the new fail-closed error, the three recovery options the orchestrator prints, and a back-reference to state-machine §10 for the rationale. No code changes. Diataxis quadrant: reference (state machine) + how-to (recovery in build/README.md). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Closing — PR #42 (#42) landed the same FEATURE_NEEDS_PHASES merge fix earlier today (2026-05-18T13:40Z) while this branch was in development. Both branches independently identified the same root cause (slice-tail merge assumes new phases append, but PR #42's helper is This PR's additive value over what's already landed:
These will be filed as a follow-up PR that compounds on PR #42's landed code, not a competing implementation. 🤖 Generated with Claude Code |
Summary
Fixes a silent state-corruption bug in
gstack-buildwherestate.phases(the persisted JSON array of per-phase runtime state) gets mis-aligned with the parsed plan markdown when feature-review returnsFEATURE_NEEDS_PHASESand the named feature is not the last feature in the plan.Root cause: build/orchestrator/cli.ts:9063 (pre-fix) used
reparsed.phases.slice(oldPhaseCount)to identify "new" phases, assuming new entries append to the END of the parser's array. They don't.appendFeaturePhasessplices new### Phase N.review-Kheadings inside Feature N's body (before the next## Featureheading). The parser walks top-to-bottom assigningphaseIndex = phases.length, so every PhaseState downstream of the insertion point shifts in the parser's view while the slice captures the wrong tail. Result:state.phases[i].numberbecomes stale for every i ≥ insertion point ANDstate.phasesgains a duplicate of the actually-last phase with a conflicting.index. The orchestrator runs the new review against a slot whose.numberstill describes the OLD phase at that index — exactly thestate.phases[3].number = "2.1"butgemini.outputPath = "phase-1.review-1-*"symptom in the bug report.Fix: Replace the slice-based merge with
mergeReparsedPhasesin build/orchestrator/state.ts — a diff-merge by phasenumberthat splices new phases intostate.phasesat the parser-assigned position and preserves PhaseState identity (status,gemini,codexReview,committedAt). Add a fail-closed resume guard in cli.ts: on detected state/plan desync at startup, exit with code 2 and print a remediation message instead of silently re-attributing runtime artifacts to the wrong phase.Bumps:
buildskill frontmatter1.22.2 → 1.22.3. No top-level VERSION bump (Fork versioning rule).Test Coverage
Coverage: 100%. All paths in
mergeReparsedPhases,arePhasesAligned, and the resume guard have direct tests in build/orchestrator/tests/phases-added-merge.test.ts (12 tests / 84 assertions).Coverage gate: PASS (100%).
Pre-Landing Review
9 INFORMATIONAL findings from 4 specialists (Testing, Maintainability, Security, Performance). 0 CRITICAL.
as unknown asdouble-casts; duplicate-collapse semantic now pinned with distinct sentinels; cursor-preservation assertions added; feature-asymmetry test added.arePhasesAlignedextracted,logMergeReporthelper, kind defaulting symmetrized, comment ordering), 1 ASK answered as B (extend report shape withaddedFeatures/orphanedFeaturesnow rather than defer).PR Quality Score: 5.5/10 (penalized for surfaced informational findings; all addressed).
Adversarial Review
Cross-model adversarial review (Claude subagent + Codex
codex exec+ Codexcodex review) surfaced 3 HIGH findings. All addressed in commit e2781eb9:gemini.outputFilePath,codexReview) belong to the work that physically ran at that slot, not to the phase whosenumberwas stored there in corrupted JSON. By-number merge would silently re-attribute work; downstream--mark-phase-committedwould mark the wrong phase done. Resolved per user-approved Option A: resume guard now FAILS CLOSED. Exits with code 2 and prints a 3-option remediation message (--no-resume, delete state, manual realign). User decides what to keep.PhaseStateobject across multiple array slots (Codex High feat(build): TDD integration — Red→Green enforced by state machine (v1.14.0) #2). Defended:mergeReparsedPhasesnow throws on parser-side duplicatenumbervalues.arePhasesAlignedignored feature-level drift (Codex High feat(dual-impl): Gemini + Codex tournament selection with Opus judge (gstack-build v1.15.0) #3 / Claude F11). Extended to check bothstate.phases(length + per-indexnumber) ANDstate.features(length + per-indexnumber).Cross-model agreement on the resume-corruption concern was the strongest signal — both reviewers independently flagged the same silent-corruption shape as the bug being fixed.
CODEX (code review) gate: PASS (after fixes).
Plan Completion
No plan file detected. The branch shipped from a bug report (the user's screenshot/markdown). All items from the bug report's "Proposed fix" section addressed:
number, not slice) ✓saveState+fs.renameSync) ✓The alternative the report mentioned ("change
state.phases[]to be derived not persisted") was explicitly NOT taken — the fix preserves the persistence model and just fixes the alignment.Verification Results
No verification section in the plan (no plan file). Step 8.1 skipped.
Documentation
Doc diff preview
docs/orchestrator-state-machine.md: added Inv F (phases-array alignment with parsed plan) namingarePhasesAlignedas the predicate; added §9mergeReparsedPhasescontract (strategy, last-write-wins on pre-existing duplicates, parser-side duplicate rejection, hands-off cursor, in-run-only wiring); added §10 resume-time alignment guard (fail-closed exit 2, three printed recovery paths, silent-corruption rationale for refusing to auto-heal); refreshed §11 glossary line numbers and addedarePhasesAligned,mergeReparsedPhases, and the resume-guard call site.build/README.md: added Resume-time state/plan desync (exit 2) subsection to Failure Handling describing the new fail-closed error, the three recovery options, and a back-reference to state-machine §10.Docs-state-machine enforcement test passes (37/37). No other docs needed updating — the fix is an internal orchestrator behavior change with no new CLI flags or skills.
Documentation coverage
arePhasesAlignedmergeReparsedPhasesCoverage: all shipped behavior has adequate documentation.
Tests
Full suite: 1305 pass / 2 skip / 1 pre-existing flake (4072 expects).
The 1 flake is in
build/orchestrator/__tests__/sub-agents.test.ts—runConfiguredRoleTask: timeout-fix integrationblock, timing-dependent. Pre-existing: passes when run in isolation, fails under concurrent suite load. Reproduces onmainwithout this branch's diff. Not in scope for this fix.Per /ship Step 5 Test Failure Ownership Triage: classified as pre-existing, skip per Option D (the same flake was already documented during the parent /investigate session).
Test plan
bun test build/orchestrator/__tests__/phases-added-merge.test.ts— 12 pass / 0 failbun test build/orchestrator/__tests__/— 1305 pass / 2 skip / 1 pre-existing flakebun test build/orchestrator/__tests__/docs-state-machine.test.ts— 37/37 pass🤖 Generated with Claude Code