Skip post-codex claude review when no findings#282
Conversation
The post-codex claude review loop exists to catch regressions from fixes applied during the external review phase. When the external review found no actionable issues (CodexDone on first iteration, empty output, or tool disabled), no fixes were made and there is nothing to regress — running the review is wasted time and tokens. runExternalReviewLoop now returns (bool, error) where the bool indicates whether findings were found. runCodexAndPostReview uses this to skip the post-codex review when unnecessary.
Deploying ralphex with
|
| Latest commit: |
b04cf62
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://7c340570.ralphex.pages.dev |
| Branch Preview URL: | https://skip-post-codex-review-no-fi.ralphex.pages.dev |
There was a problem hiding this comment.
Pull request overview
Updates the processor runner’s external-review pipeline to avoid running the post-codex Claude review when the external review produced no actionable fixes, reducing unnecessary latency and token usage.
Changes:
- Changed the external review loop APIs (
runCodexLoop/runExternalReviewLoop) to return ahadFindingsboolean. - Skips the post-codex Claude review (and proceeds directly to finalize) when
hadFindingsis false. - Updates and extends runner tests to reflect the new iteration behavior and skip conditions.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pkg/processor/runner.go | Propagates hadFindings from external review loops and conditionally skips post-codex review. |
| pkg/processor/runner_test.go | Adjusts existing tests for multi-iteration codex loops and adds coverage for the new skip behavior. |
Comments suppressed due to low confidence (2)
pkg/processor/runner.go:780
- When the external review tool returns empty output, the loop logs "%s review returned no output, skipping..." and then
breaks. That break falls through to the end-of-loop log "max %s iterations reached...", which is incorrect/misleading in this path. Consider returning immediately on empty output (or tracking an explicit early-exit reason) so the max-iterations message is only logged when the iteration budget is actually exhausted.
if reviewResult.Output == "" {
r.log.Print("%s review returned no output, skipping...", cfg.name)
break
}
pkg/processor/runner.go:815
- In the session-timeout path the loop
continues before settinghadFindings = true. Because Claude timeouts clear the error/signal, a timed-out eval could still have made working-tree changes, yethadFindingsmay remain false and cause the post-codex review to be skipped. Consider conservatively markinghadFindingswhen a Claude eval times out (or checking git state before/after the timeout) so the safety review isn't skipped when changes may have been applied during a partial session.
// on session timeout, skip response capture and stalemate detection; the session was killed
// before it could finish, so partial output can't be trusted as previous context and
// "no changes" doesn't mean "nothing to fix"
if r.lastSessionTimedOut {
r.log.Print("claude eval session timed out, retrying %s iteration...", cfg.name)
continue
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // skip post-codex claude review when external review found nothing on the first pass. | ||
| // the purpose of this review is to catch regressions from fixes applied during the external | ||
| // review loop — if no findings were reported, no fixes were made and there's nothing to regress. | ||
| if !hadFindings { | ||
| r.log.Print("external review found no issues, skipping post-codex claude review") | ||
| return r.runFinalize(ctx) |
There was a problem hiding this comment.
The skip-path log/comment are misleading when the external review tool is disabled (tool=="none" / CodexEnabled=false): in that case no external review ran, but this still logs "external review found no issues". Consider adjusting the condition or message to distinguish "external review disabled" vs "ran and had no actionable findings" (e.g., return a separate "ranExternalReview" flag or change the message to avoid implying it executed).
- fix misleading comment on runExternalReviewLoop return value to describe actual mechanism (CodexDone signal check) instead of approximation - rename test subcase to clarify intent
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
pkg/processor/runner.go:780
- When
reviewResult.Output == ""the loopbreaks, but then the function falls through tor.log.Print("max %s iterations reached..."), which is misleading (the loop didn’t hit max iterations; it exited due to empty output). Since empty output is treated as “no findings”, it would be clearer to return immediately here (likely withhadFindings=false) and avoid printing the max-iterations message in this path.
if reviewResult.Output == "" {
r.log.Print("%s review returned no output, skipping...", cfg.name)
break
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // codex finds something, but claude eval says "no actionable findings" → CodexDone on first iter | ||
| claude := newMockExecutor([]executor.Result{ | ||
| {Output: "dismissed all findings", Signal: status.CodexDone}, | ||
| }) |
There was a problem hiding this comment.
This test models “codex finds something but Claude dismisses all findings” by having Claude immediately emit CODEX_REVIEW_DONE. However, the default codex evaluation prompt instructs Claude not to emit CODEX_REVIEW_DONE when dismissing findings as invalid (it should provide explanations and let the external tool re-run). As written, this test may not reflect a realistic production flow and can give false confidence about the skip behavior. Consider adjusting the scenario to match the default prompt behavior (dismissal → another external iteration → eventual no-findings signal) and asserting the intended skip logic against that flow.
| @@ -734,15 +745,16 @@ func (r *Runner) runExternalReviewLoop(ctx context.Context, cfg externalReviewCo | |||
| var claudeResponse string // first iteration has no prior response | |||
| var unchangedRounds int // consecutive iterations with no commits (for stalemate detection) | |||
| firstCompleted := false // tracks if any successful eval completed; controls diff scope for external tool | |||
| hadFindings := false // tracks if external review found any issues requiring fixes | |||
There was a problem hiding this comment.
hadFindings is set to true whenever Claude does not signal CODEX_REVIEW_DONE, but the default codex evaluation prompt explicitly instructs Claude to not emit CODEX_REVIEW_DONE when it dismisses findings as invalid (and also when it fixes issues). That means hadFindings currently conflates “fixes were applied” with “findings were dismissed / need re-evaluation”, and post-codex review may still run even when no code changes were made—undercutting the goal of skipping unnecessary post-review runs. Consider tracking whether the external loop actually produced code changes (e.g., compare HEAD and/or diff fingerprint before/after eval rounds, and only run post-codex review if changes occurred), or rename/redefine the boolean to match what it truly represents and adjust the skip condition accordingly.
… findings - rename claude_model -> task_model and --claude-model -> --task-model for symmetry with review_model/--review-model - pass task model to plan mode (was missing) - strip duplicate --model flags from claude_args to prevent CLI errors - fix typo in pkg/config/defaults/config: review_model section had `# claude_model =` example - add processor test for ReviewClaude executor separation (nil fallback + distinct review executor) - add ReviewModel merge test, normalize --flag=value form in docs - update docs (README, CLAUDE.md, llms.txt, custom-providers.md) - update review-claude tests for post-codex-skip behavior added in #282 picks up unmerged follow-up commits from feat/claude-model-config branch
… findings (#284) * feat: rename claude_model to task_model, address unmerged #272 review findings - rename claude_model -> task_model and --claude-model -> --task-model for symmetry with review_model/--review-model - pass task model to plan mode (was missing) - strip duplicate --model flags from claude_args to prevent CLI errors - fix typo in pkg/config/defaults/config: review_model section had `# claude_model =` example - add processor test for ReviewClaude executor separation (nil fallback + distinct review executor) - add ReviewModel merge test, normalize --flag=value form in docs - update docs (README, CLAUDE.md, llms.txt, custom-providers.md) - update review-claude tests for post-codex-skip behavior added in #282 picks up unmerged follow-up commits from feat/claude-model-config branch * fix: address copilot review on PR 284 - stripFlag now handles --flag=value and bare --flag forms (was only handling --flag value space-form), prevents duplicate/orphaned --model in claude_args - add stripFlag tests for equals form, bare flag, repeated occurrences, prefix-only false-match - doc wording: clarify that task_model/review_model are passed as --model to the configured claude_command and may be ignored or implemented by custom wrappers
The post-codex claude review loop catches regressions from fixes applied during external review. When external review found no actionable issues, no fixes were made — running this review wastes time and tokens.
Changes:
runExternalReviewLoopreturns(bool, error)— bool tracks whether findings were foundhadFindingsbecomes true only when claude eval does NOT signal CodexDone (meaning fixes were applied)runCodexAndPostReviewskips post-codex review whenhadFindingsis falseScenarios that skip post-codex review:
noneorCodexEnabled=false)Scenarios that still run post-codex review: