Skip to content

fix: improve Codex CLI invocation and reviewer agent prompting#124

Merged
maxritter merged 1 commit intomainfrom
dev
Mar 31, 2026
Merged

fix: improve Codex CLI invocation and reviewer agent prompting#124
maxritter merged 1 commit intomainfrom
dev

Conversation

@maxritter
Copy link
Copy Markdown
Owner

Changes

Based on reading the official Codex plugin commands (adversarial-review.md, status.md, result.md):

Codex CLI Invocation Improvements

  • Use result --json after status --waitstatus returns job metadata, result returns the actual review content (verdict, findings, details). Previously we only used status --wait --json which doesn't include the full findings payload.
  • Add specific adversarial focus text — instead of generic "Review changes against spec", include the plan's goal/summary and direct Codex to challenge: wrong assumptions, missing edge cases, scope gaps, design failures under real-world conditions. Different focus per context (feature plan, bugfix plan, code changes).

Reviewer Agent Context

  • Add user_request to changes-review prompt — the changes-review agent was missing the original user request that invoked /spec. Without it, the agent can't verify if the implementation matches what the user actually asked for. The spec-review agent already had this field.

No changes needed

  • Agent token budgets and prompting are already well-structured (strict tool call limits, mandatory Write as final action, token discipline notes)
  • Companion path detection via glob is the right approach (no CLAUDE_PLUGIN_ROOT available outside plugin command context)

- Use 'result --json' after 'status --wait' to get full review findings
- Add specific adversarial focus text to Codex invocations (goal, assumptions, edge cases)
- Add user_request to changes-review agent prompt for requirement traceability
- Bugfix-plan Codex focus: root cause validity, regression risk, symptoms vs cause
@vercel
Copy link
Copy Markdown

vercel bot commented Mar 31, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
pilot-shell Ignored Ignored Mar 31, 2026 11:19am

Request Review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 31, 2026

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (6)
  • launcher/banner.py is excluded by !launcher/**
  • pilot/commands/spec-bugfix-plan.md is excluded by !pilot/**
  • pilot/commands/spec-plan.md is excluded by !pilot/**
  • pilot/commands/spec-verify.md is excluded by !pilot/**
  • pilot/scripts/mcp-server.cjs is excluded by !pilot/**
  • pilot/scripts/worker-service.cjs is excluded by !pilot/**

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including **/dist/** will override the default block on the dist directory, by removing the pattern from both the lists.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6a079d4f-bc89-4c22-8f87-b01ea3373266

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dev

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link
Copy Markdown

claude bot commented Mar 31, 2026

Claude finished @maxritter's task in 1m 43s —— View job


Code Review — PR #124

  • Gather context and read changed files
  • Review pilot/commands/spec-plan.md
  • Review pilot/commands/spec-bugfix-plan.md
  • Review pilot/commands/spec-verify.md
  • Review pilot/scripts/mcp-server.cjs
  • Review pilot/scripts/worker-service.cjs
  • Post final review

Summary

This is a well-scoped, focused PR that correctly fixes two independent issues: the Codex CLI two-step result collection, and the missing user_request context in the changes-review agent. The rationale is sound and the changes are internally consistent across all three spec files.


Findings

pilot/commands/spec-bugfix-plan.md, spec-plan.md, spec-verify.md — Two-step Codex result collection

The split of status --waitresult --json is correct per the Codex CLI contract, and the richer adversarial focus text (wrong root cause, regression risk, scope gaps) will produce meaningfully better findings than the generic "Review changes against spec".

One gap: after result <jobId> --json there is no error handling if the result command itself fails (e.g. job data purged, network issue, non-zero exit). The status step has handlers for waitTimedOut, cancelled, and non-zero exit, but the subsequent result call is bare. A minimal handler — log the error and continue without Codex findings — would make the flow consistent.

Fix this →

pilot/commands/spec-verify.mduser_request added to changes-review agent

Correct fix. Without the original user request, the changes-review agent can only verify compliance against the plan file, not against what the user actually asked for. The updated instruction text — compliance (plan match + user request match) — accurately reflects the expanded scope.

pilot/scripts/mcp-server.cjs / worker-service.cjs

These are minified build artifacts. The visible change in mcp-server.cjs is a version bump "7.8.5""7.9.0". No functional issues visible in the diffable portions of the minified output.


Minor observations

  • In spec-bugfix-plan.md, step 4 ends mid-sentence ("→ fetch full result:") leading into step 5 as a code block. Functionally fine but slightly awkward to read — could be rendered as a nested bullet or inline note rather than a dangling sentence.
  • The placeholder text <plan summary/root cause> and <plan summary/goal> in the adversarial prompts correctly signals to the executing AI what to substitute. No issue, just confirm downstream consumers handle this template pattern consistently (they appear to).

Verdict

Approve with one suggestion: add error handling for the result command to match the robustness of the status error path. Everything else looks correct and well-reasoned.

@maxritter maxritter merged commit d0c8494 into main Mar 31, 2026
13 checks passed
@maxritter maxritter deleted the dev branch March 31, 2026 11:29
github-actions bot pushed a commit that referenced this pull request Mar 31, 2026
# [7.10.0](v7.9.0...v7.10.0) (2026-03-31)

### Bug Fixes

* improve Codex CLI invocation and reviewer agent prompting ([#124](#124)) ([d0c8494](d0c8494))
* security audit hardening across console, launcher, and installer ([822c513](822c513))

### Features

* Codex adversarial reviewers, spec sharing, and Console improvements ([#123](#123)) ([55055d9](55055d9))
@github-actions
Copy link
Copy Markdown
Contributor

🎉 This PR is included in version 7.10.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant