| 1 | +--- |
| 2 | +name: autoreview |
| 3 | +description: "Autoreview closeout: local dirty changes, PR branch vs main, parallel tests." |
| 4 | +--- |
| 5 | + |
| 6 | +# Autoreview |
| 7 | + |
| 8 | +Run Codex's built-in code review as a closeout check. This is code review (`codex review`), not Guardian `auto_review` approval routing. |
| 9 | + |
| 10 | +Codex native review mode performs best and is recommended. Non-Codex reviewers are fallback/second-opinion paths that receive a generated diff prompt, not the full Codex review-mode runtime. |
| 11 | + |
| 12 | +Use when: |
| 13 | +- user asks for Codex review / autoreview / second-model review |
| 14 | +- after non-trivial code edits, before final/commit/ship |
| 15 | +- reviewing a local branch or PR branch after fixes |
| 16 | + |
| 17 | +## Contract |
| 18 | + |
| 19 | +- Treat review output as advisory. Never blindly apply it. |
| 20 | +- Verify every finding by reading the real code path and adjacent files. |
| 21 | +- Read dependency docs/source/types when the finding depends on external behavior. |
| 22 | +- Reject unrealistic edge cases, speculative risks, broad rewrites, and fixes that over-complicate the codebase. |
| 23 | +- Prefer small fixes at the right ownership boundary; no refactor unless it clearly improves the bug class. |
| 24 | +- Keep going until the selected review path returns no accepted/actionable findings. |
| 25 | +- If a review-triggered fix changes code, rerun focused tests and rerun the review helper. |
| 26 | +- Default to Codex review. If Codex is unavailable or exits with an error, the helper falls back to the first configured CLI from `claude -p`, `pi -p`, `opencode run`, `droid exec`, or `copilot`. Prefer Codex for final closeout because it uses native review mode; non-Codex reviewers use a Codex-inspired generated diff prompt. The helper runs nested Codex review in yolo/full-access mode by default; use `--no-yolo` only when intentionally testing sandbox behavior. |
| 27 | +- Stop as soon as the review command/helper exits 0 with no accepted/actionable findings. Do not run an extra direct `codex review` just to get a nicer "clean" line, a second opinion, or clearer closeout wording. |
| 28 | +- Treat the helper's successful exit plus absence of actionable findings as the clean review result, even if the underlying Codex CLI output is terse. |
| 29 | +- If rejecting a finding as intentional/not worth fixing, add a brief inline code comment only when it explains a real invariant or ownership decision that future reviewers should know. |
| 30 | +- Do not push just to review. Push only when the user requested push/ship/PR update. |
| 31 | +- For OpenClaw maintainers, keep autoreview validation Crabbox/Testbox-aware when maintainer validation mode is enabled (`OPENCLAW_TESTBOX=1` or `AUTOREVIEW_OPENCLAW_MAINTAINER_VALIDATION=1`). A review pass may inspect files and run cheap non-Node probes, but it must not start local `pnpm`, Vitest, `tsgo`, `npm test`, or `node scripts/run-vitest.mjs` from a Codex/worktree review unless the operator explicitly requested local proof. For runtime proof, use existing evidence or route through Crabbox/Testbox and report the id. Do not apply this rule to ordinary contributors who do not have maintainer Testbox access. |
| 32 | + |
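The fallback order in the contract above can be sketched as a simple PATH probe. This is a minimal sketch and not the bundled helper's code: `first_available_reviewer` is a hypothetical name, and treating "configured" as "found on PATH" is an assumption.

```bash
# Hypothetical sketch: print the first candidate CLI that exists on PATH.
# Assumption: "configured" is approximated by "executable is on PATH".
first_available_reviewer() {
  local candidate
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  # No candidate found: the caller decides how to report this.
  return 1
}
```

Called as `first_available_reviewer claude pi opencode droid copilot`, it would print whichever of those binaries is found first. Per the contract, this path applies only when Codex is unavailable or exits with an error, not when Codex reports findings.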
| 33 | +## Pick Target |
| 34 | + |
| 35 | +Dirty local work: |
| 36 | + |
| 37 | +```bash |
| 38 | +codex review --uncommitted |
| 39 | +``` |
| 40 | + |
| 41 | +Use this only when the patch is actually unstaged/staged/untracked in the |
| 42 | +current checkout. For committed, pushed, or PR work, point Codex at the commit |
| 43 | +or branch diff instead; do not force `--mode local` / `--uncommitted` just |
| 44 | +because the helper docs mention dirty work first. A clean `--uncommitted` review |
| 45 | +only proves there is no local patch. |
| 46 | + |
| 47 | +Branch/PR work: |
| 48 | + |
| 49 | +```bash |
| 50 | +git fetch origin |
| 51 | +codex review --base origin/main |
| 52 | +``` |
| 53 | + |
| 54 | +Do not pass any prompt with `--base`. Some Codex CLI versions reject both inline |
| 55 | +and stdin prompt forms, including the helper's `codex review --base <ref> -`, |
| 56 | +with `--base <BRANCH> cannot be used with [PROMPT]`. If the helper hits this |
| 57 | +error, run plain `codex review --base <ref>` and report that the helper prompt |
| 58 | +injection was skipped. |
| 59 | + |
| 60 | +If an open PR exists, use its actual base: |
| 61 | + |
| 62 | +```bash |
| 63 | +base=$(gh pr view --json baseRefName --jq .baseRefName) |
| 64 | +codex review --base "origin/$base" |
| 65 | +``` |
| 66 | + |
| 67 | +Committed single change: |
| 68 | + |
| 69 | +```bash |
| 70 | +codex review --commit HEAD |
| 71 | +``` |
| 72 | + |
| 73 | +or with the helper: |
| 74 | + |
| 75 | +```bash |
| 76 | +.agents/skills/autoreview/scripts/autoreview --mode commit --commit HEAD |
| 77 | +``` |
| 78 | + |
| 79 | +Use commit review for already-landed or already-pushed work on `main`. Reviewing |
| 80 | +clean `main` against `origin/main` is usually an empty diff after push. For a |
| 81 | +small stack, review each commit explicitly or review the branch before merging |
| 82 | +with `--base`. |
| 83 | + |
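The target choice described in this section can be sketched as a small decision function. This is an illustrative sketch, not the helper's actual logic: `pick_review_args` and its three arguments are hypothetical, and falling through to `--commit HEAD` on a clean `main` is an assumption drawn from the commit-review guidance above. Only the `codex review` flags come from the text.

```bash
# Hypothetical sketch: map checkout state to `codex review` arguments.
# Arguments (all illustrative):
#   $1  "dirty" if there are staged/unstaged/untracked changes, else "clean"
#   $2  base branch of the open PR, or "" when no PR exists
#   $3  current branch name
pick_review_args() {
  local state=$1 pr_base=$2 branch=$3
  if [ "$state" = "dirty" ]; then
    # A local patch exists, so review it directly.
    printf '%s\n' "--uncommitted"
  elif [ -n "$pr_base" ]; then
    # An open PR exists: diff against its actual base.
    printf '%s\n' "--base origin/$pr_base"
  elif [ "$branch" != "main" ]; then
    # Branch work without a PR: diff against origin/main.
    printf '%s\n' "--base origin/main"
  else
    # Clean main: a branch diff would be empty, so review the commit.
    printf '%s\n' "--commit HEAD"
  fi
}
```

In a real checkout the inputs might come from `git status --porcelain`, `gh pr view --json baseRefName --jq .baseRefName`, and `git branch --show-current`.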
| 84 | +## Parallel Closeout |
| 85 | + |
| 86 | +Format first if formatting can change line locations. Then it is OK to run tests and review in parallel: |
| 87 | + |
| 88 | +```bash |
| 89 | +.agents/skills/autoreview/scripts/autoreview --parallel-tests "<focused test command>" |
| 90 | +``` |
| 91 | + |
| 92 | +Tradeoff: tests may force code changes that make the review stale. If tests or review lead to code edits, rerun the affected tests and rerun review until no accepted/actionable findings remain. Once that rerun exits cleanly, stop; do not spend another long review cycle on redundant confirmation. |
| 93 | + |
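Running the review and the focused tests side by side can be sketched with background jobs and `wait`. This is a minimal sketch that assumes both commands are plain shell strings. `run_parallel` is a hypothetical name, and the bundled `--parallel-tests` flag is not necessarily implemented this way.

```bash
# Hypothetical sketch: run a review command and a test command concurrently
# and report both exit codes. Succeeds only if both commands succeed.
run_parallel() {
  local review_cmd=$1 test_cmd=$2
  local review_pid test_pid
  local review_status=0 test_status=0

  sh -c "$review_cmd" &
  review_pid=$!
  sh -c "$test_cmd" &
  test_pid=$!

  # `wait <pid>` returns that job's exit status; capture it without aborting.
  wait "$review_pid" || review_status=$?
  wait "$test_pid" || test_status=$?

  printf 'review=%s tests=%s\n' "$review_status" "$test_status"
  [ "$review_status" -eq 0 ] && [ "$test_status" -eq 0 ]
}
```

A call might look like `run_parallel "codex review --uncommitted" "<focused test command>"`. If either side leads to a code edit, both should be rerun, as the tradeoff above notes.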
| 94 | +## Context Efficiency |
| 95 | + |
| 96 | +Codex review is usually noisy. Default to a subagent filter when subagents are available. Ask it to run the review and return only: |
| 97 | +- actionable findings it accepts |
| 98 | +- findings it rejects, with a one-line reason |
| 99 | +- exact files/tests to rerun |
| 100 | + |
| 101 | +Run inline only for tiny changes or when subagents are unavailable. |
| 102 | + |
| 103 | +## Helper |
| 104 | + |
| 105 | +Bundled helper: |
| 106 | + |
| 107 | +```bash |
| 108 | +.agents/skills/autoreview/scripts/autoreview --help |
| 109 | +``` |
| 110 | + |
| 111 | +The helper: |
| 112 | +- chooses dirty `--uncommitted` first |
| 113 | +- otherwise uses current PR base if `gh pr view` works |
| 114 | +- otherwise uses `origin/main` for non-main branches |
| 115 | +- auto-runs `PNPM_CONFIG_PM_ON_FAIL=ignore PNPM_CONFIG_VERIFY_DEPS_BEFORE_RUN=false PNPM_CONFIG_OFFLINE=true pnpm run check` in parallel when a repo has `package.json`, `pnpm-lock.yaml`, `node_modules`, and a `check` script; disable with `AUTOREVIEW_AUTO_TESTS=0` |
| 116 | +- takes `--mode commit --commit <ref>` for already-committed work, especially clean `main` after landing |
| 117 | +- should be left in `--mode auto` or forced to `--mode branch` for PR/branch work; do not force `--mode local` after committing |
| 118 | +- supports `--reviewer codex|claude|pi|opencode|droid|copilot|auto`; `auto` means Codex first |
| 119 | +- supports `--fallback-reviewer auto|claude|pi|opencode|droid|copilot|none`; default is configured CLI fallback |
| 120 | +- falls back only when Codex is unavailable or exits nonzero, not when Codex reports findings |
| 121 | +- writes only to stdout unless `--output` or `AUTOREVIEW_OUTPUT` is set |
| 122 | +- supports `--dry-run`, `--parallel-tests`, and commit refs |
| 123 | +- runs nested review with `--dangerously-bypass-approvals-and-sandbox --sandbox danger-full-access` by default |
| 124 | +- injects maintainer-only OpenClaw validation policy into native Codex review when `OPENCLAW_TESTBOX=1` or `AUTOREVIEW_OPENCLAW_MAINTAINER_VALIDATION=1`, so local memory-heavy Node/Vitest checks are avoided in favor of Crabbox/Testbox proof |
| 125 | +- may fail in branch mode on Codex CLI versions that reject `--base` plus the helper's stdin prompt; on that exact parser error, rerun plain `codex review --base <ref>` instead of falling back to a non-Codex reviewer |
| 126 | +- keeps accepting `--full-access`; use `--no-yolo` or `AUTOREVIEW_YOLO=0` to opt out |
| 127 | +- still accepts legacy `CODEX_REVIEW_*` env vars when the matching `AUTOREVIEW_*` var is unset |
| 128 | +- prints `autoreview clean: no accepted/actionable findings reported` when the selected review command exits 0 |
| 129 | + |
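The legacy environment variable fallback listed above can be sketched with nested parameter expansion. This is a sketch under assumptions: `resolve_output_path` is a hypothetical function, and `CODEX_REVIEW_OUTPUT` is assumed to be the legacy counterpart of `AUTOREVIEW_OUTPUT` by the `CODEX_REVIEW_*` naming pattern. The helper's real variable list may differ.

```bash
# Hypothetical sketch: prefer AUTOREVIEW_OUTPUT, fall back to the assumed
# legacy CODEX_REVIEW_OUTPUT only when the new variable is unset.
resolve_output_path() {
  # `-` (rather than `:-`) falls through only when the variable is unset,
  # so an explicitly empty AUTOREVIEW_OUTPUT still wins over the legacy name.
  printf '%s\n' "${AUTOREVIEW_OUTPUT-${CODEX_REVIEW_OUTPUT-}}"
}
```

An empty result here corresponds to the stdout-only behavior described in the list above.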
| 130 | +## Final Report |
| 131 | + |
| 132 | +Include: |
| 133 | +- review command used |
| 134 | +- tests/proof run |
| 135 | +- findings accepted/rejected, briefly why |
| 136 | +- the clean review result from the final helper/review run, or why a remaining finding was consciously rejected |
| 137 | + |
| 138 | +Do not run another Codex review solely to improve the final report wording. If the final helper run exited 0 and produced no accepted/actionable findings, report that exact run as clean. |
| 139 | + |
| 140 | +## PR / CI Closeout |
| 141 | + |
| 142 | +- Prefer direct run/job APIs after CI starts: `gh run view <run-id> --json jobs`; use PR rollup only for final mergeability. |
| 143 | +- After rebase, compare `origin/main..HEAD`; drop CI-fix commits already upstream before pushing. |
| 144 | +- For prompt snapshot CI failures, prove/generate with Linux Node 24 before rerunning the failed job. |
| 145 | +- Update PR body once near the final head unless proof labels are missing or stale enough to block CI. |