[codex] Add outcome fallback runtime contracts #71038
100yenadmin wants to merge 5 commits into openclaw:main
Greptile Summary
This PR adds three test-only files that lock the outcome/fallback runtime contract: shared fixture helpers, Pi fallback classifier tests, and Codex app-server adapter tests. No production code is changed. The tests accurately reflect the classifier and projector behavior they target.
Confidence Score: 5/5
Safe to merge: no production code is changed and all findings are P2.
The single finding is a P2 style/maintainability issue in a test helper: the meta-merge ordering in `createContractRunResult` makes the `durationMs` default dead code. All current tests pass correctly because every call site explicitly provides `durationMs`. No P0/P1 issues were found.
test/helpers/agents/outcome-fallback-runtime-contract.ts: minor meta-merge ordering issue in the factory helper.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: test/helpers/agents/outcome-fallback-runtime-contract.ts
Line: 17-29
Comment:
The inner `...overrides.meta` spread is overwritten by the outer `...overrides` later in the same object literal, so the `durationMs: 1` default inside it is dead code when `meta` is passed. The current tests happen to work because every caller explicitly provides `durationMs: 1` in the override, but the intended deep-merge of meta defaults never actually fires. Destructuring the override avoids the ordering conflict and makes the default meaningful.
```suggestion
  const { meta: metaOverride, ...restOverrides } = overrides;
  return {
    payloads: [],
    didSendViaMessagingTool: false,
    messagingToolSentTexts: [],
    messagingToolSentMediaUrls: [],
    messagingToolSentTargets: [],
    successfulCronAdds: 0,
    ...restOverrides,
    meta: {
      durationMs: 1,
      ...metaOverride,
    },
  };
```
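The ordering problem described in this comment can be reproduced in isolation. The sketch below is not the helper from the PR: `RunMeta`, `RunResult`, `RunResultOverrides`, and both factory functions are hypothetical, simplified stand-ins, and the first factory assumes the original helper places `...overrides` after the merged `meta`, as the comment states.

```typescript
// Hypothetical, simplified shapes; the real types in the PR are not shown here.
interface RunMeta {
  durationMs: number;
  aborted?: boolean;
}

interface RunResult {
  payloads: string[];
  meta: RunMeta;
}

type RunResultOverrides = Partial<Omit<RunResult, "meta">> & {
  meta?: Partial<RunMeta>;
};

// Assumed original ordering: the outer spread comes last, so a caller-supplied
// `meta` replaces the merged object wholesale and the default is dropped.
function createResultOuterSpreadLast(overrides: RunResultOverrides = {}): RunResult {
  return {
    payloads: [],
    meta: { durationMs: 1, ...overrides.meta },
    ...overrides,
  } as RunResult; // cast needed because `meta` may end up partial
}

// Ordering from the suggestion: `meta` is pulled out first, then merged last,
// so the default survives unless the caller overrides it explicitly.
function createResultMetaMerged(overrides: RunResultOverrides = {}): RunResult {
  const { meta: metaOverride, ...restOverrides } = overrides;
  return {
    payloads: [],
    ...restOverrides,
    meta: { durationMs: 1, ...metaOverride },
  };
}

const lost = createResultOuterSpreadLast({ meta: { aborted: true } });
const kept = createResultMetaMerged({ meta: { aborted: true } });

console.log(lost.meta.durationMs); // undefined
console.log(kept.meta.durationMs); // 1
```

With the destructured form, a caller that passes only part of `meta` still receives the `durationMs` default, which is the deep-merge behavior the comment says was intended.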
How can I resolve this? If you propose a fix, please make it concise.
Reviews (1): Last reviewed commit: "test: add outcome fallback runtime contr..."
Pull request overview
Adds contract tests that lock down outcome classification and fallback behavior for GPT-5 terminal results across the Pi fallback classifier and the Codex app-server adapter (per RFC #71004).
Changes:
- Introduces a shared test fixture for constructing fallback configs and embedded run results for outcome/fallback contract coverage.
- Adds Pi-side tests asserting harness classifications (`empty`, `reasoning-only`, `planning-only`) map to `format` fallback codes and advance the fallback chain when appropriate.
- Adds Codex app-server projector tests asserting raw terminal state (`empty` turn, exact `NO_REPLY`, tool side-effect telemetry) is preserved for OpenClaw-owned classification.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| test/helpers/agents/outcome-fallback-runtime-contract.ts | Adds shared constants + helpers for outcome/fallback contract tests. |
| src/agents/outcome-fallback-runtime-contract.test.ts | Adds Pi classifier + runWithModelFallback contract assertions. |
| extensions/codex/src/app-server/outcome-fallback-runtime-contract.test.ts | Adds Codex adapter contract assertions about preserved terminal state + telemetry. |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 09b180f27b
Summary
Adds the outcome/fallback contract rung from RFC #71004. This is test-only: it locks how GPT-5 terminal outcomes are classified for fallback and how the Codex app-server adapter preserves terminal state for OpenClaw-owned classification.
No production runtime behavior changes in this PR.
Files Changed And Why
- `test/helpers/agents/outcome-fallback-runtime-contract.ts`
- `src/agents/outcome-fallback-runtime-contract.test.ts`
- `extensions/codex/src/app-server/outcome-fallback-runtime-contract.test.ts`
Contract Matrix
- `empty`, `reasoning-only`, and `planning-only` classifications map to `format` fallback codes.
- `NO_REPLY`, visible replies, aborts, block replies, and tool side effects do not trigger fallback.
- `NO_REPLY` remains an intentional silent terminal reply.
Important Boundary
This PR does not claim the current Codex harness already supplies `classify()`. It proves:
How This Helps The RuntimePlan Work
The plan should own outcome classification and fallback semantics once. These tests prevent the GPT-5.4 stall class from reappearing when Pi/Codex behavior is migrated into that shared policy.
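The contract matrix above can be read as a single decision function. The sketch below shows one way a shared policy might encode it. Every name here (`TerminalOutcome`, `FallbackDecision`, `decideFallback`, and the field names) is hypothetical and not part of the OpenClaw runtime, and the precedence between the rules is an assumption made for illustration.

```typescript
// Hypothetical types for illustration only; not the OpenClaw runtime API.
type HarnessClassification = "empty" | "reasoning-only" | "planning-only";

interface TerminalOutcome {
  classification?: HarnessClassification; // set when the harness flags the turn
  replyText?: string; // visible reply text, if any
  aborted?: boolean;
  sentBlockReply?: boolean;
  hadToolSideEffects?: boolean;
}

type FallbackDecision =
  | { fallback: true; code: "format" }
  | { fallback: false };

const SILENT_REPLY = "NO_REPLY";

function decideFallback(outcome: TerminalOutcome): FallbackDecision {
  // Aborts, block replies, and tool side effects do not trigger fallback.
  if (outcome.aborted || outcome.sentBlockReply || outcome.hadToolSideEffects) {
    return { fallback: false };
  }
  // An exact NO_REPLY is an intentional silent terminal reply.
  if (outcome.replyText === SILENT_REPLY) {
    return { fallback: false };
  }
  // Any other visible reply is a normal terminal result.
  if (outcome.replyText !== undefined && outcome.replyText.length > 0) {
    return { fallback: false };
  }
  // empty, reasoning-only, and planning-only all map to the format code.
  if (outcome.classification !== undefined) {
    return { fallback: true, code: "format" };
  }
  return { fallback: false };
}

console.log(decideFallback({ classification: "reasoning-only" }));
console.log(decideFallback({ replyText: "NO_REPLY" }));
```

Keeping the rules in one function is what lets a single set of contract tests cover both the Pi and Codex paths once they share the policy.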
Reviewer Notes
- `classify()` implementation in this PR.
- `NO_REPLY`/side-effect non-fallback behavior.
Verification
- `node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/outcome-fallback-runtime-contract.test.ts` (16 passed)
- `node scripts/run-vitest.mjs run --config test/vitest/vitest.extensions.config.ts extensions/codex/src/app-server/outcome-fallback-runtime-contract.test.ts` (10 passed)
- `./node_modules/.bin/oxlint --tsconfig tsconfig.oxlint.core.json test/helpers/agents/outcome-fallback-runtime-contract.ts src/agents/outcome-fallback-runtime-contract.test.ts extensions/codex/src/app-server/outcome-fallback-runtime-contract.test.ts`
- `git diff --check -- test/helpers/agents/outcome-fallback-runtime-contract.ts src/agents/outcome-fallback-runtime-contract.test.ts extensions/codex/src/app-server/outcome-fallback-runtime-contract.test.ts`
Refs #71004
Follows #71009
Follows #71029