agents: GPT-5.4 runtime completion rollup by 100yenadmin · Pull Request #65219 · openclaw/openclaw

100yenadmin · 2026-04-12T06:57:53Z

Summary

Auto-activates the strict-agentic execution contract for unconfigured GPT-5 openai / openai-codex runs so the behavioral improvements from the parity program work out of the box. Previously users had to manually set agents.defaults.embeddedPi.executionContract: "strict-agentic" — now it just works.

Part of #64227. See the umbrella for how this fits with #65224 (test proof) and #65257 (behavioral fix).

What changed

src/agents/execution-contract.ts

- `resolveEffectiveExecutionContract()` auto-activates strict-agentic when the provider is openai or openai-codex and the model matches `gpt-5*`. Explicit `"default"` opt-out is honored.
- `stripProviderPrefix()` handles prefixed model IDs (`openai/gpt-5.4`, `openai:gpt-5.4`).
- `STRICT_AGENTIC_MODEL_ID_PATTERN` covers gpt-5, gpt-5.4, gpt-5o, gpt-5-preview, gpt-5-turbo, and date-suffixed variants like gpt-5-2025-03.
- Explicit `"blocked"` liveness state at the strict-agentic blocked exit.
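The resolution rules above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function and constant names come from the PR text, the regex is the one suggested in the review thread, and the parameter shape (`provider`, `modelId`, `configured`), the `SUPPORTED_PROVIDERS` set, and every function body are assumptions.

```typescript
// Illustrative sketch only. Names mirror the PR description; bodies and the
// parameter shape are assumptions, not the merged implementation.
type ExecutionContract = "default" | "strict-agentic";

// Pattern as suggested in the review thread on this PR.
const STRICT_AGENTIC_MODEL_ID_PATTERN = /^gpt-5(?:[.o-]|$)/i;

// Assumed helper set; the PR only names the two providers.
const SUPPORTED_PROVIDERS = new Set(["openai", "openai-codex"]);

// Drops a leading "provider/" or "provider:" prefix, e.g. "openai/gpt-5.4" -> "gpt-5.4".
function stripProviderPrefix(modelId: string): string {
  const trimmed = modelId.trim();
  const match = /^[^/:]+[/:](.+)$/.exec(trimmed);
  return match ? match[1] : trimmed;
}

function isStrictAgenticSupportedProviderModel(provider: string, modelId: string): boolean {
  return (
    SUPPORTED_PROVIDERS.has(provider.trim().toLowerCase()) &&
    STRICT_AGENTIC_MODEL_ID_PATTERN.test(stripProviderPrefix(modelId))
  );
}

function resolveEffectiveExecutionContract(params: {
  provider: string;
  modelId: string;
  configured?: ExecutionContract; // undefined means the user never set it
}): ExecutionContract {
  // Unsupported provider/model always collapses to "default".
  if (!isStrictAgenticSupportedProviderModel(params.provider, params.modelId)) {
    return "default";
  }
  // Supported lane: only an explicit "default" opts out.
  return params.configured === "default" ? "default" : "strict-agentic";
}
```

Under these assumptions, `resolveEffectiveExecutionContract({ provider: "openai", modelId: "openai/gpt-5.4" })` resolves to `"strict-agentic"`, while an ID such as `gpt-50` does not match, because the pattern requires `.`, `o`, `-`, or end of string right after `gpt-5`.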

How this PR relates to the others

- #65224 (agents: GPT-5.4 parity proof rollup) tests the activated contract through the 10-scenario parity pack.
- #65257 (agents: strengthen GPT-5.4 execution bias and close the one-action-then-narrative loophole) builds on the activated contract to fix the behavioral gaps (prompt + detection + continuation loop).
- This PR is the foundation: without auto-activation, the other two don't fire for unconfigured GPT-5 users.

Review status

- Hardening pass complete on current head.
- Unresolved review-thread count: 0.
- Targeted runtime validation remains green on the latest commit.

100yenadmin · 2026-04-12T06:58:09Z

Current verification on this rollup head:

- branch-owned runtime suite passed: 44/44
- the merged runtime+proof integration branch passed the new `instruction-followthrough-repo-contract` scenario on the real QA harness
- that scenario passing without any extra runtime patch is the important signal here: the H behavior is enough to satisfy the original AGENT.md / SOUL.md followthrough proof when paired with the proof rollup

This PR supersedes #64679 and is intended to be the single runtime closeout PR for the GPT-5.4 work.

Copilot

Pull request overview

This PR rolls up the runtime-side completion fixes for the GPT-5.4 / Codex parity program by unifying GPT‑5 family detection, auto-activating the strict-agentic execution contract for supported unconfigured runs, and ensuring strict-agentic blocked exits emit explicit liveness/replay metadata.

Changes:

- Introduces a shared GPT‑5 family + provider matcher and uses it for both strict-agentic activation and the planning-only retry guard.
- Adds explicit `livenessState` / `replayInvalid` terminal metadata at the strict-agentic blocked exit path.
- Expands/updates tests to cover auto-activation, opt-out behavior, prefixed model IDs, and blocked-exit metadata.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `src/agents/pi-embedded-runner/run/incomplete-turn.ts` | Uses the shared strict-agentic supported provider/model matcher for planning-only retry gating. |
| `src/agents/pi-embedded-runner/run.ts` | Threads `livenessState` + `replayInvalid` through the strict-agentic blocked terminal return and sets terminal lifecycle meta. |
| `src/agents/pi-embedded-runner/run.incomplete-turn.test.ts` | Adds regression tests for blocked-exit metadata, auto-activation behavior, opt-out behavior, and broadened model id handling. |
| `src/agents/openclaw-tools.update-plan.test.ts` | Updates tests to assert update_plan auto-enables for unconfigured GPT‑5 OpenAI runs and respects explicit opt-out. |
| `src/agents/execution-contract.ts` | Adds `resolveEffectiveExecutionContract()` and shared matcher logic, including support for `provider/` and `provider:` prefixed model IDs. |
| `src/agents/execution-contract.test.ts` | New unit tests for effective contract resolution and supported-lane detection. |

greptile-apps · 2026-04-12T07:02:37Z

Greptile Summary

This PR introduces auto-activation of the strict-agentic execution contract for unconfigured GPT-5/openai runs, adds explicit replayInvalid + livenessState: "blocked" metadata to the strict-agentic blocked exit, and aligns the planning-only retry guard with a shared isStrictAgenticSupportedProviderModel helper so the activation check and the retry-guard check can't silently drift apart. The core logic — provider prefix stripping, the GPT-5-family regex, opt-out via executionContract: "default", and the 2-retry/blocked-exit vs. 1-retry/fall-through split — is consistent and well-exercised by the accompanying tests.
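The 2-retry/blocked-exit versus 1-retry/fall-through split mentioned in this summary can be pictured with a small decision function. This is a hedged sketch: only the retry counts and the two end states come from the PR text, while `PLANNING_ONLY_RETRY_LIMIT`, `decidePlanningOnlyOutcome`, and the outcome labels are hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch of the planning-only retry split described above.
// Only the retry counts (1 vs 2) and the end states come from the PR text.
type ExecutionContract = "default" | "strict-agentic";
type PlanningOnlyOutcome = "retry" | "blocked-exit" | "fall-through";

const PLANNING_ONLY_RETRY_LIMIT: Record<ExecutionContract, number> = {
  "default": 1,
  "strict-agentic": 2,
};

// Decides what happens after a turn that produced only a plan and no action.
function decidePlanningOnlyOutcome(
  contract: ExecutionContract,
  retriesUsed: number,
): PlanningOnlyOutcome {
  if (retriesUsed < PLANNING_ONLY_RETRY_LIMIT[contract]) {
    return "retry";
  }
  // Budget exhausted: strict-agentic surfaces an explicit blocked state,
  // the default contract falls through to the normal completion path.
  return contract === "strict-agentic" ? "blocked-exit" : "fall-through";
}
```

With this shape, a default-contract run that has used one retry falls through with the plan-only text, whereas a strict-agentic run gets a second retry before exiting as blocked.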

Confidence Score: 5/5

Safe to merge; all findings are P2 style observations with no correctness impact.

Auto-activation logic, opt-out path, blocked-exit metadata, and regex matching are all correct and covered by targeted tests. No P0/P1 issues found.

No files require special attention.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/agents/execution-contract.ts
Line: 35

Comment:
**Regex character class: `\-` is correct but non-idiomatic**

`[.\-o]` works correctly — the backslash-escaped hyphen avoids being interpreted as the range `.-o` (ASCII 46–111). The conventional way to express a literal hyphen in a character class is to place it first or last: `[-o.]` or `[.o-]`. Either avoids the escape without changing the semantics and is less likely to confuse future readers or trigger a linter's `no-useless-escape` rule.

```suggestion
const STRICT_AGENTIC_MODEL_ID_PATTERN = /^gpt-5(?:[.o-]|$)/i;
```

How can I resolve this? If you propose a fix, please make it concise.
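To make the character-class point concrete, the following sketch compares the escaped form, the suggested trailing-hyphen form, and the accidental range that an unescaped middle hyphen would create. The three constant names are hypothetical; only the first two patterns appear in this thread.

```typescript
// Escaped hyphen (the form under review) and trailing hyphen (the suggestion)
// describe the same set of characters: ".", "o", and "-".
const escapedHyphen = /^gpt-5(?:[.\-o]|$)/i;
const trailingHyphen = /^gpt-5(?:[.o-]|$)/i;

// An unescaped hyphen between "." and "o" becomes the range U+002E..U+006F,
// which includes digits and most letters but not "-" (U+002D).
const accidentalRange = /^gpt-5(?:[.-o]|$)/i;

const samples = ["gpt-5", "gpt-5.4", "gpt-5o", "gpt-5-preview", "gpt-50"];
for (const id of samples) {
  console.log(id, escapedHyphen.test(id), trailingHyphen.test(id), accidentalRange.test(id));
}
```

The first two patterns agree on every sample. The accidental range diverges in both directions: it rejects `gpt-5-preview` because `-` falls outside the range, and it accepts `gpt-50` because `0` falls inside it.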

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.

…liveness

Closes two hard blockers on the GPT-5.4 parity completion gate:

1) Criterion 1 (no stalls after planning) is universal, but the pre-existing strict-agentic execution contract was opt-in only. Out-of-the-box GPT-5 openai / openai-codex users who never set `agents.defaults.embeddedPi.executionContract` still got only 1 planning-only retry and then fell through to the normal completion path with the plan-only text, i.e. they still stalled.

Introduce `resolveEffectiveExecutionContract(...)` in src/agents/execution-contract.ts. Behavior:

- supported provider/model (openai or openai-codex + gpt-5-family) AND explicit "strict-agentic" or unspecified → "strict-agentic"
- supported provider/model AND explicit "default" → "default" (opt-out)
- unsupported provider/model → "default" regardless of explicit value

`isStrictAgenticExecutionContractActive` now delegates to the effective resolver so the 2-retry + blocked-state treatment applies by default to every GPT-5 openai/codex run. Explicit opt-out still works for users who intentionally want the pre-parity-program behavior.

2) Criterion 4 (replay/liveness failures are explicit, not silent disappearance) is violated by the strict-agentic blocked exit itself. Every other terminal return path in src/agents/pi-embedded-runner/run.ts sets `replayInvalid` + `livenessState` via `setTerminalLifecycleMeta`, but the strict-agentic exit at run.ts:1615 falls through without them. Add explicit `livenessState: "abandoned"` + `replayInvalid` (via the shared `resolveReplayInvalidForAttempt` helper) to that exit, plus a `setTerminalLifecycleMeta` call so downstream observers (lifecycle log, ACP bridge, telemetry) see the same explicit terminal state they see on every other exit branch.

Regressions added:

- `auto-enables update_plan for unconfigured GPT-5 openai runs`
- `respects explicit default contract opt-out on GPT-5 runs`
- `does not auto-enable update_plan for non-openai providers even when unconfigured`
- `emits explicit replayInvalid + abandoned liveness state at the strict-agentic blocked exit`
- `auto-activates strict-agentic for unconfigured GPT-5 openai runs and surfaces the blocked state`
- `respects explicit default contract opt-out on GPT-5 openai runs`

Local validation:

- pnpm test src/agents/openclaw-tools.update-plan.test.ts src/agents/pi-embedded-runner/run.incomplete-turn.test.ts src/agents/pi-embedded-runner.buildembeddedsandboxinfo.test.ts src/agents/system-prompt.test.ts src/agents/openclaw-tools.sessions.test.ts src/agents/pi-embedded-runner/run.overflow-compaction.test.ts

122/122 passing.

Refs openclaw#64227

Triages all three loop-6 review comments on PR openclaw#64679:

1. Copilot: 'The strict-agentic blocked exit returns an error payload (isError: true) but sets livenessState to "abandoned". Elsewhere in the runner/lifecycle flow, error terminal states are treated as "blocked".'

   Verified: every other hardcoded error terminal branch in run.ts (role ordering at 1152, image size at 1206, schema error at 1244, compaction timeout at 1128, aborted-with-no-payloads at 606) uses livenessState: "blocked". Match that convention at the strict-agentic blocked exit at 1634. Updated the 'emits explicit replayInvalid + abandoned liveness state' regression test to assert the new "blocked" value and renamed the assertion commentary.

2. Copilot: 'The JSDoc for resolveEffectiveExecutionContract says explicit "strict-agentic" in config always resolves to "strict-agentic", but the implementation collapses to "default" whenever the provider/mode is unsupported.'

   Rewrite the JSDoc to explicitly document the unsupported-provider collapse as the lead case (strict-agentic is a GPT-5-family openai/openai-codex-only runtime contract) before listing the supported-lane behavior matrix. No code change; this is a docstring-only clarification.

3. Greptile P2: 'Non-preferred Anthropic model constant. CLAUDE.md says to prefer sonnet-4.6 for Anthropic test constants.'

   Swap claude-opus-4-6 → claude-sonnet-4-6 in the two update_plan gating fixtures that assert non-openai providers don't auto-enable the planning tool. Behavior unchanged; model constant now matches repo testing guidance.

Local validation:

- pnpm test src/agents/openclaw-tools.update-plan.test.ts src/agents/pi-embedded-runner/run.incomplete-turn.test.ts

29/29 passing.

Refs openclaw#64227
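The convention settled here (an error terminal return carries `livenessState: "blocked"` plus `replayInvalid`, and reports the same metadata to lifecycle observers) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the type shapes, the boolean type of `replayInvalid`, the callback-style `setTerminalLifecycleMeta`, and the `buildStrictAgenticBlockedExit` helper are not taken from run.ts.

```typescript
// Hypothetical shapes; only the field names livenessState / replayInvalid /
// isError and the "blocked" and "abandoned" values appear in this PR.
type LivenessState = "blocked" | "abandoned";

interface TerminalLifecycleMeta {
  livenessState: LivenessState;
  replayInvalid: boolean; // assumed to be a boolean flag
}

interface TerminalResult {
  isError: boolean;
  text: string;
  meta: TerminalLifecycleMeta;
}

// Builds the blocked-exit result and reports the same metadata to observers,
// so the return value and the lifecycle log cannot disagree.
function buildStrictAgenticBlockedExit(params: {
  text: string;
  replayInvalid: boolean;
  setTerminalLifecycleMeta: (meta: TerminalLifecycleMeta) => void;
}): TerminalResult {
  const meta: TerminalLifecycleMeta = {
    livenessState: "blocked", // error terminal states use "blocked", not "abandoned"
    replayInvalid: params.replayInvalid,
  };
  params.setTerminalLifecycleMeta(meta);
  return { isError: true, text: params.text, meta };
}
```

Building the metadata once and passing the same object to both the return value and the observer callback is one way to keep the two from drifting; the actual runner may thread these values differently.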

… blocked state

Addresses loop-7 Copilot finding on PR openclaw#64679: loop 6 changed the assertion to livenessState === 'blocked' to match the rest of the hard-error terminal branches in run.ts, but the test title still said 'abandoned liveness state', which made failures and test output misleading.

Rename the test title to match the asserted value. No code change beyond the it(...) title.

Validation: pnpm test src/agents/pi-embedded-runner/run.incomplete-turn.test.ts (19/19 pass).

Refs openclaw#64227

…ariant GPT-5 model ids

* agents: auto-activate strict-agentic for GPT-5 and emit blocked-exit liveness
* agents: address loop-6 review comments on strict-agentic contract
* test: rename strict-agentic blocked-exit liveness regression to match blocked state
* agents: widen strict-agentic auto-activation to handle prefixed and variant GPT-5 model ids
* Align strict-agentic retry matching
* runtime: harden strict-agentic model matching

---------

Co-authored-by: Eva <eva@100yen.org>

Copilot AI review requested due to automatic review settings April 12, 2026 06:57

openclaw-barnacle Bot added the agents Agent runtime and tooling label Apr 12, 2026

This was referenced Apr 12, 2026

GPT-5.4 runtime completion rollup #65217

Closed

GPT-5.4 / Codex agentic runtime parity in OpenClaw #64227

Closed

openclaw-barnacle Bot added the size: L label Apr 12, 2026

Copilot started reviewing on behalf of 100yenadmin April 12, 2026 06:58

Copilot AI reviewed Apr 12, 2026

Comment thread src/agents/execution-contract.ts Outdated

greptile-apps Bot reviewed Apr 12, 2026

Comment thread src/agents/execution-contract.ts Outdated

This was referenced Apr 12, 2026

agents: GPT-5.4 parity proof rollup #65224

Closed

agents: strengthen GPT-5.4 execution bias and close the one-action-then-narrative loophole #65257

Closed

100yenadmin requested a review from Copilot April 12, 2026 10:00

Copilot started reviewing on behalf of 100yenadmin April 12, 2026 10:00

Copilot AI reviewed Apr 12, 2026

100yenadmin requested a review from Copilot April 12, 2026 10:32

Copilot started reviewing on behalf of 100yenadmin April 12, 2026 10:32

Copilot AI reviewed Apr 12, 2026

100yenadmin changed the title ~~GPT-5.4 runtime completion rollup~~ agents: GPT-5.4 runtime completion rollup Apr 12, 2026

Eva added 6 commits April 12, 2026 15:37

agents: widen strict-agentic auto-activation to handle prefixed and variant GPT-5 model ids

bda1092

Align strict-agentic retry matching

5a2177c

runtime: harden strict-agentic model matching

dbfe0a9

pashpashpash force-pushed the rollup/gpt54-runtime-completion branch from bc8da23 to dbfe0a9 Compare April 12, 2026 22:40

pashpashpash merged commit 26945dd into openclaw:main Apr 12, 2026
41 checks passed

github-actions Bot mentioned this pull request Apr 13, 2026

📡 Upstream Digest — 2026-04-13 01:56 UTC curtismercier/openclaw-mods#555

Open

gugu91 mentioned this pull request Apr 15, 2026

research: investigate OpenClaw changes that improved OAI model behavior gugu91/extensions#419

Closed

clawsweeper Bot mentioned this pull request Apr 30, 2026

agents: surface livenessState in user-facing payloads #64887

Open

pashpashpash merged 6 commits into openclaw:main from electricsheephq:rollup/gpt54-runtime-completion