[codex] Harden GPT-5.4 runtime paths by 100yenadmin · Pull Request #70743 · openclaw/openclaw

100yenadmin · 2026-04-23T19:10:59Z

Summary

This PR hardens the GPT-5.4 embedded-agent hot path after auditing v2026.4.22. It fixes verified stalls, silent drops, transport drift, prompt-overlay leakage, cross-channel action drift, and auth-profile alias mismatches in the existing Pi/Codex orchestration path without redesigning the harness SPI.

This is the point-fix PR. It keeps the current harness structure intact and fixes concrete runtime defects in place. The follow-up additive extension-seam work is in #70772.

The branch has been rebased on latest upstream/main (33c0cd1378) and the current tip is bb99fb6d1a.

Runtime Routing Map

Selecting GPT-5.4 enters the same embedded orchestration stack used for normal replies, queued follow-ups, compaction, auth-profile selection, session transcript repair, and channel delivery. openai/* and openai-codex/* still use the built-in Pi/OpenAI path. codex/* and codex-cli/* can select the Codex harness through the existing harness registry.

```mermaid
flowchart TD
  User["User selects model / reply target"] --> AutoReply["auto-reply runner / follow-up runner"]
  AutoReply --> Fallback["runWithModelFallback"]
  Fallback --> Embedded["runEmbeddedPiAgent / runEmbeddedAgent alias"]
  Embedded --> Backend["runEmbeddedAttemptWithBackend"]
  Backend --> Selection["harness selection"]
  Selection -->|openai/*, openai-codex/*| Pi["built-in Pi/OpenAI attempt"]
  Selection -->|codex/*, codex-cli/*| Codex["Codex harness / app-server lifecycle"]
  Pi --> Params["extra params + tool schema shaping"]
  Pi --> Session["session transcript + orphan repair"]
  Pi --> Auth["auth profile / provider alias selection"]
  Pi --> Delivery["visible reply / follow-up delivery"]
  Codex --> Delivery
  Delivery --> Channels["origin channel or visible fallback"]
```
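
The harness-selection step in the routing map can be sketched roughly as follows. This is a minimal illustration, not the registry code: `HarnessKind`, `CODEX_HARNESS_PROVIDERS`, `selectHarnessKind`, and the `codexRegistered` flag are hypothetical names, and the `<provider>/<model>` ref shape is an assumption.

```typescript
// Hypothetical names; the real selection goes through the harness registry.
type HarnessKind = "pi" | "codex";

// Provider prefixes that may select the Codex harness, per the routing map.
const CODEX_HARNESS_PROVIDERS: readonly string[] = ["codex", "codex-cli"];

function selectHarnessKind(modelRef: string, codexRegistered = true): HarnessKind {
  // Assumption: a model ref looks like "<provider>/<model>".
  const slash = modelRef.indexOf("/");
  const provider = slash === -1 ? "" : modelRef.slice(0, slash).toLowerCase();
  const wantsCodex = CODEX_HARNESS_PROVIDERS.includes(provider);
  // Assumption: Pi stays the built-in fallback when no Codex harness is registered.
  return wantsCodex && codexRegistered ? "codex" : "pi";
}

console.log(selectHarnessKind("openai-codex/gpt-5.4")); // "pi"
console.log(selectHarnessKind("codex-cli/gpt-5.4")); // "codex"
```

Note that `openai-codex/*` stays on the Pi path in this sketch even though its name contains "codex"; only the exact provider prefixes matter.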

Failure Classes Fixed

Area	Before	After	Primary files
GPT-5.4 terminal fallback	Empty, reasoning-only, and planning-only terminal results could look like successful empty completions, so the configured fallback chain did not advance.	Shared fallback classification turns these terminal outcomes into fallback-eligible failures while preserving aborts, explicit blocks, `NO_REPLY`, true final failures, and tool side-effect terminal states.	`src/agents/model-fallback.ts`, `src/agents/pi-embedded-runner/result-fallback-classifier.ts`, `src/auto-reply/reply/agent-runner-execution.ts`, `src/auto-reply/reply/followup-runner.ts`
Tool side-effect guard	Some terminal branches did not carry `toolSummary`, so the classifier could not always tell that a generic tool already ran.	`toolSummary` is built once from `attempt.toolMetas` and propagated through timeout, block, reasoning-only, incomplete-turn, and success metadata.	`src/agents/pi-embedded-runner/run.ts`, `src/agents/model-fallback.run-embedded.e2e.test.ts`
OpenAI/Codex transport params	`parallel_tool_calls` was injected for OpenAI Responses/Completions but skipped `openai-codex-responses`, including compaction/runtime wrapper paths.	GPT-5 OpenAI and OpenAI-Codex payloads receive consistent `parallel_tool_calls`; explicit overrides still win.	`src/agents/provider-api-families.ts`, `src/agents/pi-embedded-runner/extra-params.ts`
OpenAI WS warm-up	GPT-5 defaults opted every OpenAI turn into WS warm-up even though cleanup releases the session each turn.	Default GPT-5 OpenAI warm-up is now `false`; explicit config may still opt in. Pooling remains follow-up/gated work.	`src/agents/pi-embedded-runner/extra-params.ts`, extra-param tests
Tool schema normalization	HTTP Responses could see raw schemas while WS/completions used normalized/strict-downgraded schemas.	Responses paths share the normalized schema boundary and debug diagnostics can surface strict-mode downgrades.	`src/agents/openai-tool-schema.ts`, `src/agents/openai-transport-stream.ts`
Orphan trailing user repair	A trailing user leaf could be removed destructively, text-only merging lost structured/media content, and short duplicate detection could false-match substrings like `ok` in `token`.	Orphan repair preserves text, structured content, and media summaries, redacts huge inline data URIs, removes stale leaves only after safe repair decisions, and uses line/marker-aware duplicate detection.	`src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts`, `src/agents/pi-embedded-runner/run/attempt.ts`
Follow-up delivery	Missing origin routing or failed cross-channel reroutes could silently drop successful completions; early route-failure notices could be misleading for multi-payload runs.	Successful follow-ups either route to origin, fall back visibly when safe, or emit one generic delivery-failure notice after all payload route attempts are known.	`src/auto-reply/reply/followup-runner.ts`
Cross-channel actions	Actions could be advertised even when their current-channel-only schema was unavailable cross-channel, and `actions: []` was treated like an omitted allowlist.	Discovery filters schema-dependent actions whose active schema cannot execute in the advertised route, and an explicit empty scoped action list now advertises no actions instead of being treated as an omitted allowlist.	`src/channels/plugins/message-action-discovery.ts`, `src/channels/plugins/message-actions.test.ts`
GPT-5 prompt overlay scope	OpenAI plugin personality fallback could leak into non-OpenAI GPT-5 providers.	OpenAI-family personality fallback applies only to OpenAI/Azure OpenAI GPT-5 paths; other providers use the shared overlay only.	`src/agents/gpt5-prompt-overlay.ts`, `src/plugins/provider-runtime.ts`
Auth profile aliases	`codex-cli/gpt-5.4`, `openai-codex/*`, session overrides, CLI handoff, and embedded runner lock checks could compare different provider strings for the same auth profile family.	Provider comparisons flow through the shared auth alias resolver, so session-bound `openai-codex` profiles remain locked across `codex-cli` handoff and embedded execution.	`src/agents/provider-auth-aliases.ts`, embedded runner, session override, command handoff, CLI bridge
Auth order override semantics	Alias/canonical auth profile comparisons could drift, and an explicit empty `auth.order.<provider> = []` must still mean "use no stored profiles".	Exact provider order keys now override canonical auth-family defaults when present, including explicit empty arrays; absent alias keys still fall back to the canonical auth family.	`src/agents/auth-profiles/order.ts`, auth order tests
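
The auth-order row above can be illustrated with a small sketch. Everything here is hypothetical: `AuthOrderConfig`, `AUTH_FAMILY_BY_ALIAS`, and `resolveAuthOrder` are illustrative names, and the `codex-cli` to `openai-codex` mapping is an assumption about the alias table, not a copy of `src/agents/auth-profiles/order.ts`.

```typescript
type AuthOrderConfig = { [provider: string]: string[] };

// Hypothetical alias table: alias provider id -> canonical auth family.
const AUTH_FAMILY_BY_ALIAS = new Map<string, string>([["codex-cli", "openai-codex"]]);

function hasOwnKey(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function resolveAuthOrder(order: AuthOrderConfig, provider: string): string[] | undefined {
  // An exact key wins whenever it is present, even as an empty array,
  // which means "use no stored profiles".
  if (hasOwnKey(order, provider)) return order[provider];
  // Absent alias key: fall back to the canonical auth family, if configured.
  const canonical = AUTH_FAMILY_BY_ALIAS.get(provider) ?? provider;
  return hasOwnKey(order, canonical) ? order[canonical] : undefined;
}
```

With `{ "openai-codex": ["work"], "codex-cli": [] }`, looking up `codex-cli` yields the empty array rather than `["work"]`; removing the `codex-cli` key makes the same lookup fall back to `["work"]`.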

GPT-5.4 Fallback Flow

```mermaid
sequenceDiagram
  participant Runner as AutoReply/FollowUp Runner
  participant MF as runWithModelFallback
  participant ER as Embedded Runner
  participant H as Selected Harness
  participant C as Shared Classifier
  participant Next as Fallback Candidate

  Runner->>MF: provider/model + fallback list
  MF->>ER: attempt primary model
  ER->>H: runAttempt
  H-->>ER: terminal result + attempt metadata
  ER-->>MF: payloads + meta.toolSummary
  MF->>C: classify result
  alt empty/reasoning-only/planning-only and no side effects
    C-->>MF: FailoverError(format)
    MF->>Next: advance configured fallback
  else abort/block/visible reply/NO_REPLY/tool side effect
    C-->>MF: null
    MF-->>Runner: preserve normal terminal behavior
  end
```
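
A minimal synchronous sketch of the classify-then-advance loop in the diagram follows. The types and function names (`EmbeddedOutcome`, `classifyOutcome`, `runWithFallbackSketch`) are hypothetical, the real path is asynchronous, and reasoning-only results are assumed to surface as empty visible text. A plain `{ reason: "format" }` object stands in for `FailoverError(format)`.

```typescript
type EmbeddedOutcome = {
  visibleText: string; // text that would reach the user
  aborted?: boolean;
  blocked?: boolean;
  planOnly?: boolean; // run stopped after repeated plan-only turns
  toolCallCount?: number; // stand-in for meta.toolSummary
};

type FailoverVerdict = { reason: "format" } | null;

function classifyOutcome(outcome: EmbeddedOutcome): FailoverVerdict {
  if (outcome.aborted || outcome.blocked) return null;
  // A tool already ran: replaying on another model could repeat side effects.
  if ((outcome.toolCallCount ?? 0) > 0) return null;
  const text = outcome.visibleText.trim();
  if (text === "NO_REPLY") return null; // intentional silence
  if (text.length > 0 && !outcome.planOnly) return null; // visible reply
  return { reason: "format" }; // empty, reasoning-only, or plan-only
}

type AttemptRecord = { model: string; outcome: EmbeddedOutcome };

function runWithFallbackSketch(
  candidates: string[],
  attempt: (model: string) => EmbeddedOutcome,
): AttemptRecord | undefined {
  let last: AttemptRecord | undefined;
  for (const model of candidates) {
    last = { model, outcome: attempt(model) };
    // null means "preserve normal terminal behavior": stop advancing.
    if (classifyOutcome(last.outcome) === null) return last;
  }
  return last; // chain exhausted: surface the final outcome
}
```

The ordering of the guards matters in this sketch: the tool side-effect check runs before the empty-text check, so an empty turn that already executed a tool is never replayed on a fallback model.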

Channel, Session, And Auth Delivery Flow

```mermaid
flowchart TD
  Leaf["Existing session leaf is user"] --> Extract["Extract text, structured parts, and media refs"]
  Extract --> Empty{"Extracted prompt text?"}
  Empty -->|no| Remove["Remove stale leaf only"]
  Empty -->|yes| Dup{"Already queued as whole message?"}
  Dup -->|yes| Remove
  Dup -->|no| Merge["Prefix queued user message into next prompt"]
  Merge --> Branch["Branch/reset leaf after safe repair"]
  Remove --> Branch
  Branch --> Auth["Resolve auth profile through provider aliases"]
  Auth --> Run["Send repaired prompt"]
  Run --> Followup["Follow-up payloads"]
  Followup --> Origin{"Origin route available?"}
  Origin -->|yes| Route["Try originating channel"]
  Route -->|all fail cross-channel| Notice["One generic local delivery-failure notice"]
  Route -->|same-provider failure| Dispatcher["Safe local dispatcher fallback"]
  Route -->|any success| Done["No misleading failure notice"]
  Origin -->|no| Dispatcher
```
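
The "Already queued as whole message?" decision can be sketched as a line-aware comparison. This is only an illustration of the idea; `isAlreadyQueued` is a hypothetical helper, and the marker-aware part of the real detection is omitted.

```typescript
// Returns true only when the orphan text appears in the queued prompt as a
// run of whole lines, never as a substring of a longer line.
function isAlreadyQueued(orphanText: string, queuedPrompt: string): boolean {
  const trimmed = orphanText.trim();
  if (trimmed.length === 0) return false; // nothing to match
  const needle = trimmed.split(/\r?\n/).map((line) => line.trim());
  const haystack = queuedPrompt.split(/\r?\n/).map((line) => line.trim());
  for (let start = 0; start + needle.length <= haystack.length; start++) {
    if (needle.every((line, offset) => haystack[start + offset] === line)) {
      return true;
    }
  }
  return false;
}

console.log(isAlreadyQueued("ok", "refresh the token")); // false
console.log(isAlreadyQueued("ok", "ok\nplease continue")); // true
```

A plain `queuedPrompt.includes(orphanText)` would report the first case as a duplicate, which is the `ok` in `token` false match described in the table above.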

Safety Boundaries

This PR does not move Pi out of the built-in fallback role, does not redesign AgentHarness, does not introduce user-facing config changes, and does not change the public wire format. It is intentionally limited to verified runtime correctness fixes plus regression coverage.

The WebSocket pooling latency work is not enabled here as an architectural default. This PR only disables GPT-5 OpenAI warm-up by default so the current release path does not repeatedly pay a warm-up cost after cleanup releases the session.
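
As a rough sketch of the default-versus-override behavior described here and in the transport-params row, assuming hypothetical names (`SketchParams`, `wsWarmup`, `resolveGpt5Params`) and assuming the injected `parallel_tool_calls` default is `true`; only `openai-codex-responses` is an identifier taken from this PR:

```typescript
type SketchParams = { parallel_tool_calls?: boolean; wsWarmup?: boolean };

// API family labels; "openai-responses" and "openai-completions" are assumed.
const PARALLEL_TOOL_CALL_APIS: readonly string[] = [
  "openai-responses",
  "openai-completions",
  "openai-codex-responses",
];

function resolveGpt5Params(api: string, explicit: SketchParams = {}): SketchParams {
  // Defaults: warm-up is off; parallel tool calls only for the listed families.
  const defaultParallel = PARALLEL_TOOL_CALL_APIS.includes(api) ? true : undefined;
  return {
    // Explicit values always win over the defaults.
    parallel_tool_calls: explicit.parallel_tool_calls ?? defaultParallel,
    wsWarmup: explicit.wsWarmup ?? false,
  };
}

console.log(resolveGpt5Params("openai-codex-responses"));
console.log(resolveGpt5Params("openai-responses", { wsWarmup: true }));
```

Using `??` rather than object spread keeps an explicit `false` intact while still ignoring keys that are present but `undefined`.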

Related Work And Issue Map

This PR intentionally does not use Closes: for broad GPT-5.4/Codex tickets unless the exact reported scenario is covered. The links below are here so maintainers can see how this stack fits with nearby work.

Link	Relationship
#41282	Historical openai-codex/GPT-5.4 timeout/stall report. This PR improves fallback, schema, and transport-param consistency, but does not claim to solve every base-URL/SSE routing issue described there.
#64251	CLI-backed `codex-cli/gpt-5.4` follow-up instability. This PR helps by normalizing auth aliases and preventing successful follow-up payload drops.
#51063 / #65152	OpenAI-Codex tool execution/tool-definition symptoms. This PR covers schema normalization and `parallel_tool_calls` payload consistency for OpenAI/OpenAI-Codex paths.
#65844 / #57286 / #63856	OpenAI-Codex auth profile/order drift. This PR covers alias-aware lock preservation, explicit empty auth-order overrides, and fallback to canonical/legacy auth order entries when an alias key is absent.
#59928 / #65234 / #54698	Fallback-chain/session-model issues. This PR is narrower: it classifies GPT-5.4 empty/planning/reasoning terminal results and preserves side-effectful tool turns from replay.
#45761 / #60830 / #59680	Prior fallback classifier hardening. This PR builds on that line by adding GPT-5.4 embedded terminal classification and side-effect guards.
#52903 / #63608	Prior retry/session transcript integrity work. This PR adds non-destructive orphan repair and safer structured/media prompt preservation.
#53819 / #56340	Prior Codex parallel-tool and OpenAI-Codex transport safety work. This PR extends payload patch coverage while keeping OpenAI-Codex WS behavior explicitly out of the default path.
#70904 / #70911 / #63369	Adjacent reasoning-effort injection issue. Not fixed here; #70911 is the focused PR for missing `body.reasoning` when OpenAI/Codex Responses payloads start with `reasoning: undefined`.
#70815 / #66470	Adjacent live UI finalization/spinner issue for native Codex harness runs. Not fixed here; this PR focuses on backend delivery/fallback semantics.
#69453 / #55461 / #42225	Adjacent GPT-5.4 context-window/catalog mismatch issues. Not fixed here.
#56487 / #50647 / #57917	Adjacent UI/model-picker provider-prefix issues. Not fixed here.

Live Search Additions (2026-04-24)

I re-ran live GitHub search across GPT-5.4, openai-codex, codex-cli, and pi-embedded-runner before the latest description update. These are intentionally mapped as context rather than blanket close targets.

Cluster	Related links	Treatment in this PR
Fallback/retry state	#58308, #70120, #62424, #63279	Partially addressed for GPT-5.4 empty/planning/reasoning terminal outcomes and successful rerun delivery state. Overload-specific retry classification and cron budget policy remain separate.
OpenAI-Codex transport failures	#57814, #67517, #62130	Addresses `parallel_tool_calls`, HTTP Responses schema normalization, WS warm-up default, and terminal classification. Does not claim to fix Cloudflare/base-url/network failures.
Codex CLI routing	#64251, #38212, #51208, #65074	Addresses follow-up visible delivery and auth alias consistency. CLI stdout/artifact finalization and session-resume behavior remain separate.
Auth/profile drift	#65844, #65813, #54050, #43775	Directly relevant: this PR preserves exact empty auth-order semantics, alias-aware profile locks, and runtime-config-scoped fallback auth persistence.
Embedded runner integrity	#64570, #64888, #67878, #68329	Addresses GPT-5.4 thinking/reasoning-only fallback classification and orphan repair. Broader cancellation/liveness and CLI compaction remain separate.
Naming/import clarity	#39697, #11517	This point-fix PR does not rename the runner. #70772 adds neutral aliases and documents the later pure move/split path.

Latest Validation

Post-rebase verification on the final branch:

Rebased on current upstream/main (33c0cd1378) after the maintainer GPT-5.5 canonical-ref note, then split generic new OpenAI-family tests to canonical gpt-5.5 while leaving gpt-5.4/codex-cli refs only as explicit regression or legacy-compat coverage.
node scripts/run-vitest.mjs run --config test/vitest/vitest.auto-reply.config.ts src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/followup-runner.test.ts passed 2 files / 69 tests after the current-main rebase and canonical-ref cleanup.
node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/openai-transport-stream.test.ts src/agents/pi-embedded-runner-extraparams.test.ts src/agents/model-fallback.test.ts src/agents/command/attempt-execution.cli.test.ts src/agents/agent-command.live-model-switch.test.ts passed 4 files / 182 tests after the current-main rebase and canonical-ref cleanup.
node scripts/run-vitest.mjs run --config test/vitest/vitest.plugins.config.ts src/plugins/provider-runtime.test.ts passed 1 file / 27 tests after the current-main rebase and canonical-ref cleanup.
node scripts/run-vitest.mjs run --config test/vitest/vitest.auto-reply.config.ts src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/followup-runner.test.ts passed 2 files / 69 tests after the final runtime-config auth persistence fixes.
node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/command/attempt-execution.cli.test.ts src/agents/pi-embedded-runner-extraparams.test.ts src/agents/pi-embedded-runner-extraparams-resolve.test.ts src/agents/model-fallback.test.ts src/agents/auth-profiles/order.test.ts src/agents/auth-profiles.resolve-auth-profile-order.uses-stored-profiles-no-config-exists.test.ts src/agents/auth-profiles/session-override.test.ts src/agents/provider-auth-aliases.test.ts src/agents/agent-command.live-model-switch.test.ts passed 7 files / 192 tests.
node scripts/run-vitest.mjs run --config test/vitest/vitest.auto-reply.config.ts src/auto-reply/reply/followup-runner.test.ts passed 1 file / 23 tests.
node scripts/run-vitest.mjs run --config test/vitest/vitest.e2e.config.ts src/agents/model-fallback.run-embedded.e2e.test.ts passed 1 file / 17 tests.

Earlier focused/broad local verification on this PR also covered:

pnpm lint
pnpm tsgo:core:test
node scripts/run-vitest.mjs run --config test/vitest/vitest.full-core-support-boundary.config.ts test/scripts/lint-suppressions.test.ts
node scripts/run-vitest.mjs run --config test/vitest/vitest.auto-reply.config.ts src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/followup-runner.test.ts
node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/model-fallback.test.ts src/agents/pi-embedded-runner/run/attempt.test.ts src/agents/pi-embedded-runner-extraparams.test.ts src/agents/openai-transport-stream.test.ts src/agents/auth-profiles/session-override.test.ts src/agents/auth-profiles/order.test.ts src/agents/command/attempt-execution.cli.test.ts src/agents/provider-auth-aliases.test.ts src/agents/tools/message-tool.test.ts src/agents/agent-command.live-model-switch.test.ts src/plugins/provider-runtime.test.ts
node scripts/run-vitest.mjs run --config test/vitest/vitest.channels.config.ts src/channels/plugins/message-actions.test.ts
OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=0 node scripts/run-vitest.mjs run --config test/vitest/vitest.extension-messaging.config.ts
pnpm exec oxfmt --check <changed files>
git diff --check

Review State

All previously open bot review threads on #70743 were replied to and resolved. The final review-fix commits after the latest rebase are:

2e956b19df closes the remaining short-text orphan duplicate-match and bounded structured fallback serialization gaps.
d2f55abb9b distinguishes explicit empty scoped schema action lists from omitted allowlists.
961567766a preserves aliased embedded auth locks.
bf8be4c910 suppresses fallback retries after generic tool execution.
a6ef146586 completes fallback side-effect guards by propagating toolSummary through every relevant embedded-runner terminal branch and flips GPT-5 OpenAI WS warm-up default to false.
35f7c348e9 updates the rebased CLI attempt-execution test mock for upstream's provider auth alias-map export.
10b74a4459 addresses fresh bot review by keeping stripped NO_REPLY terminal turns out of fallback and preserving explicit empty auth-order overrides, including exact alias keys such as codex-cli: [].
f73022e4f4 addresses fresh follow-up routing review by emitting a visible partial-delivery notice when any cross-channel payload fails, even if another payload in the same completion routes successfully.
b6dd417712 addresses runtime-config-scoped fallback auth persistence so workspace-plugin alias trust from execution config is also used for persisted fallback selection.
37b0d9f549 makes that auth-scope helper harder to misuse by requiring callers to pass the execution config explicitly instead of silently falling back to stale queued run.config.
bb99fb6d1a responds to the maintainer GPT-5.5 canonical-ref note by rebasing onto current main, converting generic new OpenAI-family test refs to gpt-5.5, and documenting remaining gpt-5.4/codex-cli refs as intentional regression or legacy-compat coverage.
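
The distinction introduced in d2f55abb9b, between an explicit empty action list and an omitted allowlist, can be sketched as below. `filterAdvertisedActions` is a hypothetical helper, and the reading that an empty list advertises nothing while an omitted list leaves discovery unrestricted is an assumption drawn from the commit summary.

```typescript
function filterAdvertisedActions(available: readonly string[], allow?: readonly string[]): string[] {
  // Omitted allowlist: no scoping, advertise everything discovery found.
  if (allow === undefined) return [...available];
  // Explicit list, including an empty one: advertise only what it names.
  return available.filter((action) => allow.includes(action));
}

console.log(filterAdvertisedActions(["send", "react"])); // [ 'send', 'react' ]
console.log(filterAdvertisedActions(["send", "react"], [])); // []
```

Checking `allow === undefined` instead of `!allow?.length` is what keeps the two cases apart.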

Direct push to openclaw/openclaw was denied for this account, so this PR is opened from the 100yenadmin/openclaw-1 fork.

greptile-apps · 2026-04-23T19:16:29Z

Greptile Summary

This PR hardens multiple GPT-5.4 runtime paths: it adds a classifyResult hook to the model-fallback loop so that empty/reasoning-only/plan-only terminal results trigger configured fallbacks; it normalises tool-schema strict-mode handling and adds diagnostic logging across both the Completions and Responses transports; it preserves orphaned user payloads instead of silently removing them; and it fixes auth-profile alias resolution by routing provider comparisons through resolveProviderIdForAuth.

The GPT-5 terminal result classifier is implemented twice—once in result-fallback-classifier.ts (imported by followup-runner) and again locally in agent-runner-execution.ts—with divergent isGpt5ModelId regexes and case-sensitive vs. case-insensitive plan-only string matching; this creates a maintenance surface where fixing a detection gap in one path silently leaves the other unchanged.

Confidence Score: 4/5

Safe to merge with minor cleanup; the duplicate classifier logic is a maintainability risk but not an immediate production blocker.

All inline findings are P2, but the duplicated classification code in agent-runner-execution.ts uses a case-sensitive substring match for the plan-only signal while the canonical classifier uses a case-insensitive regex — a real behavioral divergence introduced in this PR that is worth resolving before the code drifts further.

src/auto-reply/reply/agent-runner-execution.ts — duplicate classifier with inconsistent regex; src/agents/pi-embedded-runner/result-fallback-classifier.ts — the intended canonical home for this logic

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner-execution.ts
Line: 122-125

Comment:
**Local `isGpt5ModelId` diverges from the canonical helper**

A local copy of `isGpt5ModelId` is defined with the pattern `^gpt-5(?:[.-]|$)`, while the canonical version imported by `result-fallback-classifier.ts` uses `(?:^|[/:])gpt-5(?:[.-]|$)`. The canonical regex also matches IDs that include a provider prefix (`openai/gpt-5.4`), which the local copy would miss. Since `result-fallback-classifier.ts` was introduced in this same PR as the shared classification helper, consider importing `isGpt5ModelId` from `gpt5-prompt-overlay.ts` here too instead of maintaining a second copy.
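
The difference between the two patterns quoted above is easy to check directly. The regex bodies are copied from the comment; the absence of flags and the constant names are assumptions.

```typescript
// Pattern bodies as quoted in the review comment.
const LOCAL_GPT5 = /^gpt-5(?:[.-]|$)/;
const CANONICAL_GPT5 = /(?:^|[\/:])gpt-5(?:[.-]|$)/;

for (const id of ["gpt-5.4", "openai/gpt-5.4", "gpt-50"]) {
  console.log(id, LOCAL_GPT5.test(id), CANONICAL_GPT5.test(id));
}
// gpt-5.4        -> true,  true
// openai/gpt-5.4 -> false, true
// gpt-50         -> false, false
```

Only the provider-prefixed form separates the two patterns, which is why a prefixed model ref reaching the local copy would skip classification.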

---

This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner-execution.ts
Line: 128-194

Comment:
**Duplicate GPT-5 terminal classification logic with subtle behavioral differences**

`classifyEmbeddedTerminalResultForFallback` here reimplements the same classification that `classifyEmbeddedPiRunResultForModelFallback` in `result-fallback-classifier.ts` provides, but with subtle divergences:

- Plan-only detection: this file uses `errorText.includes("Agent stopped after repeated plan-only turns")` (case-sensitive substring), while `result-fallback-classifier.ts` uses `/Agent stopped after repeated plan-only turns/i` (case-insensitive regex).
- Side-effect guards: this file additionally checks `directlySentBlockKeys.size > 0` and inspects `isSilentReplyText`/`isSilentReplyPrefixText`, which are absent from the shared classifier.

The mismatch in string matching means a lowercase variant of the plan-only message would trigger a fallback in `followup-runner` but not in `agent-runner-execution`. Consider unifying these by extending `classifyEmbeddedPiRunResultForModelFallback` to accept an optional `directlySentBlockKeys` context, or extracting the shared core into a common function both can delegate to.
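
One possible shape for the shared core suggested above, as a sketch only. `PlanOnlyContext` and `isPlanOnlyFallback` are hypothetical names; the message text and the `directlySentBlockKeys` field come from the comment, and treating already-sent blocks as a reason not to fall back is an assumption.

```typescript
const PLAN_ONLY_PATTERN = /Agent stopped after repeated plan-only turns/i;

type PlanOnlyContext = { directlySentBlockKeys?: ReadonlySet<string> };

function isPlanOnlyFallback(errorText: string | undefined, ctx: PlanOnlyContext = {}): boolean {
  // Assumption: blocks already streamed to the user count as visible output.
  if ((ctx.directlySentBlockKeys?.size ?? 0) > 0) return false;
  return errorText !== undefined && PLAN_ONLY_PATTERN.test(errorText);
}

const lowercaseMessage = "agent stopped after repeated plan-only turns";
// The case-sensitive substring check misses the lowercase variant...
console.log(lowercaseMessage.includes("Agent stopped after repeated plan-only turns")); // false
// ...while the shared case-insensitive helper catches it.
console.log(isPlanOnlyFallback(lowercaseMessage)); // true
```

Both runners could then call the same helper, with `agent-runner-execution` passing its `directlySentBlockKeys` and `followup-runner` omitting the context.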

Reviews (1): Last reviewed commit: "fix: harden gpt-5.4 runtime paths"

Copilot

Pull request overview

Hardens several GPT‑5.4/Codex runtime paths across embedded Pi, CLI harnesses, provider transports, session delivery, and status/reporting so terminal “successful” but unusable results can advance through configured fallbacks and runtime surfaces stay consistent.

Changes:

Add result classification hooks to model fallback (and GPT‑5 terminal-result detection) so empty/reasoning-only/plan-only embedded runs can trigger fallbacks.
Normalize OpenAI/Codex transport/tool-schema behavior and improve status output (runner label + fast-mode label + thinking-default alignment).
Improve operational robustness across sessions, plugin install allowlists, channel catalogs, and Azure OpenAI image-generation routing.

Reviewed changes

Copilot reviewed 100 out of 100 changed files in this pull request and generated 1 comment.

File	Description
test/official-channel-catalog.test.ts	Makes catalog tests tolerant to additional official entries.
src/status/status-text.ts	Uses explicit thinking default and aligns resolved think fallback order.
src/status/status-message.ts	Adds `Runner:` line and simplifies fast-mode label to `Fast`.
src/status/status-message.test.ts	Adds unit tests for fast-mode label behavior.
src/plugins/provider-runtime.ts	Passes provider id into GPT‑5 prompt overlay resolution.
src/plugins/provider-runtime.test.ts	Tests OpenAI personality fallback scoping for GPT‑5 providers.
src/plugins/provider-install-catalog.ts	Relaxes trusted npmSpec exposure rules for config/bundled origins.
src/plugins/provider-install-catalog.test.ts	Adds coverage for unpinned trusted npm specs.
src/plugins/contracts/boundary-invariants.test.ts	Adds invariants to keep live provider config lookups on runtime config.
src/plugin-sdk/runtime-config-snapshot.ts	Exposes clear/set runtime config snapshot APIs via SDK.
src/infra/package-dist-inventory.ts	Centralizes legacy QA channel runtime-api path constant.
src/infra/npm-update-compat-sidecars.ts	Exports legacy QA channel runtime-api sidecar path and reuses it.
src/gateway/model-pricing-cache.ts	Extends pricing fetch timeout and defers bootstrap refresh via microtask.
src/gateway/model-pricing-cache.test.ts	Tests deferred bootstrap refresh and updated timeout logging text.
src/gateway/gateway-codex-harness.live.test.ts	Tweaks live probe instructions for shell-tool execution.
src/gateway/gateway-codex-harness.live-helpers.ts	Recognizes missing `codex` PATH fallback text.
src/gateway/gateway-codex-harness.live-helpers.test.ts	Adds tests for missing `codex` PATH fallback acceptance.
src/gateway/gateway-cli-backend.live.test.ts	Adds request/agent timeouts and CI-safe Codex probe skip gate.
src/gateway/gateway-acp-bind.live.test.ts	Refines retry/timeout handling for ACP bind live assertions.
src/config/sessions/store.ts	Preserves active session key during maintenance; route updates preserve activity timestamps.
src/config/sessions/store-maintenance.ts	Adds `preserveKeys` support to prune/cap maintenance operations.
src/config/sessions.test.ts	Updates expectations and adds regression test for `updateLastRoute` not bumping `updatedAt`.
src/config/bundled-channel-config-metadata.generated.ts	Updates generated channel schema metadata (toolProgress + execApprovals + policies).
src/commands/onboarding-plugin-install.ts	Allows onboarding npm installs for trusted registry specs without strict pins.
src/commands/onboarding-plugin-install.test.ts	Updates onboarding tests for relaxed npm spec requirements.
src/cli/plugins-install-persist.ts	Adds installed plugin id to existing `plugins.allow` list before enabling.
src/cli/plugins-install-persist.test.ts	Tests allowlist augmentation behavior during install persistence.
src/channels/plugins/message-action-discovery.ts	Filters cross-channel advertised actions when schema visibility is current-channel only.
src/channels/plugins/contracts/channel-catalog.contract.test.ts	Adds contract coverage for WeCom channel catalog entry.
src/channels/plugins/catalog.ts	Loads built-in official external channel catalog JSON plus file-based official catalog.
src/auto-reply/thinking.ts	Raises implicit reasoning defaults to “medium” and remaps to supported level.
src/auto-reply/thinking.test.ts	Updates tests for new implicit default + remapping behavior.
src/auto-reply/status.test.ts	Extends `/status` tests for runner label + fast-mode visibility rules.
src/auto-reply/reply/model-selection.ts	Hydrates runtime catalog metadata for thinking when allowlist entries omit reasoning.
src/auto-reply/reply/model-selection.test.ts	Adds tests for implicit thinking default + runtime hydration behavior.
src/auto-reply/reply/followup-runner.ts	Classifies embedded results for fallback and avoids silently dropping followups.
src/auto-reply/reply/followup-runner.test.ts	Updates tests for dispatcher fallback behavior when origin routing fails/is incomplete.
src/auto-reply/reply/commands-types.ts	Adds `resolvedFastMode` to command handler params.
src/auto-reply/reply/commands-info.ts	Forwards resolved fast mode into `/status` reply build.
src/auto-reply/reply/commands-info.test.ts	Tests forwarding of resolved fast mode into `/status`.
src/auto-reply/reply/agent-runner-execution.ts	Adds GPT‑5 embedded terminal-result fallback classification + auth-profile alias normalization.
src/auto-reply/reply/agent-runner-execution.test.ts	Adds tests for plan-only/silent/streamed-block classification behavior.
src/auto-reply/reply/agent-runner-auth-profile.ts	Normalizes provider ids for auth-profile scoping across aliases.
src/agents/tools/session-status-tool.ts	Aligns status tool thinking-default resolution with configured/runtime catalogs.
src/agents/tools/message-tool.ts	Uses cross-channel schema-supported action listing to avoid hidden params.
src/agents/tools/message-tool.test.ts	Tests that cross-channel actions with current-channel-only schema are not advertised.
src/agents/provider-auth-aliases.ts	Dedupes/centralizes alias selection and treats deprecated auth choice ids as aliases.
src/agents/provider-auth-aliases.test.ts	Adds tests for deprecated auth choice ids mapping to provider auth key.
src/agents/pi-embedded-runner/run/attempt.ts	Preserves orphaned user payloads and conditionally removes leaf based on preservation.
src/agents/pi-embedded-runner/run/attempt.test.ts	Adds coverage for structured orphan preservation and removeLeaf behavior.
src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts	Extracts structured content into prompt text; returns `removeLeaf` decision.
src/agents/pi-embedded-runner/result-fallback-classifier.ts	New classifier for embedded Pi GPT‑5 terminal results.
src/agents/pi-embedded-runner/extra-params.ts	Extends `parallel_tool_calls` injection to `openai-codex-responses` API.
src/agents/pi-embedded-runner-extraparams.test.ts	Adds tests for Codex Responses parallel tool call injection.
src/agents/openclaw-tools.session-status.test.ts	Adds session-status tool tests for implicit fallback keys and thinking defaults.
src/agents/openai-transport-stream.ts	Normalizes tool parameters even when strict omitted; adds diagnostics when strict downgraded.
src/agents/openai-transport-stream.test.ts	Adds tests for responses tool param normalization and strict downgrade behavior.
src/agents/openai-tool-schema.ts	Adds strict schema diagnostics helpers and violation discovery.
src/agents/model-thinking-default.ts	Switches thinking-default import to runtime `thinking.ts`.
src/agents/model-selection.test.ts	Updates tests to reflect default thinking level changes.
src/agents/model-fallback.ts	Adds optional result-classification hook to drive fallback progression on terminal “ok” results.
src/agents/model-fallback.test.ts	Adds tests for fallback progression via result classification.
src/agents/gpt5-prompt-overlay.ts	Scopes OpenAI plugin personality fallback to OpenAI-family GPT‑5 providers.
src/agents/command/attempt-execution.ts	Normalizes provider ids for auth-profile override compatibility across aliases.
src/agents/command/attempt-execution.cli.test.ts	Adds Codex CLI alias coverage for auth-profile propagation.
src/agents/auth-profiles/session-override.ts	Matches session auth profiles to providers using auth-alias normalization.
src/agents/auth-profiles/session-override.test.ts	Tests session override preservation under CLI provider aliasing.
scripts/write-official-channel-catalog.mjs	Seeds generated catalog with built-in external entries + additional install metadata.
scripts/test-live-cli-backend-docker.sh	Defaults CI-safe Codex config for codex-cli live tests and prints it.
scripts/test-built-bundled-channel-entry-smoke.mjs	Improves error reporting when bundled entries fail to import.
scripts/lib/official-external-channel-catalog.json	Adds built-in external catalog entry for WeCom plugin.
scripts/lib/docker-e2e-logs.sh	Normalizes TMPDIR handling and mktemp template usage.
scripts/e2e/plugins-docker.sh	Adjusts mktemp template usage for run logs.
scripts/e2e/parallels-macos-smoke.sh	Improves Discord smoke visibility checks and config key naming.
scripts/e2e/npm-onboard-channel-agent-docker.sh	Adjusts mktemp template usage for run logs.
extensions/voice-call/src/manager/store.ts	Tracks pending append writes and adds test flush helper.
extensions/voice-call/src/manager/events.test.ts	Awaits pending persist writes before removing test store directory.
extensions/telegram/src/bot/helpers.ts	Adds bounded TTL cache for forum flag lookups.
extensions/telegram/src/bot/helpers.test.ts	Adds cache reset and cache behavior tests for forum flag resolution.
extensions/telegram/src/bot.create-telegram-bot.test.ts	Resets forum flag cache before bot tests.
extensions/telegram/src/bot-native-commands.test.ts	Resets forum flag cache before native command tests.
extensions/qa-lab/src/live-transports/telegram/telegram-live.runtime.ts	Records batch observation time for live Telegram probe RTT reporting.
extensions/openai/image-generation-provider.ts	Adds Azure OpenAI image endpoint detection + auth/header/url shaping.
extensions/openai/image-generation-provider.test.ts	Adds Azure image-generation routing tests (hosts + api-version override).
extensions/discord/subagent-hooks-api.ts	Removes re-exports from subagent hooks API surface.
extensions/discord/src/monitor/thread-bindings.lifecycle.test.ts	Switches to runtime-config snapshot SDK entrypoint.
extensions/discord/src/monitor/message-handler.ts	Updates SDK import for runtime group policy resolution.
extensions/discord/src/monitor/message-handler.process.ts	Splits SDK imports across new runtime entrypoints.
extensions/discord/src/monitor/message-handler.process.test.ts	Updates mocks to new session-store runtime SDK module.
extensions/codex/provider.ts	Uses runtime snapshot plugin config for discovery toggles.
extensions/codex/provider.test.ts	Tests live config re-enabling discovery after startup disable.
docs/tools/thinking.md	Documents `/status` fast-mode label behavior.
docs/tools/plugin.md	Documents allowlist augmentation on plugin install.
docs/tools/image-generation.md	Links to Azure OpenAI endpoint docs for image generation.
docs/providers/openai.md	Adds Azure OpenAI image endpoint configuration documentation.
docs/plugins/sdk-setup.md	Updates onboarding guidance for registry npm specs and optional integrity pins.
docs/plugins/manifest.md	Updates manifest docs for unpinned npm specs and optional integrity pins.
docs/.generated/config-baseline.sha256	Updates config baseline hashes.
CHANGELOG.md	Documents new runner label, thinking/fast-mode/status changes, and Azure image support.
.agents/skills/openclaw-release-maintainer/SKILL.md	Updates release guidance around beta tag handling and release notes completeness.
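
The `extensions/telegram/src/bot/helpers.ts` row above mentions a bounded TTL cache for forum flag lookups, with the test files resetting that cache between runs. The PR's actual implementation is not shown on this page, so the following is only a minimal sketch of the general technique. The class name `BoundedTtlCache`, its constructor parameters, and the injectable clock are all hypothetical and are not taken from the repository.

```typescript
// Hypothetical sketch of a bounded TTL cache. Not the code from this PR.
// Entries expire after ttlMs, and the oldest inserted entry is evicted
// once the cache holds more than maxEntries items.
class BoundedTtlCache<K, V> {
  private readonly entries = new Map<K, { value: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxEntries: number, ttlMs: number, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock so tests do not need real timers
  }

  get(key: K): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired entries are dropped lazily on read.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: K, value: V): void {
    // Delete first so a rewritten key moves to the end of insertion order.
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    // Map iterates in insertion order, so the first key is the oldest.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  // Reset helper of the kind a test suite would call before each test.
  clear(): void {
    this.entries.clear();
  }

  get size(): number {
    return this.entries.size;
  }
}
```

In this sketch eviction follows insertion order rather than recency of reads, which keeps the code short. A real cache for chat lookups might prefer LRU ordering or periodic sweeping of expired entries.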

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f184258184

100yenadmin marked this pull request as ready for review April 23, 2026 19:12

100yenadmin requested a review from a team as a code owner April 23, 2026 19:12

Copilot AI review requested due to automatic review settings April 23, 2026 19:12

Copilot started reviewing on behalf of 100yenadmin April 23, 2026 19:12

greptile-apps Bot reviewed Apr 23, 2026

Comment thread src/auto-reply/reply/agent-runner-execution.ts Outdated

Comment thread src/auto-reply/reply/agent-runner-execution.ts Outdated

Copilot AI reviewed Apr 23, 2026

Comment thread src/channels/plugins/catalog.ts

chatgpt-codex-connector Bot reviewed Apr 23, 2026

Comment thread src/auto-reply/reply/agent-runner-execution.ts Outdated

Comment thread src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts Outdated

100yenadmin force-pushed the bugfix/gpt54-runtime-audit-v2026-4-22-v2 branch from f184258 to 3d385a3 April 23, 2026 19:24

100yenadmin mentioned this pull request Apr 25, 2026

[codex] Consolidate RuntimePlan and Harness V2 package #71722 (Merged)

clawsweeper Bot mentioned this pull request Apr 26, 2026

codex-cli/gpt-5.4 fails in embedded/helper paths while openai-codex/gpt-5.4 works #38212 (Open)

This was referenced Apr 27, 2026

openai-codex auth profile rotation burns through both profiles before escalating to model fallback #65813 (Closed)

Config version stamp mismatch silently breaks Discord delivery for openai-codex models #43775 (Closed)

100yenadmin mentioned this pull request Apr 28, 2026

[codex] Finalize RuntimePlan embedded-runner cleanup stack #73767 (Open)

steipete merged 19 commits into openclaw:main from electricsheephq:bugfix/gpt54-runtime-audit-v2026-4-22-v2


3 participants


greptile-apps Bot commented Apr 23, 2026

Greptile Summary

Confidence Score: 4/5
