[codex] Harden GPT-5.4 runtime paths #70743

Merged
steipete merged 19 commits into openclaw:main from electricsheephq:bugfix/gpt54-runtime-audit-v2026-4-22-v2
Apr 24, 2026

100yenadmin commented Apr 23, 2026

Summary

This PR hardens the GPT-5.4 embedded-agent hot path after auditing v2026.4.22. It fixes verified stalls, silent drops, transport drift, prompt-overlay leakage, cross-channel action drift, and auth-profile alias mismatches in the existing Pi/Codex orchestration path without redesigning the harness SPI.

This is the point-fix PR. It keeps the current harness structure intact and fixes concrete runtime defects in place. The follow-up additive extension-seam work is in #70772.

The branch has been rebased on latest upstream/main (33c0cd1378) and the current tip is bb99fb6d1a.

Runtime Routing Map

Selecting GPT-5.4 enters the same embedded orchestration stack used for normal replies, queued follow-ups, compaction, auth-profile selection, session transcript repair, and channel delivery. openai/* and openai-codex/* still use the built-in Pi/OpenAI path. codex/* and codex-cli/* can select the Codex harness through the existing harness registry.

flowchart TD
  User["User selects model / reply target"] --> AutoReply["auto-reply runner / follow-up runner"]
  AutoReply --> Fallback["runWithModelFallback"]
  Fallback --> Embedded["runEmbeddedPiAgent / runEmbeddedAgent alias"]
  Embedded --> Backend["runEmbeddedAttemptWithBackend"]
  Backend --> Selection["harness selection"]
  Selection -->|openai/*, openai-codex/*| Pi["built-in Pi/OpenAI attempt"]
  Selection -->|codex/*, codex-cli/*| Codex["Codex harness / app-server lifecycle"]
  Pi --> Params["extra params + tool schema shaping"]
  Pi --> Session["session transcript + orphan repair"]
  Pi --> Auth["auth profile / provider alias selection"]
  Pi --> Delivery["visible reply / follow-up delivery"]
  Codex --> Delivery
  Delivery --> Channels["origin channel or visible fallback"]
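The routing split above can be sketched as a small prefix lookup. This is only an illustration under stated assumptions: `HarnessKindSketch`, `providerPrefixOf`, and `selectHarnessSketch` are hypothetical names, and the real selection goes through the existing harness registry, which may consider more than the provider prefix.

```typescript
// Hypothetical sketch: names below are illustrative, not the real registry API.
type HarnessKindSketch = "pi-openai" | "codex";

// Provider prefixes that the routing map sends to the Codex harness.
const CODEX_HARNESS_PREFIXES: ReadonlySet<string> = new Set(["codex", "codex-cli"]);

// Extracts the provider prefix from a "provider/model" reference.
function providerPrefixOf(modelRef: string): string {
  const slash = modelRef.indexOf("/");
  return slash === -1 ? "" : modelRef.slice(0, slash).toLowerCase();
}

function selectHarnessSketch(modelRef: string): HarnessKindSketch {
  // Everything not routed to Codex stays on the built-in Pi/OpenAI attempt.
  return CODEX_HARNESS_PREFIXES.has(providerPrefixOf(modelRef)) ? "codex" : "pi-openai";
}

console.log(selectHarnessSketch("openai-codex/gpt-5.4")); // "pi-openai"
console.log(selectHarnessSketch("codex-cli/gpt-5.4")); // "codex"
```

Defaulting to the Pi path for unknown prefixes mirrors the statement that Pi keeps its built-in fallback role.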

Failure Classes Fixed

| Area | Before | After | Primary files |
| --- | --- | --- | --- |
| GPT-5.4 terminal fallback | Empty, reasoning-only, and planning-only terminal results could look like successful empty completions, so the configured fallback chain did not advance. | Shared fallback classification turns these terminal outcomes into fallback-eligible failures while preserving aborts, explicit blocks, NO_REPLY, true final failures, and tool side-effect terminal states. | src/agents/model-fallback.ts, src/agents/pi-embedded-runner/result-fallback-classifier.ts, src/auto-reply/reply/agent-runner-execution.ts, src/auto-reply/reply/followup-runner.ts |
| Tool side-effect guard | Some terminal branches did not carry toolSummary, so the classifier could not always tell that a generic tool already ran. | toolSummary is built once from attempt.toolMetas and propagated through timeout, block, reasoning-only, incomplete-turn, and success metadata. | src/agents/pi-embedded-runner/run.ts, src/agents/model-fallback.run-embedded.e2e.test.ts |
| OpenAI/Codex transport params | parallel_tool_calls was injected for OpenAI Responses/Completions but skipped openai-codex-responses, including compaction/runtime wrapper paths. | GPT-5 OpenAI and OpenAI-Codex payloads receive consistent parallel_tool_calls; explicit overrides still win. | src/agents/provider-api-families.ts, src/agents/pi-embedded-runner/extra-params.ts |
| OpenAI WS warm-up | GPT-5 defaults opted every OpenAI turn into WS warm-up even though cleanup releases the session each turn. | Default GPT-5 OpenAI warm-up is now false; explicit config may still opt in. Pooling remains follow-up/gated work. | src/agents/pi-embedded-runner/extra-params.ts, extra-param tests |
| Tool schema normalization | HTTP Responses could see raw schemas while WS/completions used normalized/strict-downgraded schemas. | Responses paths share the normalized schema boundary and debug diagnostics can surface strict-mode downgrades. | src/agents/openai-tool-schema.ts, src/agents/openai-transport-stream.ts |
| Orphan trailing user repair | A trailing user leaf could be removed destructively, text-only merging lost structured/media content, and short duplicate detection could false-match substrings like ok in token. | Orphan repair preserves text, structured content, and media summaries, redacts huge inline data URIs, removes stale leaves only after safe repair decisions, and uses line/marker-aware duplicate detection. | src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts, src/agents/pi-embedded-runner/run/attempt.ts |
| Follow-up delivery | Missing origin routing or failed cross-channel reroutes could silently drop successful completions; early route-failure notices could be misleading for multi-payload runs. | Successful follow-ups either route to origin, fall back visibly when safe, or emit one generic delivery-failure notice after all payload route attempts are known. | src/auto-reply/reply/followup-runner.ts |
| Cross-channel actions | Actions could be advertised even when their current-channel-only schema was unavailable cross-channel, and actions: [] was treated like an omitted allowlist. | Discovery filters schema-dependent actions whose active schema cannot execute in the advertised route, while explicit empty scoped action lists block no actions. | src/channels/plugins/message-action-discovery.ts, src/channels/plugins/message-actions.test.ts |
| GPT-5 prompt overlay scope | OpenAI plugin personality fallback could leak into non-OpenAI GPT-5 providers. | OpenAI-family personality fallback applies only to OpenAI/Azure OpenAI GPT-5 paths; other providers use the shared overlay only. | src/agents/gpt5-prompt-overlay.ts, src/plugins/provider-runtime.ts |
| Auth profile aliases | codex-cli/gpt-5.4, openai-codex/*, session overrides, CLI handoff, and embedded runner lock checks could compare different provider strings for the same auth profile family. | Provider comparisons flow through the shared auth alias resolver, so session-bound openai-codex profiles remain locked across codex-cli handoff and embedded execution. | src/agents/provider-auth-aliases.ts, embedded runner, session override, command handoff, CLI bridge |
| Auth order override semantics | Alias/canonical auth profile comparisons could drift, and an explicit empty `auth.order.<provider> = []` must still mean "use no stored profiles". | Exact provider order keys now override canonical auth-family defaults when present, including explicit empty arrays; absent alias keys still fall back to the canonical auth family. | src/agents/auth-profiles/order.ts, auth order tests |
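
The auth-order row above distinguishes an explicit empty array from an absent key. A minimal sketch of that lookup, assuming a hypothetical alias table in which `codex-cli` belongs to the `openai-codex` auth family; `resolveAuthOrderSketch` and `AUTH_FAMILY_SKETCH` are illustrative names, not the real resolver in src/agents/auth-profiles/order.ts:

```typescript
type AuthOrderConfigSketch = Record<string, string[] | undefined>;

// Assumed alias table: provider id -> canonical auth family.
const AUTH_FAMILY_SKETCH: Record<string, string> = {
  "codex-cli": "openai-codex",
};

function resolveAuthOrderSketch(
  order: AuthOrderConfigSketch,
  provider: string,
): string[] | undefined {
  // An exact key wins whenever it is present, even when its value is [].
  const exact = order[provider];
  if (exact !== undefined) return exact;
  // Only an absent key falls back to the canonical auth family.
  const canonical = AUTH_FAMILY_SKETCH[provider] ?? provider;
  return order[canonical];
}

// Absent alias key: inherits the canonical family order.
console.log(resolveAuthOrderSketch({ "openai-codex": ["work"] }, "codex-cli")); // ["work"]
// Explicit empty alias key: "use no stored profiles".
console.log(resolveAuthOrderSketch({ "openai-codex": ["work"], "codex-cli": [] }, "codex-cli")); // []
```

Checking for `undefined` rather than for a non-empty array is the whole point: a length check would silently turn `[]` back into a fallback.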

GPT-5.4 Fallback Flow

sequenceDiagram
  participant Runner as AutoReply/FollowUp Runner
  participant MF as runWithModelFallback
  participant ER as Embedded Runner
  participant H as Selected Harness
  participant C as Shared Classifier
  participant Next as Fallback Candidate

  Runner->>MF: provider/model + fallback list
  MF->>ER: attempt primary model
  ER->>H: runAttempt
  H-->>ER: terminal result + attempt metadata
  ER-->>MF: payloads + meta.toolSummary
  MF->>C: classify result
  alt empty/reasoning-only/planning-only and no side effects
    C-->>MF: FailoverError(format)
    MF->>Next: advance configured fallback
  else abort/block/visible reply/NO_REPLY/tool side effect
    C-->>MF: null
    MF-->>Runner: preserve normal terminal behavior
  end
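The sequence above can be approximated by a loop that asks a classifier whether each terminal result should advance the chain. This is a simplified sketch, not the actual `runWithModelFallback` code: every name ending in `Sketch` is hypothetical, `toolCalls` stands in for `meta.toolSummary`, and attempts are synchronous here only to keep the example short.

```typescript
// Hypothetical, simplified shapes; the real result and error types are richer.
type TerminalKindSketch =
  | "visible" | "empty" | "reasoning-only" | "planning-only"
  | "no-reply" | "aborted" | "blocked";

interface AttemptOutcomeSketch {
  kind: TerminalKindSketch;
  toolCalls: number; // stand-in for meta.toolSummary
}

// Stand-in for the FailoverError(format) shown in the diagram.
class FailoverErrorSketch extends Error {}

// Returns an error when the outcome should advance the fallback chain, else null.
function classifyOutcomeSketch(outcome: AttemptOutcomeSketch): FailoverErrorSketch | null {
  const fallbackEligible =
    outcome.kind === "empty" ||
    outcome.kind === "reasoning-only" ||
    outcome.kind === "planning-only";
  // A tool that already ran must not be replayed on another model.
  if (!fallbackEligible || outcome.toolCalls > 0) return null;
  return new FailoverErrorSketch(`unusable terminal result: ${outcome.kind}`);
}

// Synchronous for brevity; real attempts are asynchronous.
function runWithFallbackSketch(
  candidates: string[],
  attempt: (model: string) => AttemptOutcomeSketch,
): { model: string; outcome: AttemptOutcomeSketch } {
  let lastError: Error = new Error("no fallback candidates configured");
  for (const model of candidates) {
    const outcome = attempt(model);
    const failover = classifyOutcomeSketch(outcome);
    if (failover === null) return { model, outcome };
    lastError = failover;
  }
  // Every candidate produced a fallback-eligible result.
  throw lastError;
}
```

Returning `null` for aborts, blocks, NO_REPLY, and side-effectful turns keeps their normal terminal behavior, which is the `else` branch in the diagram.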

Channel, Session, And Auth Delivery Flow

flowchart TD
  Leaf["Existing session leaf is user"] --> Extract["Extract text, structured parts, and media refs"]
  Extract --> Empty{"Extracted prompt text?"}
  Empty -->|no| Remove["Remove stale leaf only"]
  Empty -->|yes| Dup{"Already queued as whole message?"}
  Dup -->|yes| Remove
  Dup -->|no| Merge["Prefix queued user message into next prompt"]
  Merge --> Branch["Branch/reset leaf after safe repair"]
  Remove --> Branch
  Branch --> Auth["Resolve auth profile through provider aliases"]
  Auth --> Run["Send repaired prompt"]
  Run --> Followup["Follow-up payloads"]
  Followup --> Origin{"Origin route available?"}
  Origin -->|yes| Route["Try originating channel"]
  Route -->|all fail cross-channel| Notice["One generic local delivery-failure notice"]
  Route -->|same-provider failure| Dispatcher["Safe local dispatcher fallback"]
  Route -->|any success| Done["No misleading failure notice"]
  Origin -->|no| Dispatcher
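The "Already queued as whole message?" decision is where the earlier substring false-match (ok inside token) occurred. A minimal line-aware check might look like the following; `containsWholeMessageSketch` is a hypothetical helper, and the marker handling mentioned in the PR is not modeled here.

```typescript
// Returns true only when `candidate` appears as complete, consecutive lines
// of `queuedPrompt`, never as a substring inside a longer line.
function containsWholeMessageSketch(queuedPrompt: string, candidate: string): boolean {
  const toLines = (text: string): string[] =>
    text.trim().split(/\r?\n/).map((line) => line.trim());
  const needle = toLines(candidate);
  const haystack = toLines(queuedPrompt);
  // An empty candidate never counts as a duplicate.
  if (needle.length === 1 && needle[0] === "") return false;
  for (let start = 0; start + needle.length <= haystack.length; start++) {
    if (needle.every((line, offset) => haystack[start + offset] === line)) return true;
  }
  return false;
}

// A naive substring test would treat "ok" as already queued here.
console.log("refresh the token".includes("ok")); // true
console.log(containsWholeMessageSketch("refresh the token", "ok")); // false
console.log(containsWholeMessageSketch("ok\nrefresh the token", "ok")); // true
```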

Safety Boundaries

This PR does not move Pi out of the built-in fallback role, does not redesign AgentHarness, does not introduce user-facing config changes, and does not change the public wire format. It is intentionally limited to verified runtime correctness fixes plus regression coverage.

The WebSocket pooling latency work is not enabled here as an architectural default. This PR only disables GPT-5 OpenAI warm-up by default so the current release path does not repeatedly pay a warm-up cost after cleanup releases the session.
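As a rough illustration of "default off, explicit config may opt in", the sketch below merges assumed defaults with explicit settings so that explicit values always win. The `wsWarmup` key, the API family ids other than `openai-codex-responses`, and the default value `true` for `parallel_tool_calls` are all assumptions made for this example; the PR description does not state them.

```typescript
// Hypothetical option and API names; only `parallel_tool_calls` and
// `openai-codex-responses` appear verbatim in the PR description.
type ApiFamilySketch =
  | "openai-responses"
  | "openai-completions"
  | "openai-codex-responses"
  | "other";

interface ExtraParamsSketch {
  parallel_tool_calls?: boolean;
  wsWarmup?: boolean; // assumed key for the WS warm-up opt-in
}

const GPT5_PARALLEL_APIS = new Set<ApiFamilySketch>([
  "openai-responses",
  "openai-completions",
  "openai-codex-responses",
]);

function resolveGpt5ExtraParamsSketch(
  api: ApiFamilySketch,
  explicit: ExtraParamsSketch,
): ExtraParamsSketch {
  // Warm-up is off by default because cleanup releases the session each turn.
  const defaults: ExtraParamsSketch = { wsWarmup: false };
  if (GPT5_PARALLEL_APIS.has(api)) {
    defaults.parallel_tool_calls = true; // assumed default value
  }
  // Explicit settings are spread last so they override the defaults.
  return { ...defaults, ...explicit };
}

console.log(resolveGpt5ExtraParamsSketch("openai-codex-responses", {}));
console.log(resolveGpt5ExtraParamsSketch("openai-responses", { wsWarmup: true }));
```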

Related Work And Issue Map

This PR intentionally does not use Closes: for broad GPT-5.4/Codex tickets unless the exact reported scenario is covered. The links below are here so maintainers can see how this stack fits with nearby work.

| Link | Relationship |
| --- | --- |
| #41282 | Historical openai-codex/GPT-5.4 timeout/stall report. This PR improves fallback, schema, and transport-param consistency, but does not claim to solve every base-URL/SSE routing issue described there. |
| #64251 | CLI-backed codex-cli/gpt-5.4 follow-up instability. This PR helps by normalizing auth aliases and preventing successful follow-up payload drops. |
| #51063 / #65152 | OpenAI-Codex tool execution/tool-definition symptoms. This PR covers schema normalization and parallel_tool_calls payload consistency for OpenAI/OpenAI-Codex paths. |
| #65844 / #57286 / #63856 | OpenAI-Codex auth profile/order drift. This PR covers alias-aware lock preservation, explicit empty auth-order overrides, and fallback from absent alias-order keys to canonical/legacy auth order entries. |
| #59928 / #65234 / #54698 | Fallback-chain/session-model issues. This PR is narrower: it classifies GPT-5.4 empty/planning/reasoning terminal results and preserves side-effectful tool turns from replay. |
| #45761 / #60830 / #59680 | Prior fallback classifier hardening. This PR builds on that line by adding GPT-5.4 embedded terminal classification and side-effect guards. |
| #52903 / #63608 | Prior retry/session transcript integrity work. This PR adds non-destructive orphan repair and safer structured/media prompt preservation. |
| #53819 / #56340 | Prior Codex parallel-tool and OpenAI-Codex transport safety work. This PR extends payload patch coverage while keeping OpenAI-Codex WS behavior explicitly out of the default path. |
| #70904 / #70911 / #63369 | Adjacent reasoning-effort injection issue. Not fixed here; #70911 is the focused PR for missing body.reasoning when OpenAI/Codex Responses payloads start with reasoning: undefined. |
| #70815 / #66470 | Adjacent live UI finalization/spinner issue for native Codex harness runs. Not fixed here; this PR focuses on backend delivery/fallback semantics. |
| #69453 / #55461 / #42225 | Adjacent GPT-5.4 context-window/catalog mismatch issues. Not fixed here. |
| #56487 / #50647 / #57917 | Adjacent UI/model-picker provider-prefix issues. Not fixed here. |

Live Search Additions (2026-04-24)

I re-ran live GitHub search across GPT-5.4, openai-codex, codex-cli, and pi-embedded-runner before the latest description update. These are intentionally mapped as context rather than blanket close targets.

| Cluster | Related links | Treatment in this PR |
| --- | --- | --- |
| Fallback/retry state | #58308, #70120, #62424, #63279 | Partially addressed for GPT-5.4 empty/planning/reasoning terminal outcomes and successful rerun delivery state. Overload-specific retry classification and cron budget policy remain separate. |
| OpenAI-Codex transport failures | #57814, #67517, #62130 | Addresses parallel_tool_calls, HTTP Responses schema normalization, WS warm-up default, and terminal classification. Does not claim to fix Cloudflare/base-url/network failures. |
| Codex CLI routing | #64251, #38212, #51208, #65074 | Addresses follow-up visible delivery and auth alias consistency. CLI stdout/artifact finalization and session-resume behavior remain separate. |
| Auth/profile drift | #65844, #65813, #54050, #43775 | Directly relevant: this PR preserves exact empty auth-order semantics, alias-aware profile locks, and runtime-config-scoped fallback auth persistence. |
| Embedded runner integrity | #64570, #64888, #67878, #68329 | Addresses GPT-5.4 thinking/reasoning-only fallback classification and orphan repair. Broader cancellation/liveness and CLI compaction remain separate. |
| Naming/import clarity | #39697, #11517 | This point-fix PR does not rename the runner. #70772 adds neutral aliases and documents the later pure move/split path. |

Latest Validation

Post-rebase verification on the final branch:

  • Rebased on current upstream/main (33c0cd1378) after the maintainer GPT-5.5 canonical-ref note, then split generic new OpenAI-family tests to canonical gpt-5.5 while leaving gpt-5.4/codex-cli refs only as explicit regression or legacy-compat coverage.
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.auto-reply.config.ts src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/followup-runner.test.ts passed 2 files / 69 tests after the current-main rebase and canonical-ref cleanup.
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/openai-transport-stream.test.ts src/agents/pi-embedded-runner-extraparams.test.ts src/agents/model-fallback.test.ts src/agents/command/attempt-execution.cli.test.ts src/agents/agent-command.live-model-switch.test.ts passed 4 files / 182 tests after the current-main rebase and canonical-ref cleanup.
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.plugins.config.ts src/plugins/provider-runtime.test.ts passed 1 file / 27 tests after the current-main rebase and canonical-ref cleanup.
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.auto-reply.config.ts src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/followup-runner.test.ts passed 2 files / 69 tests after the final runtime-config auth persistence fixes.
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/command/attempt-execution.cli.test.ts src/agents/pi-embedded-runner-extraparams.test.ts src/agents/pi-embedded-runner-extraparams-resolve.test.ts src/agents/model-fallback.test.ts src/agents/auth-profiles/order.test.ts src/agents/auth-profiles.resolve-auth-profile-order.uses-stored-profiles-no-config-exists.test.ts src/agents/auth-profiles/session-override.test.ts src/agents/provider-auth-aliases.test.ts src/agents/agent-command.live-model-switch.test.ts passed 7 files / 192 tests.
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.auto-reply.config.ts src/auto-reply/reply/followup-runner.test.ts passed 1 file / 23 tests.
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.e2e.config.ts src/agents/model-fallback.run-embedded.e2e.test.ts passed 1 file / 17 tests.

Earlier focused/broad local verification on this PR also covered:

  • pnpm lint
  • pnpm tsgo:core:test
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.full-core-support-boundary.config.ts test/scripts/lint-suppressions.test.ts
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.auto-reply.config.ts src/auto-reply/reply/agent-runner-execution.test.ts src/auto-reply/reply/followup-runner.test.ts
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/model-fallback.test.ts src/agents/pi-embedded-runner/run/attempt.test.ts src/agents/pi-embedded-runner-extraparams.test.ts src/agents/openai-transport-stream.test.ts src/agents/auth-profiles/session-override.test.ts src/agents/auth-profiles/order.test.ts src/agents/command/attempt-execution.cli.test.ts src/agents/provider-auth-aliases.test.ts src/agents/tools/message-tool.test.ts src/agents/agent-command.live-model-switch.test.ts src/plugins/provider-runtime.test.ts
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.channels.config.ts src/channels/plugins/message-actions.test.ts
  • OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=0 node scripts/run-vitest.mjs run --config test/vitest/vitest.extension-messaging.config.ts
  • pnpm exec oxfmt --check <changed files>
  • git diff --check

Review State

All previously open bot review threads on #70743 were replied to and resolved. The final review-fix commits after the latest rebase are:

  • 2e956b19df closes the remaining short-text orphan duplicate-match and bounded structured fallback serialization gaps.
  • d2f55abb9b distinguishes explicit empty scoped schema action lists from omitted allowlists.
  • 961567766a preserves aliased embedded auth locks.
  • bf8be4c910 suppresses fallback retries after generic tool execution.
  • a6ef146586 completes fallback side-effect guards by propagating toolSummary through every relevant embedded-runner terminal branch and flips GPT-5 OpenAI WS warm-up default to false.
  • 35f7c348e9 updates the rebased CLI attempt-execution test mock for upstream's provider auth alias-map export.
  • 10b74a4459 addresses fresh bot review by keeping stripped NO_REPLY terminal turns out of fallback and preserving explicit empty auth-order overrides, including exact alias keys such as codex-cli: [].
  • f73022e4f4 addresses fresh follow-up routing review by emitting a visible partial-delivery notice when any cross-channel payload fails, even if another payload in the same completion routes successfully.
  • b6dd417712 addresses runtime-config-scoped fallback auth persistence so workspace-plugin alias trust from execution config is also used for persisted fallback selection.
  • 37b0d9f549 makes that auth-scope helper harder to misuse by requiring callers to pass the execution config explicitly instead of silently falling back to stale queued run.config.
  • bb99fb6d1a responds to the maintainer GPT-5.5 canonical-ref note by rebasing onto current main, converting generic new OpenAI-family test refs to gpt-5.5, and documenting remaining gpt-5.4/codex-cli refs as intentional regression or legacy-compat coverage.
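The follow-up routing fix in f73022e4f4 decides on a notice only after every payload's route attempt is known. A minimal sketch of that aggregation, with hypothetical names (`RouteAttemptSketch`, `decideDeliveryNoticeSketch`) and an assumed three-way outcome that the real follow-up runner may express differently:

```typescript
interface RouteAttemptSketch {
  payloadIndex: number;
  delivered: boolean;
}

type DeliveryNoticeSketch = "none" | "partial-delivery" | "delivery-failed";

// Called once, after all payload route attempts have finished, so that at
// most one notice is emitted per completion.
function decideDeliveryNoticeSketch(attempts: RouteAttemptSketch[]): DeliveryNoticeSketch {
  const failures = attempts.filter((attempt) => !attempt.delivered).length;
  if (failures === 0) return "none";
  // A single failed payload is still surfaced, even if others were delivered.
  return failures === attempts.length ? "delivery-failed" : "partial-delivery";
}

console.log(
  decideDeliveryNoticeSketch([
    { payloadIndex: 0, delivered: true },
    { payloadIndex: 1, delivered: false },
  ]),
); // "partial-delivery"
```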

Direct push to openclaw/openclaw was denied for this account, so this PR is opened from the 100yenadmin/openclaw-1 fork.

openclaw-barnacle bot added the docs, channel: discord, channel: telegram, channel: voice-call, gateway, cli, scripts, commands, docker, agents, extensions: openai, extensions: qa-lab, extensions: codex, and size: XL labels Apr 23, 2026
100yenadmin marked this pull request as ready for review April 23, 2026 19:12
100yenadmin requested a review from a team as a code owner April 23, 2026 19:12
Copilot AI review requested due to automatic review settings April 23, 2026 19:12

greptile-apps Bot commented Apr 23, 2026

Greptile Summary

This PR hardens multiple GPT-5.4 runtime paths: it adds a classifyResult hook to the model-fallback loop so that empty/reasoning-only/plan-only terminal results trigger configured fallbacks; it normalises tool-schema strict-mode handling and adds diagnostic logging across both the Completions and Responses transports; it preserves orphaned user payloads instead of silently removing them; and it fixes auth-profile alias resolution by routing provider comparisons through resolveProviderIdForAuth.

  • The GPT-5 terminal result classifier is implemented twice—once in result-fallback-classifier.ts (imported by followup-runner) and again locally in agent-runner-execution.ts—with divergent isGpt5ModelId regexes and case-sensitive vs. case-insensitive plan-only string matching; this creates a maintenance surface where fixing a detection gap in one path silently leaves the other unchanged.

Confidence Score: 4/5

Safe to merge with minor cleanup; the duplicate classifier logic is a maintainability risk but not an immediate production blocker.

All inline findings are P2, but the duplicated classification code in agent-runner-execution.ts uses a case-sensitive substring match for the plan-only signal while the canonical classifier uses a case-insensitive regex — a real behavioral divergence introduced in this PR that is worth resolving before the code drifts further.

src/auto-reply/reply/agent-runner-execution.ts — duplicate classifier with inconsistent regex; src/agents/pi-embedded-runner/result-fallback-classifier.ts — the intended canonical home for this logic

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner-execution.ts
Line: 122-125

Comment:
**Local `isGpt5ModelId` diverges from the canonical helper**

A local copy of `isGpt5ModelId` is defined with the pattern `^gpt-5(?:[.-]|$)`, while the canonical version imported by `result-fallback-classifier.ts` uses `(?:^|[/:])gpt-5(?:[.-]|$)`. The canonical regex also matches IDs that include a provider prefix (`openai/gpt-5.4`), which the local copy would miss. Since `result-fallback-classifier.ts` was introduced in this same PR as the shared classification helper, consider importing `isGpt5ModelId` from `gpt5-prompt-overlay.ts` here too instead of maintaining a second copy.

How can I resolve this? If you propose a fix, please make it concise.
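To make the divergence concrete, the two patterns quoted in this comment can be compared directly. The patterns are copied from the comment; the absence of regex flags, the wrapper function names, and the sample model ids are assumptions for illustration.

```typescript
// Patterns as quoted in the review comment; flags are assumed to be absent.
const localGpt5Pattern = /^gpt-5(?:[.-]|$)/;
const canonicalGpt5Pattern = /(?:^|[\/:])gpt-5(?:[.-]|$)/;

function matchesLocalPattern(modelId: string): boolean {
  return localGpt5Pattern.test(modelId);
}

function matchesCanonicalPattern(modelId: string): boolean {
  return canonicalGpt5Pattern.test(modelId);
}

// Bare ids match both patterns.
console.log(matchesLocalPattern("gpt-5.4"), matchesCanonicalPattern("gpt-5.4")); // true true
// Provider-prefixed ids only match the canonical pattern.
console.log(matchesLocalPattern("openai/gpt-5.4"), matchesCanonicalPattern("openai/gpt-5.4")); // false true
// Neither pattern treats "gpt-50" as a GPT-5 id.
console.log(matchesLocalPattern("gpt-50"), matchesCanonicalPattern("gpt-50")); // false false
```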

---

This is a comment left during a code review.
Path: src/auto-reply/reply/agent-runner-execution.ts
Line: 128-194

Comment:
**Duplicate GPT-5 terminal classification logic with subtle behavioral differences**

`classifyEmbeddedTerminalResultForFallback` here reimplements the same classification that `classifyEmbeddedPiRunResultForModelFallback` in `result-fallback-classifier.ts` provides, but with subtle divergences:

- Plan-only detection: this file uses `errorText.includes("Agent stopped after repeated plan-only turns")` (case-sensitive substring), while `result-fallback-classifier.ts` uses `/Agent stopped after repeated plan-only turns/i` (case-insensitive regex).
- Side-effect guards: this file additionally checks `directlySentBlockKeys.size > 0` and inspects `isSilentReplyText`/`isSilentReplyPrefixText`, which are absent from the shared classifier.

The mismatch in string matching means a lowercase variant of the plan-only message would trigger a fallback in `followup-runner` but not in `agent-runner-execution`. Consider unifying these by extending `classifyEmbeddedPiRunResultForModelFallback` to accept an optional `directlySentBlockKeys` context, or extracting the shared core into a common function both can delegate to.

How can I resolve this? If you propose a fix, please make it concise.
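The matching difference described in this comment is easy to reproduce in isolation. The two checks below are copied from the comment; the wrapper function names and the lowercase sample text are hypothetical.

```typescript
const PLAN_ONLY_MESSAGE = "Agent stopped after repeated plan-only turns";

// Check as described for agent-runner-execution.ts: case-sensitive substring.
function matchesPlanOnlyBySubstring(errorText: string): boolean {
  return errorText.includes(PLAN_ONLY_MESSAGE);
}

// Check as described for result-fallback-classifier.ts: case-insensitive regex.
function matchesPlanOnlyByRegex(errorText: string): boolean {
  return /Agent stopped after repeated plan-only turns/i.test(errorText);
}

const lowercased = "agent stopped after repeated plan-only turns";
console.log(matchesPlanOnlyBySubstring(lowercased)); // false
console.log(matchesPlanOnlyByRegex(lowercased)); // true
```

With the exact-case message both checks agree, so the divergence only shows up if the message casing ever changes upstream.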

Reviews (1): Last reviewed commit: "fix: harden gpt-5.4 runtime paths" | Re-trigger Greptile

Copilot AI left a comment

Pull request overview

Hardens several GPT‑5.4/Codex runtime paths across embedded Pi, CLI harnesses, provider transports, session delivery, and status/reporting so terminal “successful” but unusable results can advance through configured fallbacks and runtime surfaces stay consistent.

Changes:

  • Add result classification hooks to model fallback (and GPT‑5 terminal-result detection) so empty/reasoning-only/plan-only embedded runs can trigger fallbacks.
  • Normalize OpenAI/Codex transport/tool-schema behavior and improve status output (runner label + fast-mode label + thinking-default alignment).
  • Improve operational robustness across sessions, plugin install allowlists, channel catalogs, and Azure OpenAI image-generation routing.

Reviewed changes

Copilot reviewed 100 out of 100 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| test/official-channel-catalog.test.ts | Makes catalog tests tolerant to additional official entries. |
| src/status/status-text.ts | Uses explicit thinking default and aligns resolved think fallback order. |
| src/status/status-message.ts | Adds Runner: line and simplifies fast-mode label to Fast. |
| src/status/status-message.test.ts | Adds unit tests for fast-mode label behavior. |
| src/plugins/provider-runtime.ts | Passes provider id into GPT‑5 prompt overlay resolution. |
| src/plugins/provider-runtime.test.ts | Tests OpenAI personality fallback scoping for GPT‑5 providers. |
| src/plugins/provider-install-catalog.ts | Relaxes trusted npmSpec exposure rules for config/bundled origins. |
| src/plugins/provider-install-catalog.test.ts | Adds coverage for unpinned trusted npm specs. |
| src/plugins/contracts/boundary-invariants.test.ts | Adds invariants to keep live provider config lookups on runtime config. |
| src/plugin-sdk/runtime-config-snapshot.ts | Exposes clear/set runtime config snapshot APIs via SDK. |
| src/infra/package-dist-inventory.ts | Centralizes legacy QA channel runtime-api path constant. |
| src/infra/npm-update-compat-sidecars.ts | Exports legacy QA channel runtime-api sidecar path and reuses it. |
| src/gateway/model-pricing-cache.ts | Extends pricing fetch timeout and defers bootstrap refresh via microtask. |
| src/gateway/model-pricing-cache.test.ts | Tests deferred bootstrap refresh and updated timeout logging text. |
| src/gateway/gateway-codex-harness.live.test.ts | Tweaks live probe instructions for shell-tool execution. |
| src/gateway/gateway-codex-harness.live-helpers.ts | Recognizes missing codex PATH fallback text. |
| src/gateway/gateway-codex-harness.live-helpers.test.ts | Adds tests for missing codex PATH fallback acceptance. |
| src/gateway/gateway-cli-backend.live.test.ts | Adds request/agent timeouts and CI-safe Codex probe skip gate. |
| src/gateway/gateway-acp-bind.live.test.ts | Refines retry/timeout handling for ACP bind live assertions. |
| src/config/sessions/store.ts | Preserves active session key during maintenance; route updates preserve activity timestamps. |
| src/config/sessions/store-maintenance.ts | Adds preserveKeys support to prune/cap maintenance operations. |
| src/config/sessions.test.ts | Updates expectations and adds regression test for updateLastRoute not bumping updatedAt. |
| src/config/bundled-channel-config-metadata.generated.ts | Updates generated channel schema metadata (toolProgress + execApprovals + policies). |
| src/commands/onboarding-plugin-install.ts | Allows onboarding npm installs for trusted registry specs without strict pins. |
| src/commands/onboarding-plugin-install.test.ts | Updates onboarding tests for relaxed npm spec requirements. |
| src/cli/plugins-install-persist.ts | Adds installed plugin id to existing plugins.allow list before enabling. |
| src/cli/plugins-install-persist.test.ts | Tests allowlist augmentation behavior during install persistence. |
| src/channels/plugins/message-action-discovery.ts | Filters cross-channel advertised actions when schema visibility is current-channel only. |
| src/channels/plugins/contracts/channel-catalog.contract.test.ts | Adds contract coverage for WeCom channel catalog entry. |
| src/channels/plugins/catalog.ts | Loads built-in official external channel catalog JSON plus file-based official catalog. |
| src/auto-reply/thinking.ts | Raises implicit reasoning defaults to “medium” and remaps to supported level. |
| src/auto-reply/thinking.test.ts | Updates tests for new implicit default + remapping behavior. |
| src/auto-reply/status.test.ts | Extends /status tests for runner label + fast-mode visibility rules. |
| src/auto-reply/reply/model-selection.ts | Hydrates runtime catalog metadata for thinking when allowlist entries omit reasoning. |
| src/auto-reply/reply/model-selection.test.ts | Adds tests for implicit thinking default + runtime hydration behavior. |
| src/auto-reply/reply/followup-runner.ts | Classifies embedded results for fallback and avoids silently dropping followups. |
| src/auto-reply/reply/followup-runner.test.ts | Updates tests for dispatcher fallback behavior when origin routing fails/is incomplete. |
| src/auto-reply/reply/commands-types.ts | Adds resolvedFastMode to command handler params. |
| src/auto-reply/reply/commands-info.ts | Forwards resolved fast mode into /status reply build. |
| src/auto-reply/reply/commands-info.test.ts | Tests forwarding of resolved fast mode into /status. |
| src/auto-reply/reply/agent-runner-execution.ts | Adds GPT‑5 embedded terminal-result fallback classification + auth-profile alias normalization. |
| src/auto-reply/reply/agent-runner-execution.test.ts | Adds tests for plan-only/silent/streamed-block classification behavior. |
| src/auto-reply/reply/agent-runner-auth-profile.ts | Normalizes provider ids for auth-profile scoping across aliases. |
| src/agents/tools/session-status-tool.ts | Aligns status tool thinking-default resolution with configured/runtime catalogs. |
| src/agents/tools/message-tool.ts | Uses cross-channel schema-supported action listing to avoid hidden params. |
| src/agents/tools/message-tool.test.ts | Tests that cross-channel actions with current-channel-only schema are not advertised. |
| src/agents/provider-auth-aliases.ts | Dedupes/centralizes alias selection and treats deprecated auth choice ids as aliases. |
| src/agents/provider-auth-aliases.test.ts | Adds tests for deprecated auth choice ids mapping to provider auth key. |
| src/agents/pi-embedded-runner/run/attempt.ts | Preserves orphaned user payloads and conditionally removes leaf based on preservation. |
| src/agents/pi-embedded-runner/run/attempt.test.ts | Adds coverage for structured orphan preservation and removeLeaf behavior. |
| src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts | Extracts structured content into prompt text; returns removeLeaf decision. |
| src/agents/pi-embedded-runner/result-fallback-classifier.ts | New classifier for embedded Pi GPT‑5 terminal results. |
| src/agents/pi-embedded-runner/extra-params.ts | Extends parallel_tool_calls injection to openai-codex-responses API. |
| src/agents/pi-embedded-runner-extraparams.test.ts | Adds tests for Codex Responses parallel tool call injection. |
| src/agents/openclaw-tools.session-status.test.ts | Adds session-status tool tests for implicit fallback keys and thinking defaults. |
| src/agents/openai-transport-stream.ts | Normalizes tool parameters even when strict omitted; adds diagnostics when strict downgraded. |
| src/agents/openai-transport-stream.test.ts | Adds tests for responses tool param normalization and strict downgrade behavior. |
| src/agents/openai-tool-schema.ts | Adds strict schema diagnostics helpers and violation discovery. |
| src/agents/model-thinking-default.ts | Switches thinking-default import to runtime thinking.ts. |
| src/agents/model-selection.test.ts | Updates tests to reflect default thinking level changes. |
| src/agents/model-fallback.ts | Adds optional result-classification hook to drive fallback progression on terminal “ok” results. |
| src/agents/model-fallback.test.ts | Adds tests for fallback progression via result classification. |
| src/agents/gpt5-prompt-overlay.ts | Scopes OpenAI plugin personality fallback to OpenAI-family GPT‑5 providers. |
| src/agents/command/attempt-execution.ts | Normalizes provider ids for auth-profile override compatibility across aliases. |
| src/agents/command/attempt-execution.cli.test.ts | Adds Codex CLI alias coverage for auth-profile propagation. |
| src/agents/auth-profiles/session-override.ts | Matches session auth profiles to providers using auth-alias normalization. |
| src/agents/auth-profiles/session-override.test.ts | Tests session override preservation under CLI provider aliasing. |
| scripts/write-official-channel-catalog.mjs | Seeds generated catalog with built-in external entries + additional install metadata. |
| scripts/test-live-cli-backend-docker.sh | Defaults CI-safe Codex config for codex-cli live tests and prints it. |
| scripts/test-built-bundled-channel-entry-smoke.mjs | Improves error reporting when bundled entries fail to import. |
| scripts/lib/official-external-channel-catalog.json | Adds built-in external catalog entry for WeCom plugin. |
| scripts/lib/docker-e2e-logs.sh | Normalizes TMPDIR handling and mktemp template usage. |
| scripts/e2e/plugins-docker.sh | Adjusts mktemp template usage for run logs. |
| scripts/e2e/parallels-macos-smoke.sh | Improves Discord smoke visibility checks and config key naming. |
| scripts/e2e/npm-onboard-channel-agent-docker.sh | Adjusts mktemp template usage for run logs. |
| extensions/voice-call/src/manager/store.ts | Tracks pending append writes and adds test flush helper. |
| extensions/voice-call/src/manager/events.test.ts | Awaits pending persist writes before removing test store directory. |
| extensions/telegram/src/bot/helpers.ts | Adds bounded TTL cache for forum flag lookups. |
| extensions/telegram/src/bot/helpers.test.ts | Adds cache reset and cache behavior tests for forum flag resolution. |
| extensions/telegram/src/bot.create-telegram-bot.test.ts | Resets forum flag cache before bot tests. |
| extensions/telegram/src/bot-native-commands.test.ts | Resets forum flag cache before native command tests. |
| extensions/qa-lab/src/live-transports/telegram/telegram-live.runtime.ts | Records batch observation time for live Telegram probe RTT reporting. |
| extensions/openai/image-generation-provider.ts | Adds Azure OpenAI image endpoint detection + auth/header/url shaping. |
| extensions/openai/image-generation-provider.test.ts | Adds Azure image-generation routing tests (hosts + api-version override). |
| extensions/discord/subagent-hooks-api.ts | Removes re-exports from subagent hooks API surface. |
| extensions/discord/src/monitor/thread-bindings.lifecycle.test.ts | Switches to runtime-config snapshot SDK entrypoint. |
| extensions/discord/src/monitor/message-handler.ts | Updates SDK import for runtime group policy resolution. |
| extensions/discord/src/monitor/message-handler.process.ts | Splits SDK imports across new runtime entrypoints. |
| extensions/discord/src/monitor/message-handler.process.test.ts | Updates mocks to new session-store runtime SDK module. |
extensions/codex/provider.ts Uses runtime snapshot plugin config for discovery toggles.
extensions/codex/provider.test.ts Tests live config re-enabling discovery after startup disable.
docs/tools/thinking.md Documents /status fast-mode label behavior.
docs/tools/plugin.md Documents allowlist augmentation on plugin install.
docs/tools/image-generation.md Links to Azure OpenAI endpoint docs for image generation.
docs/providers/openai.md Adds Azure OpenAI image endpoint configuration documentation.
docs/plugins/sdk-setup.md Updates onboarding guidance for registry npm specs and optional integrity pins.
docs/plugins/manifest.md Updates manifest docs for unpinned npm specs + optional integrity.
docs/.generated/config-baseline.sha256 Updates config baseline hashes.
CHANGELOG.md Documents new runner label, thinking/fast-mode/status changes, and Azure image support.
.agents/skills/openclaw-release-maintainer/SKILL.md Updates release guidance around beta tag handling and release notes completeness.
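
To illustrate the `src/agents/model-fallback.ts` row above (an optional result-classification hook that advances fallback even when an attempt ends in a terminal "ok" result), here is a minimal synchronous sketch. Every name in it (`runWithFallbackSketch`, `classifyResult`, `Verdict`, the candidate labels) is hypothetical and is not taken from the PR. The real runner is presumably asynchronous and carries more state. Returning the last rejected result when nothing better exists is also an assumption of this sketch.

```typescript
// Hypothetical sketch; these names do not come from src/agents/model-fallback.ts.
type Candidate = { provider: string; model: string };
type Verdict = "accept" | "fallback";

interface FallbackOptions<T> {
  candidates: Candidate[];
  run: (candidate: Candidate) => T;
  // Optional hook: inspects a result that did not throw and may reject it.
  classifyResult?: (result: T, candidate: Candidate) => Verdict;
}

interface FallbackOutcome<T> {
  result: T;
  candidate: Candidate;
  attempts: number;
}

function runWithFallbackSketch<T>(options: FallbackOptions<T>): FallbackOutcome<T> {
  const { candidates, run, classifyResult } = options;
  if (candidates.length === 0) {
    throw new Error("no candidates configured");
  }
  let attempts = 0;
  let lastError: unknown = undefined;
  let lastRejected: FallbackOutcome<T> | undefined = undefined;

  for (const candidate of candidates) {
    attempts += 1;
    let attempt: { ok: true; value: T } | { ok: false; error: unknown };
    try {
      attempt = { ok: true, value: run(candidate) };
    } catch (error) {
      attempt = { ok: false, error };
    }
    if (!attempt.ok) {
      // A thrown error always moves on to the next candidate.
      lastError = attempt.error;
      continue;
    }
    const outcome: FallbackOutcome<T> = { result: attempt.value, candidate, attempts };
    const verdict: Verdict = classifyResult ? classifyResult(attempt.value, candidate) : "accept";
    if (verdict === "accept") {
      return outcome;
    }
    // The attempt finished without throwing, but the hook rejected its result.
    lastRejected = outcome;
  }

  // Assumption of this sketch: a rejected result is still better than throwing.
  if (lastRejected !== undefined) {
    return lastRejected;
  }
  throw lastError;
}

// Usage: an empty reply counts as "ok" for the transport but is rejected by the hook.
const demo = runWithFallbackSketch<string>({
  candidates: [
    { provider: "primary", model: "model-a" },
    { provider: "secondary", model: "model-b" },
  ],
  run: (candidate) => (candidate.provider === "primary" ? "" : "hello"),
  classifyResult: (text) => (text.trim() === "" ? "fallback" : "accept"),
});
console.log(demo.candidate.provider, demo.attempts);
```

Without the hook, the first candidate's empty reply would be returned as a success, which is the kind of silent drop the hook is meant to surface.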

Comment thread src/channels/plugins/catalog.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f184258184


Comment thread src/auto-reply/reply/agent-runner-execution.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/attempt.prompt-helpers.ts Outdated
@100yenadmin 100yenadmin force-pushed the bugfix/gpt54-runtime-audit-v2026-4-22-v2 branch from f184258 to 3d385a3 Compare April 23, 2026 19:24
@openclaw-barnacle openclaw-barnacle Bot removed the following labels on Apr 23, 2026: docs (Improvements or additions to documentation), channel: discord (Channel integration: discord), channel: telegram (Channel integration: telegram), channel: voice-call (Channel integration: voice-call), gateway (Gateway runtime), cli (CLI command changes), scripts (Repository scripts)

Labels

agents (Agent runtime and tooling), channel: matrix (Channel integration: matrix), channel: msteams (Channel integration: msteams), extensions: codex, extensions: openai, size: XL

3 participants