Fix live model inference edge cases by steipete · Pull Request #88946 · openclaw/openclaw

steipete · 2026-06-01T06:02:02Z

Summary

let CLI providers return the configured silent no-reply payload on clean empty output when the caller already allows silent replies, avoiding wrong-model fallback reruns
send explicit typed Responses input message items with input_text content so Azure AI Foundry project endpoints accept shared openai-responses payloads
reject HTML/non-JSON custom OpenAI-compatible verification responses and fail streamed 200 HTML runtime responses with a baseUrl /v1 hint
fix ACP/ACPX startup model propagation, including configured agent primary models and sessionOptions handoff
fix explicit model alias resolution before capability/thinking validation
keep thinking fallback downgrades turn-local so stored explicit session thinkingLevel overrides survive replies/runs
fix Codex nano/API-key runs by avoiding unsupported tool_search paths, including Codex v1 multi-agent deferral
fix Google simple-completion thinking payloads and Gemma 4 shorthand normalization
avoid showing configured fallback chains as active fallbacks when a session-selected model is pinned
clear stale pending live-model-switch state when sessions.patch resets the model pin
fix macOS Voice Wake/PTT/Talk Mode sends to inherit session thinking unless the UI selects an override
support custom OpenAI Responses-compatible onboarding for endpoints that expose /responses but not /chat/completions
align OpenAI Codex OAuth model selection so legacy codex/ primaries move to the selected canonical openai/ allowlist entry
fail closed when Codex native tool calls finish without matching tool results so trajectories keep durable failure proof
route baseUrl-only google-vertex providers through native Vertex streamGenerateContent while preserving explicit OpenAI-compatible Vertex endpoints
recover complete DeepSeek DSML tool-call text on OpenAI-compatible completions streams, including split chunks, without executing malformed DSML
backfill Azure/OpenAI Responses completed output items when providers send the final response without per-item stream events
preserve unescaped Windows path segments in streamed tool-call JSON arguments instead of decoding control characters
recover cron add/update tool-call parameters when local model parsers merge adjacent JSON property names
keep loopback MCP native-tool dedup exclusions out of inherited tool deny policy for claude-cli sessions
strip inbound metadata/delivery scaffolding from outbound message.send text and suppress metadata-only sends before channel dispatch
update command tests for the current model-selection resolver seam
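
As a rough illustration of the HTML/non-JSON rejection item above, the check could look something like the following. This is a minimal sketch, not the PR's code: `classifyVerificationBody`, its result shape, and the hint wording are hypothetical names invented for illustration.

```typescript
// Hypothetical helper: classify a verification response body from a custom
// OpenAI-compatible endpoint. All names and shapes here are assumptions.
type VerificationResult =
  | { ok: true; json: unknown }
  | { ok: false; reason: "html" | "non-json"; hint: string };

function classifyVerificationBody(
  contentType: string | null,
  body: string,
): VerificationResult {
  const trimmed = body.trimStart().toLowerCase();
  // Treat the body as HTML if either the header or the leading markup says so.
  const looksHtml =
    (contentType ?? "").toLowerCase().includes("text/html") ||
    trimmed.startsWith("<!doctype html") ||
    trimmed.startsWith("<html");
  if (looksHtml) {
    return {
      ok: false,
      reason: "html",
      hint: "Endpoint returned HTML; check whether baseUrl needs a /v1 suffix.",
    };
  }
  try {
    return { ok: true, json: JSON.parse(body) };
  } catch {
    return { ok: false, reason: "non-json", hint: "Endpoint returned a non-JSON body." };
  }
}
```

The point of the sketch is that a 200 status alone is not treated as success: the body has to parse as JSON before verification passes.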

Fixes #85806.
Fixes #83810.
Fixes #74305.
Fixes #87381.
Fixes #87740.
Fixes #84688.
Fixes #63685.
Fixes #88039.
Fixes #83192.
Fixes #87768.
Fixes #44870.
Fixes #88456.
Fixes #86808.
Fixes #84804.
Fixes #84697.
Fixes #84109.
Fixes #89008.
Fixes #85918.
Fixes #88833.
Fixes #88918.
Fixes #88439.
Fixes #89242.
Fixes #89241.

Partially addresses #89100 (FM-3 outbound scaffolding leak; FM-2 group target routing remains open).

Verification

node scripts/run-vitest.mjs src/llm/utils/json-parse.test.ts
node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts -t "Azure Responses completed"
node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts
node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts -t "DeepSeek DSML"
node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts -t "tool calls"
node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/agents/cli-runner.ts src/agents/cli-runner/types.ts src/agents/cli-runner.reliability.test.ts
node scripts/run-vitest.mjs src/agents/cli-runner.before-agent-reply-cron.test.ts src/agents/cli-runner.context-engine.test.ts src/agents/cli-runner.reliability.test.ts src/auto-reply/reply/get-reply-run.media-only.test.ts src/auto-reply/reply/agent-runner.runreplyagent.e2e.test.ts
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/agents/openai-transport-stream.ts src/agents/openai-transport-stream.test.ts
node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts extensions/microsoft-foundry/index.test.ts src/agents/openai-responses-payload-policy.test.ts
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/gateway/sessions-patch.ts src/gateway/sessions-patch.test.ts src/commands/onboard-custom.ts src/commands/onboard-custom.test.ts src/agents/provider-transport-fetch.ts src/agents/provider-transport-fetch.test.ts
node scripts/run-vitest.mjs src/gateway/sessions-patch.test.ts src/agents/live-model-switch.test.ts src/commands/onboard-custom.test.ts src/commands/onboard-custom-config.test.ts src/agents/provider-transport-fetch.test.ts
node scripts/run-vitest.mjs src/agents/agent-command.live-model-switch.test.ts src/agents/acp-spawn.test.ts src/agents/google-simple-completion-stream.test.ts src/agents/simple-completion-transport.test.ts src/auto-reply/reply/get-reply-run.media-only.test.ts src/auto-reply/status.test.ts extensions/acpx/src/runtime.test.ts extensions/google/model-id.test.ts extensions/google/provider-models.test.ts packages/model-catalog-core/src/provider-model-id-normalization.test.ts packages/model-catalog-core/src/provider-model-id-normalize.test.ts extensions/codex/src/app-server/dynamic-tool-build.test.ts extensions/codex/src/app-server/thread-lifecycle.binding.test.ts extensions/codex/src/app-server/thread-lifecycle.test.ts
node scripts/run-vitest.mjs extensions/codex/src/app-server/dynamic-tool-build.test.ts extensions/codex/src/app-server/thread-lifecycle.test.ts extensions/codex/src/app-server/thread-lifecycle.binding.test.ts extensions/codex/src/app-server/run-attempt.test.ts
node scripts/run-vitest.mjs src/gateway/sessions-patch.test.ts src/agents/live-model-switch.test.ts
node scripts/run-vitest.mjs src/commands/agent.test.ts
node scripts/run-vitest.mjs src/commands/onboard-custom-config.test.ts src/commands/onboard-custom.test.ts src/commands/onboard-non-interactive/local/auth-choice.test.ts
node scripts/run-vitest.mjs src/commands/configure.gateway-auth.prompt-auth-config.test.ts src/commands/model-picker.test.ts
node scripts/run-vitest.mjs extensions/codex/src/app-server/event-projector.test.ts extensions/codex/src/app-server/run-attempt.test.ts
node scripts/run-vitest.mjs extensions/google/api.test.ts extensions/google/provider-registration.test.ts extensions/google/index.test.ts src/agents/embedded-agent-runner/model.test.ts
pnpm exec oxfmt --check extensions/codex/src/app-server/event-projector.ts extensions/codex/src/app-server/event-projector.test.ts
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.extensions.json extensions/codex/src/app-server/event-projector.ts extensions/codex/src/app-server/event-projector.test.ts
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.extensions.json extensions/google/api.ts extensions/google/provider-policy.ts extensions/google/provider-registration.ts extensions/google/api.test.ts extensions/google/provider-registration.test.ts extensions/google/index.test.ts
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/agents/embedded-agent-runner/model.provider-runtime.test-support.ts src/agents/embedded-agent-runner/model.test.ts
node scripts/run-tsgo.mjs -p tsconfig.extensions.json --incremental false
node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.extensions.test.json --incremental false
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/commands/configure.gateway-auth.ts src/commands/configure.gateway-auth.prompt-auth-config.test.ts && git diff --check
node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental false
node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental false
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/agents/agent-command.ts src/commands/agent-command.test-mocks.ts src/commands/agent.test.ts
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/cli/program/register.onboard.ts src/commands/onboard-custom-config.ts src/commands/onboard-custom.ts src/commands/onboard-types.ts src/wizard/i18n/locales/en.ts src/wizard/i18n/locales/zh-CN.ts src/wizard/i18n/locales/zh-TW.ts src/commands/onboard-custom-config.test.ts src/commands/onboard-custom.test.ts src/commands/onboard-non-interactive/local/auth-choice.test.ts
pnpm docs:list
swift test --package-path apps/macos --filter VoiceWakeForwarderTests --filter TalkModeRuntimeSpeechTests --filter GatewayConnectionControlTests -> Swift Testing selected suites: 12 tests passed
isolated live API-key Codex harness: temp HOME/CODEX_HOME, CODEX_API_KEY from exported OPENAI_API_KEY, OPENCLAW_LIVE_CODEX_HARNESS_MODEL=openai/gpt-5.4-nano, node scripts/test-live.mjs --codex-harness -- src/gateway/gateway-codex-harness.live.test.ts -> 1 passed, 1 skipped, 154.60s
git diff --check origin/main...HEAD
node scripts/run-vitest.mjs src/agents/tools/cron-tool.test.ts
node scripts/run-vitest.mjs src/gateway/tool-resolution.exclude.test.ts src/gateway/tool-resolution.test.ts
node scripts/run-vitest.mjs src/agents/tools/sessions-spawn-tool.test.ts
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/agents/tools/cron-tool-canonicalize.ts src/agents/tools/cron-tool.test.ts
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/gateway/tool-resolution.ts src/gateway/tool-resolution.exclude.test.ts src/gateway/tool-resolution.test.ts src/agents/tools/sessions-spawn-tool.test.ts
node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental false
git diff --check
/Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode local -> clean after accepted fixes

Note: local full extension test-type graph crashed inside typescript-go with a Go SIGSEGV before TypeScript diagnostics; core prod/test typechecks covering the touched files passed.

node scripts/run-vitest.mjs src/agents/tools/message-tool.test.ts src/infra/outbound/message-action-normalization.test.ts src/infra/outbound/message-action-runner.send-validation.test.ts src/auto-reply/reply/strip-inbound-meta.test.ts
node scripts/run-vitest.mjs src/infra/outbound/message-action-runner.core-send.test.ts
pnpm exec oxfmt --check --threads=1 src/agents/tools/message-tool.ts src/agents/tools/message-tool.test.ts src/auto-reply/reply/strip-inbound-meta.ts
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/agents/tools/message-tool.ts src/agents/tools/message-tool.test.ts src/auto-reply/reply/strip-inbound-meta.ts src/auto-reply/reply/strip-inbound-meta.test.ts
/Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode local -> clean: no accepted/actionable findings reported

clawsweeper · 2026-06-01T06:03:30Z

Codex review: needs changes before merge. Reviewed June 1, 2026, 9:49 PM ET / 01:49 UTC.

Summary
The PR updates live model inference across provider routing, Responses payload/stream handling, Codex and ACPX runtime paths, session model/thinking state, macOS voice sends, custom onboarding, cron/message tools, docs, and tests.

PR surface: Source +823, Tests +1592, Docs +1, Other +67. Total +2483 across 76 files.

Reproducibility: yes. Source inspection of the latest PR head shows the Codex side-question path still uses model-unaware dynamic-tool loading, the Responses backfill still skips after prior reasoning content, Google routing still overrides explicit Generative AI API choices under google-vertex, and model reset clears the only live-switch flag.

Review metrics: 2 noteworthy metrics.

Public compatibility mode: 1 added (openai-responses). The new --custom-compatibility value changes the public onboarding/config contract and needs compatibility review before merge.
Known broad typecheck gap: 1 extension test-type graph crash reported. The PR body says the full extension test-type graph crashed before TypeScript diagnostics, so broad extension type coverage is not proven.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🦞 diamond lobster
Patch quality: 🧂 unranked krab
Result: blocked by patch quality or review findings.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
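
The "weaker of" rule can be written out as a small sketch. It assumes the rank legend in this review lists ranks from strongest to weakest (so the order below is an inference), and `overallRank` is a hypothetical helper, not ClawSweeper's code.

```typescript
// Ranks ordered from weakest to strongest, inferred from the review's legend.
// "off-meta tidepool" is left out because it means the rating does not apply.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;
type Rank = (typeof RANKS)[number];

// Hypothetical helper: overall readiness is the weaker of the two inputs.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality) ? proof : patchQuality;
}
```

With proof at diamond lobster and patch quality at unranked krab, `overallRank` returns unranked krab, which matches the ratings reported above.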

Rank-up moves:

[P2] Fix the four review findings with focused regression tests.
Refresh or replace the broad extension typecheck proof that crashed inside typescript-go.
Get maintainer owner review for the custom provider compatibility, auth/provider routing, and session-state upgrade behavior.

Risk before merge

[P1] Merging as-is can still fail Codex gpt-5.4-nano side-question threads because /btw dynamic tools keep the generic searchable loading mode.
[P1] Merging as-is can drop a final Responses assistant message or function call when a reasoning block arrives before a completed-response-only output item.
[P1] Merging as-is can route explicit google-generative-ai configs under a google-vertex provider through the Vertex transport, breaking existing non-Vertex Gemini/API Studio style setups.
[P1] Merging as-is can leave an active session running on the old pinned model after sessions.patch resets persisted selection to the default.
[P1] The PR adds a public custom-provider compatibility mode and changes provider/auth/session behavior across many surfaces, so owner review and upgrade proof remain important even after line-level fixes.
[P1] The PR body reports that the full extension test-type graph crashed inside typescript-go with a Go SIGSEGV before diagnostics, leaving a broad extension type-coverage gap.
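
The first risk above comes down to one thread path resolving the tool loading mode without knowing the active model. A minimal sketch of a model-aware resolver follows; the types, the `"searchable"` default, the suffix check, and the name `resolveLoadingForModel` are all assumptions for illustration, since the real resolver is not shown in this thread.

```typescript
// Hypothetical types; the real resolver and config shapes are not shown here.
type ToolsLoading = "direct" | "searchable";
interface PluginConfig {
  dynamicToolsLoading?: ToolsLoading;
}

// Assumption: models without tool_search support are matched by id suffix.
function supportsToolSearch(modelId: string): boolean {
  return !modelId.toLowerCase().endsWith("-nano");
}

// Model-aware resolution: never hand a searchable (deferred) tool set to a
// model that cannot call tool_search.
function resolveLoadingForModel(config: PluginConfig, modelId: string): ToolsLoading {
  const configured = config.dynamicToolsLoading ?? "searchable";
  return supportsToolSearch(modelId) ? configured : "direct";
}
```

Under this sketch, every thread path (main run and side question alike) would call the same resolver with the active model id, so the two cannot drift apart.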

Maintainer options:

Fix the remaining runtime blockers (recommended)
Apply the model-aware Codex loading, Responses backfill, Google transport, and live-switch reset repairs with focused regressions before merge.
Require owner upgrade review
After the line-level fixes, have maintainers explicitly review the custom provider compatibility mode, Google routing behavior, OpenAI/Codex auth selection, and session reset upgrade semantics.
Split the safest fixes
If the broad PR cannot converge quickly, pause this branch and split the already-proven narrow bug fixes into smaller owner-scoped PRs.

Recommended automerge instruction:

@clawsweeper automerge

Special instructions:
Fix the current review findings in `extensions/codex/src/app-server/side-question.ts`, `src/agents/openai-transport-stream.ts`, `extensions/google/provider-registration.ts`, and `src/gateway/sessions-patch.ts`; add or update focused regression tests for each path; do not broaden the PR beyond these repairs.

Next step before merge

The remaining blockers are concrete file-level repairs an automated worker can attempt, but the protected label and broad provider/session surface still require maintainer review afterward.

Security
Cleared: The diff touches provider fetch/config behavior but does not add dependency, workflow, secret, package-resolution, or supply-chain changes with a concrete security regression.

Review findings

[P2] Apply nano tool loading to side questions — extensions/codex/src/app-server/run-attempt.ts:598
[P1] Backfill completed items after reasoning blocks — src/agents/openai-transport-stream.ts:1492-1493
[P1] Honor explicit Generative AI configs before Vertex fallback — extensions/google/provider-registration.ts:73-75
[P1] Keep model resets pending until active runs reconcile — src/gateway/sessions-patch.ts:534

Review details

Best possible solution:

Fix the four concrete runtime blockers, keep the new custom OpenAI Responses mode documented as a public compatibility addition, and require maintainer review/upgrade proof for the provider, auth, and session-state behavior before merge.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection of the latest PR head shows the Codex side-question path still uses model-unaware dynamic-tool loading, the Responses backfill still skips after prior reasoning content, Google routing still overrides explicit Generative AI API choices under google-vertex, and model reset clears the only live-switch flag.

Is this the best way to solve the issue?

No. The PR contains useful fixes and strong selected live proof, but it is not the best complete fix until the one-sided Codex path, completed-output backfill guard, Google transport precedence, and model-reset live-switch behavior are corrected.

Full review comments:

[P2] Apply nano tool loading to side questions — extensions/codex/src/app-server/run-attempt.ts:598
The main Codex run now switches gpt-5.4-nano to direct dynamic tools, but /btw side-question threads still build their bridge with resolveCodexDynamicToolsLoading(input.pluginConfig) in side-question.ts. That path can still expose deferred tool_search behavior for the same nano model this PR marks as unable to use tool search, so nano sessions can pass normal turns and fail when the user asks a side question.
Confidence: 0.88
[P1] Backfill completed items after reasoning blocks — src/agents/openai-transport-stream.ts:1492-1493
This guard skips completed-output backfill as soon as any content block exists. If a stream emits a reasoning item before response.completed.response.output supplies the actual assistant message or function call, output.content.length is already nonzero and the final deliverable item is never appended.
Confidence: 0.9
[P1] Honor explicit Generative AI configs before Vertex fallback — extensions/google/provider-registration.ts:73-75
This branch routes every google-vertex provider model with api: "google-generative-ai" through the Vertex transport solely because of the provider id. Configs that explicitly preserved the Generative AI API with an AI Studio or proxy base URL will now send Vertex-shaped requests to a non-Vertex endpoint.
Confidence: 0.86
[P1] Keep model resets pending until active runs reconcile — src/gateway/sessions-patch.ts:534
Deleting liveModelSwitchPending on model: null means an active run that is still on the old pinned model will not be reconciled, because shouldSwitchToLiveModel returns early when the flag is absent. Current docs say user-driven sessions.patch model changes mark a pending live switch, so clearing the flag here can leave persisted default selection and active runtime selection diverged.
Confidence: 0.84
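
The backfill finding above can be illustrated with a guard that looks for deliverable content instead of any content. This is a sketch under stated assumptions: the block shapes and both function names are hypothetical, and the real transport types may differ.

```typescript
// Hypothetical block shapes; the real transport types are not shown in this thread.
type ContentBlock =
  | { type: "thinking"; text: string }
  | { type: "text"; text: string }
  | { type: "toolCall"; name: string; args: string };

// A reasoning block alone is not deliverable output.
function hasDeliverableContent(content: ContentBlock[]): boolean {
  return content.some((block) => block.type === "text" || block.type === "toolCall");
}

// Backfill from the completed response only when nothing deliverable was streamed.
// Reasoning items from the completed response are skipped here (an assumption).
function backfillCompletedItems(
  streamed: ContentBlock[],
  completed: ContentBlock[],
): ContentBlock[] {
  if (hasDeliverableContent(streamed)) return streamed;
  const deliverable = completed.filter((block) => block.type !== "thinking");
  return [...streamed, ...deliverable];
}
```

A guard of the form `streamed.length > 0` would return early in the reasoning-first case; checking block types keeps the final message or function call.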

Overall correctness: patch is incorrect
Overall confidence: 0.87

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against b06dc1753765.

Label changes

Label justifications:

P2: The PR addresses normal-priority live model inference bugs across several surfaces, but it is not an emergency outage or security fix.
merge-risk: 🚨 compatibility: The diff adds a public custom-provider compatibility mode and changes provider routing behavior that can affect existing configs and upgrades.
merge-risk: 🚨 auth-provider: The diff changes OpenAI/Codex OAuth model selection plus Google/Azure provider transport selection.
merge-risk: 🚨 session-state: The diff changes persisted thinking overrides and live model switch session-state handling.
rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🦞 diamond lobster and patch quality is 🧂 unranked krab.
status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action.
proof: sufficient: Contributor real behavior proof is sufficient. The PR includes after-fix focused command output plus live Codex harness proof and live credentialed Azure Foundry canaries, though the remaining findings still need targeted regression proof after repair.

Evidence reviewed

PR surface:

Source +823, Tests +1592, Docs +1, Other +67. Total +2483 across 76 files.

Area	Files	Added	Removed	Net
Source	39	999	176	+823
Tests	30	1624	32	+1592
Docs	2	2	1	+1
Config	0	0	0	0
Generated	0	0	0	0
Other	5	74	7	+67
Total	76	2699	216	+2483

Acceptance criteria:

[P1] node scripts/run-vitest.mjs extensions/codex/src/app-server/side-question.test.ts extensions/codex/src/app-server/dynamic-tool-build.test.ts extensions/codex/src/app-server/thread-lifecycle.test.ts.
[P1] node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts -t "Azure Responses completed".
[P1] node scripts/run-vitest.mjs extensions/google/provider-registration.test.ts extensions/google/api.test.ts src/agents/embedded-agent-runner/model.test.ts.
[P1] node scripts/run-vitest.mjs src/gateway/sessions-patch.test.ts src/agents/live-model-switch.test.ts.
[P1] node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental false.

What I checked:

Repository policy applied: Root AGENTS.md and relevant scoped guides for extensions, ACPX, agents, agent tools, gateway, gateway server methods, outbound helpers, and docs were read; the provider/session/config/Codex compatibility review rules apply. (AGENTS.md:21, b06dc1753765)
Protected label: The supplied live PR context includes the protected maintainer label, so conservative cleanup must keep the PR open for maintainer handling. (8c8e400f9c21)
Codex nano fix is one-sided: Latest PR head uses the new model-aware dynamic-tool loading resolver in the main Codex run path, but the side-question path still calls the generic resolver without the active model id. (extensions/codex/src/app-server/side-question.ts:602, 8c8e400f9c21)
Codex dependency contract checked: Codex upstream exposes deferred dynamic tools through tool_search only when the model supports search tools, so model-aware disabling must cover every OpenClaw Codex thread path. (../codex/codex-rs/core/src/tools/spec_plan.rs:275, c955f730781d)
Responses backfill guard still skips mixed streams: Latest PR head returns from completed-output backfill whenever any prior content exists, so a reasoning item added before response.completed.response.output still suppresses the final assistant message or function call. (src/agents/openai-transport-stream.ts:1492, 8c8e400f9c21)
Google transport override remains too broad: Latest PR head routes api: "google-generative-ai" models through Vertex when model.provider === "google-vertex", even if the explicit API choice and base URL require the Generative AI transport. (extensions/google/provider-registration.ts:74, 8c8e400f9c21)
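
The Google transport item above is a precedence question: explicit API choices should be consulted before the provider id. The sketch below assumes a hypothetical `apiExplicit` flag that distinguishes a user-written `api` value from a defaulted one, and uses `"openai-compatible"` as a placeholder api value; neither is confirmed by this thread.

```typescript
// Hypothetical model shape. `apiExplicit` marks an api value the user wrote in
// config rather than one filled in by defaults (an assumption of this sketch).
interface GoogleModelConfig {
  provider: string;
  api?: "google-generative-ai" | "openai-compatible";
  apiExplicit?: boolean;
  baseUrl?: string;
}
type Transport = "vertex" | "generative-ai" | "openai-compatible";

function resolveGoogleTransport(model: GoogleModelConfig): Transport {
  // Explicit API choices win over provider-id inference.
  if (model.apiExplicit && model.api === "google-generative-ai") return "generative-ai";
  if (model.apiExplicit && model.api === "openai-compatible") return "openai-compatible";
  // Only fall back to Vertex when nothing explicit was configured.
  if (model.provider === "google-vertex") return "vertex";
  return "generative-ai";
}
```

With this ordering, a baseUrl-only google-vertex provider still reaches the Vertex transport, while a config that explicitly keeps the Generative AI API does not.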

Likely related people:

steipete: Recent current-main commits touch OpenAI Responses, Google Vertex/provider routing, session goals/state, and this PR branch also spans the same provider/session surfaces. (role: recent area contributor; confidence: high; commits: b23ace1d04ca, fba9eac7ebb7, 00d8d7ead059; files: src/agents/openai-transport-stream.ts, extensions/google/provider-registration.ts, src/gateway/sessions-patch.ts)
joshavant: Recent current-main Codex app-server commits touched native surfaces, sandbox execution, and missing turn completion around the same side-thread/runtime area. (role: adjacent Codex app-server owner; confidence: medium; commits: e0405ecc9bd6, ba06376c7955, 7cda26aa6c72; files: extensions/codex/src/app-server/side-question.ts, extensions/codex/src/app-server/thread-lifecycle.ts)
udaymanish6: GitHub path history shows recent work on Codex side-question timeout behavior, which is adjacent to the remaining side-question dynamic-tool-loading gap. (role: recent Codex side-thread contributor; confidence: medium; commits: 0f18d52f16e3; files: extensions/codex/src/app-server/side-question.ts)
latensified: Recent OpenAI Responses stream work changed replay/id handling in the same transport file where completed-output backfill is being added. (role: recent OpenAI Responses contributor; confidence: medium; commits: 6653193fdb90; files: src/agents/openai-transport-stream.ts)
1052326311: Recent current-main commits touched Google provider default API routing and gateway session patch auth-profile handling near two compatibility-sensitive surfaces in this PR. (role: recent provider/session contributor; confidence: medium; commits: b73e135f9730, 152f68d037af; files: extensions/google/provider-registration.ts, src/gateway/sessions-patch.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c653fb3f4e

chatgpt-codex-connector · 2026-06-01T06:22:52Z

        }),
-        markLiveSwitchPending: true,
      });
+      delete next.liveModelSwitchPending;


Keep model resets pending for live switches

When sessions.patch resets model to null while an agent is already running on a user-pinned non-default model, this clears the only flag that shouldSwitchToLiveModel checks before reconciling the active run. The persisted selection changes back to the configured default, but the active runner will not restart/switch because liveModelSwitchPending is deleted here; this contradicts the documented live-switch contract that user-driven sessions.patch model changes mark a pending live switch (docs/concepts/model-failover.md:342).
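
A small sketch of the behavior this comment asks for: a reset that removes the pin but keeps the pending flag so an active run can reconcile. The session shape, `modelOverride`, `applyModelReset`, and `shouldSwitchSketch` are hypothetical; only `liveModelSwitchPending` and the early-return behavior come from the comment.

```typescript
// Hypothetical session shape; only liveModelSwitchPending is named in the thread.
interface SessionEntry {
  modelOverride?: string;
  liveModelSwitchPending?: boolean;
}

// Sketch of a reset that keeps the pending flag when a pin was actually removed.
function applyModelReset(entry: SessionEntry): SessionEntry {
  const next: SessionEntry = { ...entry };
  const hadPin = next.modelOverride !== undefined;
  delete next.modelOverride;
  if (hadPin) next.liveModelSwitchPending = true;
  return next;
}

// Simplified stand-in for the early return described above: no flag, no switch.
function shouldSwitchSketch(
  entry: SessionEntry,
  activeModel: string,
  defaultModel: string,
): boolean {
  if (!entry.liveModelSwitchPending) return false;
  return (entry.modelOverride ?? defaultModel) !== activeModel;
}
```

If the reset deleted the flag instead, `shouldSwitchSketch` would return false while the active run is still on the old pinned model, which is the divergence described above.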

pigfoot · 2026-06-02T01:17:21Z

I ran the missing live Azure credentialed canary on the latest head of this PR.

Setup:

PR: Fix live model inference edge cases #88946
Head tested: 810ef28557edcd7196e31b60a680ce03f817d71d
Runtime: isolated checkout under /tmp, isolated OPENCLAW_HOME / OPENCLAW_STATE_DIR / OPENCLAW_CONFIG_PATH
Production config was not modified.
Target: Azure Foundry/resource /openai/v1 Responses endpoint, redacted host
OpenClaw provider config shape:
- provider/model: azure/gpt-5.5
- api: azure-openai-responses
- base URL shape: ${AZURE_FOUNDRY_BASE_URL}/openai/v1
- credential source: ${AZURE_FOUNDRY_API_KEY}

No-tools canary:

prompt: Return exactly: AZURE_RESPONSES_CANARY_OK
final text: AZURE_RESPONSES_CANARY_OK
assistantTexts: ["AZURE_RESPONSES_CANARY_OK"]
winner: azure/gpt-5.5
fallbackUsed: false
stopReason: stop
durationMs: 53545

Tool-continuation canary:

prompt: First call the get_goal tool exactly once. After the tool result, return exactly: AZURE_RESPONSES_TOOL_CANARY_OK
final text: AZURE_RESPONSES_TOOL_CANARY_OK
assistantTexts: ["AZURE_RESPONSES_TOOL_CANARY_OK"]
toolSummary: { calls: 1, tools: ["get_goal"], failures: 0 }
winner: azure/gpt-5.5
fallbackUsed: false
stopReason: stop
durationMs: 22122

I also scanned the canary artifacts and session trajectory logs for the prior failure signatures:

negative-check: clean

The scan found no non_deliverable_terminal_turn, no FailoverError, no candidate_failed, no /v1 api-version rejection, and no Foundry Invalid value: '' typed-message rejection in these runs.

This is live credentialed Azure proof from the PR head that both a direct assistant-text turn and a tool-call continuation produce deliverable assistant text through azure-openai-responses without fallback.
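
For readers who want to repeat the negative check, a scan of this kind can be sketched as below. The signature strings are the ones named above; `findFailureSignatures` is a hypothetical helper, not the script used for these runs.

```typescript
// Failure signatures named in this comment; the scanner itself is hypothetical.
const FAILURE_SIGNATURES = [
  "non_deliverable_terminal_turn",
  "FailoverError",
  "candidate_failed",
] as const;

// Returns every known signature that appears anywhere in the log text.
function findFailureSignatures(logText: string): string[] {
  return FAILURE_SIGNATURES.filter((signature) => logText.includes(signature));
}
```

An empty result corresponds to the "negative-check: clean" line reported above.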

steipete · 2026-06-02T03:03:10Z

Land-ready proof for head e02e62c6516a243b51bb6aeace78e906501ff546.

Local/source proof:

node scripts/run-vitest.mjs src/gateway/tool-resolution.exclude.test.ts src/gateway/tool-resolution.test.ts
node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental false
node scripts/run-vitest.mjs src/agents/tools/message-tool.test.ts src/infra/outbound/message-action-normalization.test.ts src/infra/outbound/message-action-runner.send-validation.test.ts src/auto-reply/reply/strip-inbound-meta.test.ts
node scripts/run-vitest.mjs src/infra/outbound/message-action-runner.core-send.test.ts
pnpm exec oxfmt --check --threads=1 src/gateway/tool-resolution.exclude.test.ts src/agents/tools/message-tool.ts src/agents/tools/message-tool.test.ts src/auto-reply/reply/strip-inbound-meta.ts
node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/agents/tools/message-tool.ts src/agents/tools/message-tool.test.ts src/auto-reply/reply/strip-inbound-meta.ts src/auto-reply/reply/strip-inbound-meta.test.ts
node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental false
git diff --check
Autoreview rerun after the message-tool sanitizer fix: no accepted/actionable findings.

Live/provider proof:

OpenAI/Anthropic/key-backed issue checks were verified during the sweep where applicable.
The Anthropic OpenClaw Live Tests key (via 1Password) confirmed that /v1/models includes claude-opus-4-8; the stale #87746 ("Add Claude Opus 4.8 (claude-opus-4-8) to the model catalog") was closed as fixed on main.

CI:

GitHub status rollup for this exact head reports no failed checks and no pending checks.

Known gap:

#89100 ("Sanitise outbound message.send tool arguments to prevent runtime scaffolding leak (FM-3) and chat_id routing bleed (FM-2) on weaker models") is only partially addressed here: FM-3 outbound metadata leakage is fixed, but the FM-2 group target-routing repro remains open because the candidate runtime guard had semantic holes and was not clear or narrow enough to land.

openclaw-barnacle Bot added the agents (Agent runtime and tooling), extensions: acpx, extensions: codex, extensions: google, size: L, and maintainer (Maintainer-authored PR) labels Jun 1, 2026

steipete mentioned this pull request Jun 1, 2026

[Bug]: using codex harness with gpt-5.4-nano causes errors #85806

Closed

openclaw-barnacle Bot added the gateway (Gateway runtime) and app: macos (App: macos) labels Jun 1, 2026

steipete mentioned this pull request Jun 1, 2026

[Bug]: push to talk mac os companion app hard codes thinking low #87768

Closed

openclaw-barnacle Bot added the commands Command implementations label Jun 1, 2026

chatgpt-codex-connector Bot reviewed Jun 1, 2026

openclaw-barnacle Bot added the docs Improvements or additions to documentation label Jun 1, 2026

fix(cron): recover concatenated tool keys

810ef28

steipete mentioned this pull request Jun 2, 2026

[Bug]: cron tool: local llamacpp model parameter serialization corrupts JSON property names (key concatenation) #88439

Closed

clawsweeper Bot added the rating: 🦐 gold shrimp label (Decent PR readiness signal, but merge confidence is limited.) and removed the rating: 🦪 silver shellfish label (Thin PR readiness signal; proof, validation, or implementation needs work.) Jun 2, 2026

clawsweeper Bot mentioned this pull request Jun 2, 2026

fix(stream): handle cumulative JSON chunks from local llama.cpp tool calls #89070

Closed

fix(gateway): keep loopback excludes out of inherited denies

9e1e9aa

clawsweeper Bot mentioned this pull request Jun 2, 2026

fix(cli): prevent empty_response failover for completed thinking-only turns #89027

Closed

2 tasks

clawsweeper Bot added the rating: 🧂 unranked krab label (Not merge-ready due to missing proof or serious correctness/safety concerns.) and removed the rating: 🦐 gold shrimp label (Decent PR readiness signal, but merge confidence is limited.) Jun 2, 2026

pigfoot mentioned this pull request Jun 2, 2026

fix: support Azure Responses text stream events #89001

Closed

fix(message): strip inbound metadata from outbound sends

8c8e400

steipete mentioned this pull request Jun 2, 2026

Sanitise outbound message.send tool arguments to prevent runtime scaffolding leak (FM-3) and chat_id routing bleed (FM-2) on weaker models #89100

Open

5 tasks

test(gateway): type loopback exclude mock

e02e62c

steipete merged commit 9ead0ae into main Jun 2, 2026
167 checks passed

steipete deleted the inference branch June 2, 2026 03:03

Alix-007 mentioned this pull request Jun 2, 2026

fix(llm): preserve Windows path escapes in streamed args #88926

Closed

clawsweeper Bot mentioned this pull request Jun 2, 2026

fix(agents): cap DSML recovery buffer to prevent unbounded memory growth #86637

Open

SebTardif mentioned this pull request Jun 2, 2026

fix(status): exclude session-selected model from fallback display list #88049

Closed

This was referenced Jun 4, 2026

fix(llm): repairJson injects control chars for backslash b/f/n/r/t into Windows paths #88940

Closed

Resolve explicit model aliases before validation #84333

Closed

steipete mentioned this pull request Jun 7, 2026

fix(simple-completion): sanitize Google thinking payload for unknown Gemini aliases (fixes #84688) #84781

Closed

clawsweeper Bot mentioned this pull request Jun 13, 2026

fix(cron): recover from local-llamacpp parameter serialization bugs #88460

Closed

steipete merged 30 commits into main from inference

2 participants
