fix: strip orphaned OpenAI reasoning blocks before responses API call#55787
Conversation
|
High-severity regression. Blocks all gpt-5.2-chat and gpt-5.4-chat users after the first tool call. The fix is 2 lines: import + call @steipete since you just landed |
Greptile SummaryThis PR wires the already-tested Key finding:
Confidence Score: 4/5Safe to merge after addressing the broken early-exit guard — the correctness fix for 400 errors is sound but introduces a performance regression on every stream call. The core fix (chaining src/agents/pi-embedded-runner/run/attempt.ts — the early-exit guard and call ordering need attention; src/agents/pi-embedded-helpers/openai.ts may also need a
|
| Filename | Overview |
|---|---|
| src/agents/pi-embedded-runner/run/attempt.ts | Chains downgradeOpenAIReasoningBlocks after downgradeOpenAIFunctionCallReasoningPairs in the openai-responses stream wrapper. The fix is correct in intent, but the early-exit guard (sanitized === messages) is now permanently dead code because downgradeOpenAIReasoningBlocks always returns a new array — causing unnecessary context reconstruction on every API call. |
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 1003-1005
Comment:
**Early-exit guard is now always false**
`downgradeOpenAIReasoningBlocks` always returns a freshly allocated `out` array (see `openai.ts:210, 269`), so `sanitized === messages` is **never** `true` after this change. The no-op fast path is now permanently dead code, meaning a new context object is reconstructed on every stream call even when neither sanitizer touched the history.
Contrast with `downgradeOpenAIFunctionCallReasoningPairs`, which explicitly returns the original `messages` reference when nothing changed (`return changed ? rewrittenMessages : messages`). `downgradeOpenAIReasoningBlocks` has no equivalent guard.
The simplest fix is to update the guard to properly track whether either sanitizer did real work:
```
const pairSanitized = downgradeOpenAIFunctionCallReasoningPairs(
messages as AgentMessage[],
);
const sanitized = downgradeOpenAIReasoningBlocks(pairSanitized);
if (pairSanitized === messages && sanitized.length === pairSanitized.length && sanitized.every((m, i) => m === pairSanitized[i])) {
return inner(model, context, options);
}
```
Alternatively, update `downgradeOpenAIReasoningBlocks` in `openai.ts` to mirror `downgradeOpenAIFunctionCallReasoningPairs` and return the original input reference when nothing changed.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 997-1002
Comment:
**Call order is reversed compared to `google.ts`**
In `google.ts` (line 639-643), the two sanitizers are applied in the opposite order:
```typescript
downgradeOpenAIFunctionCallReasoningPairs(
downgradeOpenAIReasoningBlocks(sanitizedCompactionUsage),
)
```
Here `downgradeOpenAIReasoningBlocks` runs first. In `attempt.ts` the order is flipped. Both orderings are functionally equivalent today (the two functions act on independent data — block content vs. tool-call IDs — and neither creates new orphans for the other to clean up), but the inconsistency could be confusing to future readers and makes it harder to extract a shared helper. Worth aligning with `google.ts` for consistency.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix: strip orphaned OpenAI reasoning blo..." | Re-trigger Greptile
| if (sanitized === messages) { | ||
| return inner(model, context, options); | ||
| } |
There was a problem hiding this comment.
Early-exit guard is now always false
downgradeOpenAIReasoningBlocks always returns a freshly allocated out array (see openai.ts:210, 269), so sanitized === messages is never true after this change. The no-op fast path is now permanently dead code, meaning a new context object is reconstructed on every stream call even when neither sanitizer touched the history.
Contrast with downgradeOpenAIFunctionCallReasoningPairs, which explicitly returns the original messages reference when nothing changed (return changed ? rewrittenMessages : messages). downgradeOpenAIReasoningBlocks has no equivalent guard.
The simplest fix is to update the guard to properly track whether either sanitizer did real work:
const pairSanitized = downgradeOpenAIFunctionCallReasoningPairs(
messages as AgentMessage[],
);
const sanitized = downgradeOpenAIReasoningBlocks(pairSanitized);
if (pairSanitized === messages && sanitized.length === pairSanitized.length && sanitized.every((m, i) => m === pairSanitized[i])) {
return inner(model, context, options);
}
Alternatively, update downgradeOpenAIReasoningBlocks in openai.ts to mirror downgradeOpenAIFunctionCallReasoningPairs and return the original input reference when nothing changed.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 1003-1005
Comment:
**Early-exit guard is now always false**
`downgradeOpenAIReasoningBlocks` always returns a freshly allocated `out` array (see `openai.ts:210, 269`), so `sanitized === messages` is **never** `true` after this change. The no-op fast path is now permanently dead code, meaning a new context object is reconstructed on every stream call even when neither sanitizer touched the history.
Contrast with `downgradeOpenAIFunctionCallReasoningPairs`, which explicitly returns the original `messages` reference when nothing changed (`return changed ? rewrittenMessages : messages`). `downgradeOpenAIReasoningBlocks` has no equivalent guard.
The simplest fix is to update the guard to properly track whether either sanitizer did real work:
```
const pairSanitized = downgradeOpenAIFunctionCallReasoningPairs(
messages as AgentMessage[],
);
const sanitized = downgradeOpenAIReasoningBlocks(pairSanitized);
if (pairSanitized === messages && sanitized.length === pairSanitized.length && sanitized.every((m, i) => m === pairSanitized[i])) {
return inner(model, context, options);
}
```
Alternatively, update `downgradeOpenAIReasoningBlocks` in `openai.ts` to mirror `downgradeOpenAIFunctionCallReasoningPairs` and return the original input reference when nothing changed.
How can I resolve this? If you propose a fix, please make it concise.| const pairSanitized = downgradeOpenAIFunctionCallReasoningPairs( | ||
| messages as AgentMessage[], | ||
| ); | ||
| // Also strip orphaned reasoning blocks that lack a required following | ||
| // content item — OpenAI rejects these with a 400 error. | ||
| const sanitized = downgradeOpenAIReasoningBlocks(pairSanitized); |
There was a problem hiding this comment.
Call order is reversed compared to
google.ts
In google.ts (line 639-643), the two sanitizers are applied in the opposite order:
downgradeOpenAIFunctionCallReasoningPairs(
downgradeOpenAIReasoningBlocks(sanitizedCompactionUsage),
)Here downgradeOpenAIReasoningBlocks runs first. In attempt.ts the order is flipped. Both orderings are functionally equivalent today (the two functions act on independent data — block content vs. tool-call IDs — and neither creates new orphans for the other to clean up), but the inconsistency could be confusing to future readers and makes it harder to extract a shared helper. Worth aligning with google.ts for consistency.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 997-1002
Comment:
**Call order is reversed compared to `google.ts`**
In `google.ts` (line 639-643), the two sanitizers are applied in the opposite order:
```typescript
downgradeOpenAIFunctionCallReasoningPairs(
downgradeOpenAIReasoningBlocks(sanitizedCompactionUsage),
)
```
Here `downgradeOpenAIReasoningBlocks` runs first. In `attempt.ts` the order is flipped. Both orderings are functionally equivalent today (the two functions act on independent data — block content vs. tool-call IDs — and neither creates new orphans for the other to clean up), but the inconsistency could be confusing to future readers and makes it harder to extract a shared helper. Worth aligning with `google.ts` for consistency.
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
|
Addressed both Greptile findings in a1ac473:
All 5 reasoning block tests pass. |
fd3c2cf to
7a652f4
Compare
7a652f4 to
d2fd8d7
Compare
|
Rebased on latest main to refresh CI. All review feedback from Greptile was addressed in commit a1ac473 (early-exit guard, call order alignment). Ready for maintainer review. |
|
@jalehman related to your recent #66167 (recovering reasoning-only OpenAI turns). This fixes the other half: orphaned reasoning blocks that survive compaction/pruning cause OpenAI responses API to reject with 400. Fix: Call the existing All 32 CI checks green. Greptile feedback addressed. |
d2fd8d7 to
17dbce8
Compare
2fcd636 to
ec3d05c
Compare
Wire downgradeOpenAIReasoningBlocks into the openai-responses stream wrapper alongside the existing function-call pairing sanitizer. This prevents 400 errors when conversation history contains reasoning items without their required following content block (e.g. after compaction, error recovery, or session restore). The function already existed and was tested but was only called in the Google provider path, not the OpenAI responses path. Fixes openclaw#54534.
…unchanged Address Greptile review feedback: 1. Add anyChanged guard so the function returns the original input array when no messages were modified, matching the pattern used by downgradeOpenAIFunctionCallReasoningPairs. This preserves the early-exit fast path in the stream wrapper. 2. Align call order with google.ts: reasoning blocks first, then function-call pairs.
ec3d05c to
263b952
Compare
|
Merged via squash.
Thanks @suboss87! |
* test: stabilize standalone Parallels smoke lanes * fix: strip orphaned OpenAI reasoning blocks before responses API call (openclaw#55787) Merged via squash. Prepared head SHA: 263b952 Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com> Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com> Reviewed-by: @jalehman * fix: stabilize release smoke reruns * test: tolerate empty fireworks live responses * test: make OpenWebUI smoke deterministic * tasks: extract detached task lifecycle runtime (openclaw#68886) * tasks: extract detached task lifecycle runtime * tests: relax gateway seam expectation --------- Co-authored-by: Mariano Belinky <mariano@mb-server-643.local> * test: complete workspace setup in update smokes * fix: parse PowerShell cron tools allow-list (openclaw#68858) (thanks @chen-zhang-cs-code) * fix(cron): parse PowerShell tools allow list * fix(cron): clarify tools allow-list help * fix: parse PowerShell cron tools allow-list (openclaw#68858) (thanks @chen-zhang-cs-code) --------- Co-authored-by: Ayaan Zaidi <hi@obviy.us> * fix(browser): discover CDP websocket from bare ws:// URL before attach (openclaw#68715) * fix(browser): discover CDP websocket from bare ws:// URL before attach When browser.cdpUrl is set to a bare ws://host:port (no /devtools/ path), ensureBrowserAvailable would call isChromeReachable -> canOpenWebSocket against the URL verbatim. Chrome only accepts WebSocket upgrades at the specific path returned by /json/version, so the handshake failed immediately with HTTP 400. With attachOnly: true, that surfaced as: Browser attachOnly is enabled and profile "openclaw" is not running. even though the CDP endpoint was reachable and the profile was healthy. Reproduced by the new tests in chrome.test.ts and cdp.test.ts (openclaw#68027). Fix: introduce isDirectCdpWebSocketEndpoint(url) — true only when a ws/wss URL has a /devtools/<kind>/<id> handshake path. Route any other ws/wss cdpUrl (including the bare ws://host:port shape) through HTTP /json/version discovery by normalising the scheme via the existing normalizeCdpHttpBaseForJsonEndpoints helper. Apply this in isChromeReachable, getChromeWebSocketUrl, and createTargetViaCdp. Direct WS endpoints with a /devtools/ path are still opened without an extra discovery round-trip. Fixes openclaw#68027 * test(browser): add seeded fuzz coverage for CDP URL helpers Adds property-based / seeded-fuzz tests for the URL helpers the attachOnly CDP fix depends on (openclaw#68027): - isWebSocketUrl - isDirectCdpWebSocketEndpoint - normalizeCdpHttpBaseForJsonEndpoints - parseBrowserHttpUrl - redactCdpUrl - appendCdpPath - getHeadersWithAuth Follows the existing repo convention (see src/gateway/http-common.fuzz.test.ts): no fast-check dep, small mulberry32 PRNG + hand-rolled generators, deterministic per-describe seeds so failures are reproducible. Lifts cdp.helpers.ts coverage from 77.77% -> 89.54% statements, 67.9% -> 80.24% branches, 78% -> 90% lines. Remaining uncovered lines are inside the WS sender internals (createCdpSender, withCdpSocket, fetchCdpChecked rate-limit branch), which require integration-style mocks and are unrelated to the attachOnly fix. * test(browser): drive cdp.helpers/cdp/chrome to 100% coverage Lifts the three files touched by the openclaw#68027 attachOnly fix to 100% statements/branches/functions/lines across the extensions test suite. Adds cdp.helpers.internal.test.ts, cdp.internal.test.ts, and chrome.internal.test.ts covering error paths, branch matrices, CDP session helpers, Chrome spawn/launch/stop flows, and canRunCdpHealthCommand. Defensively unreachable guards are annotated with c8 ignore + inline justifications. * fix(browser): restore WS fallback for non-/devtools ws:// CDP URLs When /json/version discovery is unavailable (or returns no webSocketDebuggerUrl), fall back to treating the original bare ws/wss URL as a direct WebSocket endpoint. This preserves the openclaw#68027 fix for Chrome's debug port while restoring compatibility with Browserless/ Browserbase-style providers that expose a direct WebSocket root without a /json/version endpoint. Priority order for bare ws/wss cdpUrl inputs: 1. /devtools/<kind>/<id> URL \u2192 direct handshake, no discovery (unchanged) 2. bare ws/wss root \u2192 try HTTP discovery first; if discovery returns a webSocketDebuggerUrl use it; otherwise fall back to the original URL as a direct WS endpoint 3. HTTP/HTTPS URL \u2192 HTTP discovery only, no fallback (unchanged) Affected call sites: isChromeReachable, getChromeWebSocketUrl, createTargetViaCdp. Also renames a misleading test ('still enforces SSRF policy for direct WebSocket URLs') to accurately describe what it tests: SSRF enforcement on the navigation target URL, not on the CDP endpoint. New tests added for all three fallback paths. Coverage remains 100% on all three touched files (238 tests). * fix: browser attachOnly bare ws CDP follow-ups (openclaw#68715) (thanks @visionik) * test(tasks): align detached runtime mock return types * browser: route existing-session user profile through browser nodes (openclaw#68891) * browser: route user profile through browser nodes * browser: align existing-session node docs * browser: preserve host fallback on node discovery errors * browser: preserve configured node pin errors * browser: widen config mock in node pin test * fix: default kimi thinking to off (openclaw#68907) Co-authored-by: termtek <termtek@ubuntu.tail2b72cd.ts.net> * docs(changelog): note kimi thinking default fix * docs(changelog): add thanks for kimi fix * tasks: add detached runtime plugin registration contract (openclaw#68915) * tasks: register detached runtime plugins * tasks: harden detached runtime ownership * tasks: extract detached runtime contract types * changelog: note detached runtime contract * changelog: attribute detached runtime contract * feat(ui): add a2a operator dashboard shell * fix(ui): align a2a baseline with current read contract * feat(ui): add A2A case inspector panels * CI: route small workflows to ubuntu-latest * CI: route fork-blocked workflows to GitHub-hosted runners * CI: fall back to GITHUB_TOKEN for fork automation * CI: skip app-token steps when fork secrets are absent * CI: gate fork app-token steps through env guards * CI: fix stale fallback env guard * fix(ui): sync A2A locale snapshots * fix(ui): sync A2A locale metadata --------- Co-authored-by: Peter Steinberger <steipete@gmail.com> Co-authored-by: Subash Natarajan <suboss87@gmail.com> Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com> Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com> Co-authored-by: Mariano <132747814+mbelinky@users.noreply.github.com> Co-authored-by: Mariano Belinky <mariano@mb-server-643.local> Co-authored-by: ZC <chenzhangcode@163.com> Co-authored-by: Ayaan Zaidi <hi@obviy.us> Co-authored-by: Viz <visionik@pobox.com> Co-authored-by: Frank Yang <frank.ekn@gmail.com> Co-authored-by: termtek <termtek@ubuntu.tail2b72cd.ts.net> Co-authored-by: bangtong-ai <bangtong-ai@users.noreply.github.com> Co-authored-by: OpenClaw Agent <openclaw-agent@users.noreply.github.com> Co-authored-by: OpenClaw Bot <bot@openclaw.local>
…openclaw#55787) Merged via squash. Prepared head SHA: 263b952 Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com> Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com> Reviewed-by: @jalehman
…openclaw#55787) Merged via squash. Prepared head SHA: 263b952 Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com> Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com> Reviewed-by: @jalehman
…openclaw#55787) Merged via squash. Prepared head SHA: 263b952 Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com> Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com> Reviewed-by: @jalehman
…openclaw#55787) Merged via squash. Prepared head SHA: 263b952 Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com> Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com> Reviewed-by: @jalehman
Summary
downgradeOpenAIReasoningBlocks()into theopenai-responses/openai-codex-responsesstream wrapper inattempt.tsRoot cause
OpenAI's Responses API requires every
reasoningitem to be immediately followed by a non-thinking content item (text or tool_call). After compaction, error recovery, or session restore, the conversation history can end up with orphaned reasoning blocks that lack their required following content. The API rejects these with:downgradeOpenAIReasoningBlocks()already existed and was well-tested (5 unit tests) — it strips orphaned reasoning blocks from message history. However, it was only wired into the Google provider path (google.ts:641), not the OpenAI responses path. This fix adds it to the OpenAI path alongside the existingdowngradeOpenAIFunctionCallReasoningPairs()call.Changes
src/agents/pi-embedded-runner/run/attempt.ts— 2 changes:downgradeOpenAIReasoningBlocks(line 62)downgradeOpenAIFunctionCallReasoningPairsin the responses API wrapper (line 1000)Test plan
downgradeOpenAIReasoningBlockstests passpnpm checkpasses (lint, format, types, boundaries)Fixes #54534.
🤖 Generated with Claude Code