Skip to content

fix: strip orphaned OpenAI reasoning blocks before responses API call#55787

Merged
jalehman merged 4 commits intoopenclaw:mainfrom
suboss87:fix/openai-reasoning-orphan-blocks
Apr 19, 2026
Merged

fix: strip orphaned OpenAI reasoning blocks before responses API call#55787
jalehman merged 4 commits intoopenclaw:mainfrom
suboss87:fix/openai-reasoning-orphan-blocks

Conversation

@suboss87
Copy link
Copy Markdown
Contributor

Summary

  • Wire downgradeOpenAIReasoningBlocks() into the openai-responses / openai-codex-responses stream wrapper in attempt.ts
  • Prevents 400 errors when conversation history contains orphaned reasoning items

Root cause

OpenAI's Responses API requires every reasoning item to be immediately followed by a non-thinking content item (text or tool_call). After compaction, error recovery, or session restore, the conversation history can end up with orphaned reasoning blocks that lack their required following content. The API rejects these with:

400 Item 'rs_...' of type 'reasoning' was provided without its required following item.

downgradeOpenAIReasoningBlocks() already existed and was well-tested (5 unit tests) — it strips orphaned reasoning blocks from message history. However, it was only wired into the Google provider path (google.ts:641), not the OpenAI responses path. This fix adds it to the OpenAI path alongside the existing downgradeOpenAIFunctionCallReasoningPairs() call.

Changes

src/agents/pi-embedded-runner/run/attempt.ts — 2 changes:

  1. Import downgradeOpenAIReasoningBlocks (line 62)
  2. Chain it after downgradeOpenAIFunctionCallReasoningPairs in the responses API wrapper (line 1000)

Test plan

  • All 5 existing downgradeOpenAIReasoningBlocks tests pass
  • pnpm check passes (lint, format, types, boundaries)
  • Verify with gpt-5.2-chat or gpt-5.4-chat that tool calls no longer produce 400 errors after session history grows

Fixes #54534.

🤖 Generated with Claude Code

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: XS labels Mar 27, 2026
@suboss87
Copy link
Copy Markdown
Contributor Author

suboss87 commented Mar 27, 2026

High-severity regression. Blocks all gpt-5.2-chat and gpt-5.4-chat users after the first tool call. The fix is 2 lines: import + call downgradeOpenAIReasoningBlocks() (which already exists and has 5 unit tests) in the openai-responses stream wrapper.

@steipete since you just landed d7b61228e2 (tighten openai ws reasoning replay) which fixed the WebSocket path, this is the matching fix for the REST responses path. Same class of bug, different code path.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 27, 2026

Greptile Summary

This PR wires the already-tested downgradeOpenAIReasoningBlocks helper into the openai-responses/openai-codex-responses stream wrapper in attempt.ts, fixing 400 errors from the OpenAI Responses API when orphaned reasoning items appear in conversation history. The intent is correct and the unit-test coverage for the helper is solid.

Key finding:

  • The early-exit guard if (sanitized === messages) is now a dead branch. downgradeOpenAIReasoningBlocks always allocates and returns a new out array (it has no return changed ? out : messages guard), so sanitized can never be reference-equal to messages. Every stream call in the openai-responses path will now unnecessarily reconstruct the context object even when neither sanitizer changes anything. This is a real performance regression and the guard should be updated (or downgradeOpenAIReasoningBlocks should be made to return its input reference when unchanged, mirroring downgradeOpenAIFunctionCallReasoningPairs).
  • Minor: the call order here (pairreasoning) is the reverse of google.ts (reasoningpair); the two operations are independent so it doesn't affect correctness, but aligning the order would improve consistency.

Confidence Score: 4/5

Safe to merge after addressing the broken early-exit guard — the correctness fix for 400 errors is sound but introduces a performance regression on every stream call.

The core fix (chaining downgradeOpenAIReasoningBlocks) correctly solves the reported 400-error bug. However, the early-exit guard is now unconditionally bypassed because downgradeOpenAIReasoningBlocks always returns a new array, causing the no-op fast path to become dead code. This is a present defect on the changed path (unnecessary allocations and context reconstruction on every call), warranting a 4 rather than a 5.

src/agents/pi-embedded-runner/run/attempt.ts — the early-exit guard and call ordering need attention; src/agents/pi-embedded-helpers/openai.ts may also need a return changed ? out : messages guard added to downgradeOpenAIReasoningBlocks

Important Files Changed

Filename Overview
src/agents/pi-embedded-runner/run/attempt.ts Chains downgradeOpenAIReasoningBlocks after downgradeOpenAIFunctionCallReasoningPairs in the openai-responses stream wrapper. The fix is correct in intent, but the early-exit guard (sanitized === messages) is now permanently dead code because downgradeOpenAIReasoningBlocks always returns a new array — causing unnecessary context reconstruction on every API call.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 1003-1005

Comment:
**Early-exit guard is now always false**

`downgradeOpenAIReasoningBlocks` always returns a freshly allocated `out` array (see `openai.ts:210, 269`), so `sanitized === messages` is **never** `true` after this change. The no-op fast path is now permanently dead code, meaning a new context object is reconstructed on every stream call even when neither sanitizer touched the history.

Contrast with `downgradeOpenAIFunctionCallReasoningPairs`, which explicitly returns the original `messages` reference when nothing changed (`return changed ? rewrittenMessages : messages`). `downgradeOpenAIReasoningBlocks` has no equivalent guard.

The simplest fix is to update the guard to properly track whether either sanitizer did real work:

```
          const pairSanitized = downgradeOpenAIFunctionCallReasoningPairs(
            messages as AgentMessage[],
          );
          const sanitized = downgradeOpenAIReasoningBlocks(pairSanitized);
          if (pairSanitized === messages && sanitized.length === pairSanitized.length && sanitized.every((m, i) => m === pairSanitized[i])) {
            return inner(model, context, options);
          }
```

Alternatively, update `downgradeOpenAIReasoningBlocks` in `openai.ts` to mirror `downgradeOpenAIFunctionCallReasoningPairs` and return the original input reference when nothing changed.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 997-1002

Comment:
**Call order is reversed compared to `google.ts`**

In `google.ts` (line 639-643), the two sanitizers are applied in the opposite order:
```typescript
downgradeOpenAIFunctionCallReasoningPairs(
  downgradeOpenAIReasoningBlocks(sanitizedCompactionUsage),
)
```
Here `downgradeOpenAIReasoningBlocks` runs first. In `attempt.ts` the order is flipped. Both orderings are functionally equivalent today (the two functions act on independent data — block content vs. tool-call IDs — and neither creates new orphans for the other to clean up), but the inconsistency could be confusing to future readers and makes it harder to extract a shared helper. Worth aligning with `google.ts` for consistency.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix: strip orphaned OpenAI reasoning blo..." | Re-trigger Greptile

Comment on lines 1003 to 1005
if (sanitized === messages) {
return inner(model, context, options);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Early-exit guard is now always false

downgradeOpenAIReasoningBlocks always returns a freshly allocated out array (see openai.ts:210, 269), so sanitized === messages is never true after this change. The no-op fast path is now permanently dead code, meaning a new context object is reconstructed on every stream call even when neither sanitizer touched the history.

Contrast with downgradeOpenAIFunctionCallReasoningPairs, which explicitly returns the original messages reference when nothing changed (return changed ? rewrittenMessages : messages). downgradeOpenAIReasoningBlocks has no equivalent guard.

The simplest fix is to update the guard to properly track whether either sanitizer did real work:

          const pairSanitized = downgradeOpenAIFunctionCallReasoningPairs(
            messages as AgentMessage[],
          );
          const sanitized = downgradeOpenAIReasoningBlocks(pairSanitized);
          if (pairSanitized === messages && sanitized.length === pairSanitized.length && sanitized.every((m, i) => m === pairSanitized[i])) {
            return inner(model, context, options);
          }

Alternatively, update downgradeOpenAIReasoningBlocks in openai.ts to mirror downgradeOpenAIFunctionCallReasoningPairs and return the original input reference when nothing changed.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 1003-1005

Comment:
**Early-exit guard is now always false**

`downgradeOpenAIReasoningBlocks` always returns a freshly allocated `out` array (see `openai.ts:210, 269`), so `sanitized === messages` is **never** `true` after this change. The no-op fast path is now permanently dead code, meaning a new context object is reconstructed on every stream call even when neither sanitizer touched the history.

Contrast with `downgradeOpenAIFunctionCallReasoningPairs`, which explicitly returns the original `messages` reference when nothing changed (`return changed ? rewrittenMessages : messages`). `downgradeOpenAIReasoningBlocks` has no equivalent guard.

The simplest fix is to update the guard to properly track whether either sanitizer did real work:

```
          const pairSanitized = downgradeOpenAIFunctionCallReasoningPairs(
            messages as AgentMessage[],
          );
          const sanitized = downgradeOpenAIReasoningBlocks(pairSanitized);
          if (pairSanitized === messages && sanitized.length === pairSanitized.length && sanitized.every((m, i) => m === pairSanitized[i])) {
            return inner(model, context, options);
          }
```

Alternatively, update `downgradeOpenAIReasoningBlocks` in `openai.ts` to mirror `downgradeOpenAIFunctionCallReasoningPairs` and return the original input reference when nothing changed.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +997 to +1002
const pairSanitized = downgradeOpenAIFunctionCallReasoningPairs(
messages as AgentMessage[],
);
// Also strip orphaned reasoning blocks that lack a required following
// content item — OpenAI rejects these with a 400 error.
const sanitized = downgradeOpenAIReasoningBlocks(pairSanitized);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Call order is reversed compared to google.ts

In google.ts (line 639-643), the two sanitizers are applied in the opposite order:

downgradeOpenAIFunctionCallReasoningPairs(
  downgradeOpenAIReasoningBlocks(sanitizedCompactionUsage),
)

Here downgradeOpenAIReasoningBlocks runs first. In attempt.ts the order is flipped. Both orderings are functionally equivalent today (the two functions act on independent data — block content vs. tool-call IDs — and neither creates new orphans for the other to clean up), but the inconsistency could be confusing to future readers and makes it harder to extract a shared helper. Worth aligning with google.ts for consistency.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run/attempt.ts
Line: 997-1002

Comment:
**Call order is reversed compared to `google.ts`**

In `google.ts` (line 639-643), the two sanitizers are applied in the opposite order:
```typescript
downgradeOpenAIFunctionCallReasoningPairs(
  downgradeOpenAIReasoningBlocks(sanitizedCompactionUsage),
)
```
Here `downgradeOpenAIReasoningBlocks` runs first. In `attempt.ts` the order is flipped. Both orderings are functionally equivalent today (the two functions act on independent data — block content vs. tool-call IDs — and neither creates new orphans for the other to clean up), but the inconsistency could be confusing to future readers and makes it harder to extract a shared helper. Worth aligning with `google.ts` for consistency.

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@suboss87
Copy link
Copy Markdown
Contributor Author

suboss87 commented Mar 27, 2026

Addressed both Greptile findings in a1ac473:

  1. Early-exit guard fixed: downgradeOpenAIReasoningBlocks now returns the original input reference when no messages changed (return anyChanged ? out : messages), matching the pattern in downgradeOpenAIFunctionCallReasoningPairs. The sanitized === messages fast path is no longer dead code.

  2. Call order aligned with google.ts: reasoning blocks stripped first, then function-call pairs, consistent with the existing Google provider path.

All 5 reasoning block tests pass. pnpm check clean.

@suboss87 suboss87 force-pushed the fix/openai-reasoning-orphan-blocks branch 4 times, most recently from fd3c2cf to 7a652f4 Compare March 29, 2026 10:56
@suboss87 suboss87 force-pushed the fix/openai-reasoning-orphan-blocks branch from 7a652f4 to d2fd8d7 Compare April 11, 2026 09:31
@suboss87
Copy link
Copy Markdown
Contributor Author

Rebased on latest main to refresh CI. All review feedback from Greptile was addressed in commit a1ac473 (early-exit guard, call order alignment). Ready for maintainer review.

@suboss87
Copy link
Copy Markdown
Contributor Author

suboss87 commented Apr 14, 2026

@jalehman related to your recent #66167 (recovering reasoning-only OpenAI turns). This fixes the other half: orphaned reasoning blocks that survive compaction/pruning cause OpenAI responses API to reject with 400.

Fix: Call the existing downgradeOpenAIReasoningBlocks() (which was defined but never invoked) alongside the already-called downgradeOpenAIFunctionCallReasoningPairs() in the OpenAI responses wrapper.

All 32 CI checks green. Greptile feedback addressed.

@suboss87 suboss87 force-pushed the fix/openai-reasoning-orphan-blocks branch from d2fd8d7 to 17dbce8 Compare April 19, 2026 00:29
@jalehman jalehman self-assigned this Apr 19, 2026
@jalehman jalehman force-pushed the fix/openai-reasoning-orphan-blocks branch 3 times, most recently from 2fcd636 to ec3d05c Compare April 19, 2026 03:20
suboss87 and others added 4 commits April 19, 2026 00:55
Wire downgradeOpenAIReasoningBlocks into the openai-responses stream
wrapper alongside the existing function-call pairing sanitizer. This
prevents 400 errors when conversation history contains reasoning items
without their required following content block (e.g. after compaction,
error recovery, or session restore).

The function already existed and was tested but was only called in the
Google provider path, not the OpenAI responses path.

Fixes openclaw#54534.
…unchanged

Address Greptile review feedback:
1. Add anyChanged guard so the function returns the original input
   array when no messages were modified, matching the pattern used
   by downgradeOpenAIFunctionCallReasoningPairs. This preserves the
   early-exit fast path in the stream wrapper.
2. Align call order with google.ts: reasoning blocks first, then
   function-call pairs.
@jalehman jalehman force-pushed the fix/openai-reasoning-orphan-blocks branch from ec3d05c to 263b952 Compare April 19, 2026 07:56
@jalehman jalehman merged commit 6682b12 into openclaw:main Apr 19, 2026
29 of 30 checks passed
@jalehman
Copy link
Copy Markdown
Contributor

Merged via squash.

Thanks @suboss87!

jinon86 pushed a commit to jinon86/openclaw that referenced this pull request Apr 19, 2026
* test: stabilize standalone Parallels smoke lanes

* fix: strip orphaned OpenAI reasoning blocks before responses API call (openclaw#55787)

Merged via squash.

Prepared head SHA: 263b952
Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman

* fix: stabilize release smoke reruns

* test: tolerate empty fireworks live responses

* test: make OpenWebUI smoke deterministic

* tasks: extract detached task lifecycle runtime (openclaw#68886)

* tasks: extract detached task lifecycle runtime

* tests: relax gateway seam expectation

---------

Co-authored-by: Mariano Belinky <mariano@mb-server-643.local>

* test: complete workspace setup in update smokes

* fix: parse PowerShell cron tools allow-list (openclaw#68858) (thanks @chen-zhang-cs-code)

* fix(cron): parse PowerShell tools allow list

* fix(cron): clarify tools allow-list help

* fix: parse PowerShell cron tools allow-list (openclaw#68858) (thanks @chen-zhang-cs-code)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>

* fix(browser): discover CDP websocket from bare ws:// URL before attach (openclaw#68715)

* fix(browser): discover CDP websocket from bare ws:// URL before attach

When browser.cdpUrl is set to a bare ws://host:port (no /devtools/ path), ensureBrowserAvailable would call isChromeReachable -> canOpenWebSocket against the URL verbatim. Chrome only accepts WebSocket upgrades at the specific path returned by /json/version, so the handshake failed immediately with HTTP 400. With attachOnly: true, that surfaced as:

  Browser attachOnly is enabled and profile "openclaw" is not running.

even though the CDP endpoint was reachable and the profile was healthy. Reproduced by the new tests in chrome.test.ts and cdp.test.ts (openclaw#68027).

Fix: introduce isDirectCdpWebSocketEndpoint(url) — true only when a ws/wss URL has a /devtools/<kind>/<id> handshake path. Route any other ws/wss cdpUrl (including the bare ws://host:port shape) through HTTP /json/version discovery by normalising the scheme via the existing normalizeCdpHttpBaseForJsonEndpoints helper. Apply this in isChromeReachable, getChromeWebSocketUrl, and createTargetViaCdp. Direct WS endpoints with a /devtools/ path are still opened without an extra discovery round-trip.

Fixes openclaw#68027

* test(browser): add seeded fuzz coverage for CDP URL helpers

Adds property-based / seeded-fuzz tests for the URL helpers the
attachOnly CDP fix depends on (openclaw#68027):

  - isWebSocketUrl
  - isDirectCdpWebSocketEndpoint
  - normalizeCdpHttpBaseForJsonEndpoints
  - parseBrowserHttpUrl
  - redactCdpUrl
  - appendCdpPath
  - getHeadersWithAuth

Follows the existing repo convention (see
src/gateway/http-common.fuzz.test.ts): no fast-check dep, small
mulberry32 PRNG + hand-rolled generators, deterministic per-describe
seeds so failures are reproducible.

Lifts cdp.helpers.ts coverage from 77.77% -> 89.54% statements,
67.9% -> 80.24% branches, 78% -> 90% lines. Remaining uncovered
lines are inside the WS sender internals (createCdpSender,
withCdpSocket, fetchCdpChecked rate-limit branch), which require
integration-style mocks and are unrelated to the attachOnly fix.

* test(browser): drive cdp.helpers/cdp/chrome to 100% coverage

Lifts the three files touched by the openclaw#68027 attachOnly fix to 100% statements/branches/functions/lines across the extensions test suite. Adds cdp.helpers.internal.test.ts, cdp.internal.test.ts, and chrome.internal.test.ts covering error paths, branch matrices, CDP session helpers, Chrome spawn/launch/stop flows, and canRunCdpHealthCommand. Defensively unreachable guards are annotated with c8 ignore + inline justifications.

* fix(browser): restore WS fallback for non-/devtools ws:// CDP URLs

When /json/version discovery is unavailable (or returns no
webSocketDebuggerUrl), fall back to treating the original bare ws/wss
URL as a direct WebSocket endpoint. This preserves the openclaw#68027 fix for
Chrome's debug port while restoring compatibility with Browserless/
Browserbase-style providers that expose a direct WebSocket root without
a /json/version endpoint.

Priority order for bare ws/wss cdpUrl inputs:
  1. /devtools/<kind>/<id> URL \u2192 direct handshake, no discovery (unchanged)
  2. bare ws/wss root \u2192 try HTTP discovery first; if discovery returns a
     webSocketDebuggerUrl use it; otherwise fall back to the original URL
     as a direct WS endpoint
  3. HTTP/HTTPS URL \u2192 HTTP discovery only, no fallback (unchanged)

Affected call sites: isChromeReachable, getChromeWebSocketUrl,
createTargetViaCdp.

Also renames a misleading test ('still enforces SSRF policy for direct
WebSocket URLs') to accurately describe what it tests: SSRF enforcement
on the navigation target URL, not on the CDP endpoint.

New tests added for all three fallback paths. Coverage remains 100% on
all three touched files (238 tests).

* fix: browser attachOnly bare ws CDP follow-ups (openclaw#68715) (thanks @visionik)

* test(tasks): align detached runtime mock return types

* browser: route existing-session user profile through browser nodes (openclaw#68891)

* browser: route user profile through browser nodes

* browser: align existing-session node docs

* browser: preserve host fallback on node discovery errors

* browser: preserve configured node pin errors

* browser: widen config mock in node pin test

* fix: default kimi thinking to off (openclaw#68907)

Co-authored-by: termtek <termtek@ubuntu.tail2b72cd.ts.net>

* docs(changelog): note kimi thinking default fix

* docs(changelog): add thanks for kimi fix

* tasks: add detached runtime plugin registration contract (openclaw#68915)

* tasks: register detached runtime plugins

* tasks: harden detached runtime ownership

* tasks: extract detached runtime contract types

* changelog: note detached runtime contract

* changelog: attribute detached runtime contract

* feat(ui): add a2a operator dashboard shell

* fix(ui): align a2a baseline with current read contract

* feat(ui): add A2A case inspector panels

* CI: route small workflows to ubuntu-latest

* CI: route fork-blocked workflows to GitHub-hosted runners

* CI: fall back to GITHUB_TOKEN for fork automation

* CI: skip app-token steps when fork secrets are absent

* CI: gate fork app-token steps through env guards

* CI: fix stale fallback env guard

* fix(ui): sync A2A locale snapshots

* fix(ui): sync A2A locale metadata

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
Co-authored-by: Subash Natarajan <suboss87@gmail.com>
Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Co-authored-by: Mariano <132747814+mbelinky@users.noreply.github.com>
Co-authored-by: Mariano Belinky <mariano@mb-server-643.local>
Co-authored-by: ZC <chenzhangcode@163.com>
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
Co-authored-by: Viz <visionik@pobox.com>
Co-authored-by: Frank Yang <frank.ekn@gmail.com>
Co-authored-by: termtek <termtek@ubuntu.tail2b72cd.ts.net>
Co-authored-by: bangtong-ai <bangtong-ai@users.noreply.github.com>
Co-authored-by: OpenClaw Agent <openclaw-agent@users.noreply.github.com>
Co-authored-by: OpenClaw Bot <bot@openclaw.local>
Mquarmoc pushed a commit to Mquarmoc/openclaw that referenced this pull request Apr 20, 2026
…openclaw#55787)

Merged via squash.

Prepared head SHA: 263b952
Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026
…openclaw#55787)

Merged via squash.

Prepared head SHA: 263b952
Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
…openclaw#55787)

Merged via squash.

Prepared head SHA: 263b952
Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
…openclaw#55787)

Merged via squash.

Prepared head SHA: 263b952
Co-authored-by: suboss87 <11032439+suboss87@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: XS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: 400 Item 'rs_0f6f7e706597875e0069c3fc42bb7881909157c43bde349e77' of type 'reasoning' was provided without its required following item.

2 participants