fix: strip response-only reasoning fields from OpenAI Completions req…#82101
Conversation
…uests Prevents providers like OpenRouter from returning HTTP 500 errors when replayed assistant messages include fields such as `reasoning_details`.
|
Codex review: needs changes before merge. Summary Reproducibility: yes. Without running tests in this read-only review, the source path is clear: the PR deletes Real behavior proof Next step before merge Security Review findings
Review detailsBest possible solution: Make the sanitizer compat-aware enough to strip unsupported stock OpenAI replay fields while leaving provider-owned Do we have a high-confidence way to reproduce the issue? Yes. Without running tests in this read-only review, the source path is clear: the PR deletes Is this the best way to solve the issue? No. The updated PR is closer than the first commit, but the sanitizer is still too broad; the narrow maintainable fix is to preserve or defer provider-owned reasoning replay fields for Full review comments:
Overall correctness: patch is incorrect Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 0e5f4ea18c29. |
|
Maintainer follow-up is pushed and verified. Changed:
Tested:
Thanks @sliverp. |

Summary
⚠️ Something went wrong while processing your request. Gateway logs showed OpenRouter returning HTTP 500 for the second and later requests, regardless of which model was selected (z-ai/glm-5.1, glm-5v-turbo, deepseek-v4-flash all reproduced).buildOpenAICompletionsParams, afterconvertMessagesandinjectToolCallThoughtSignatures, walk the outgoing message list and deletereasoning_details,reasoning_content, andreasoningfrom anyassistantmessage before it hits the wire.transport-message-transform.ts. Only the Chat Completions request builder is affected. User messages are untouched.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Real behavior proof (required for external PRs)
HTTP 500from OpenRouter on the Completions endpoint.pnpm buildand restart gateway.status=500from OpenRouter;prompt.submittedevents succeed back-to-back.reasoning_detailsand expect them on input (none known)./tmp/openclaw-bad-request-latest.json(84 KB); replaying it viacurlto OpenRouter reproduced the 500. Removing only thereasoning_detailsfield flipped the same request to 200.Root Cause (if applicable)
reasoning_details(array) andreasoning_content(string) inside responsechoices[].messageto expose the model's chain of thought. Downstream message builders preserved those fields when the assistant turn was persisted into the conversation history. On the next turn,convertMessages(pi-ai) serialized the assistant message back into the request body verbatim, so the request now carried e.g."reasoning_details": "<stringified array echo>". The Chat Completions input schema does not accept this field, and OpenRouter responds with a generic HTTP 500 instead of a 4xx parse error — which is why this looked like an upstream outage rather than a client-side bug.convertMessagesand the wire to drop response-only fields from input messages. We hadstripCompletionMessagesToRoleContentforstrictMessageKeysproviders, but the default path let anything through.How the reproducing curl was produced
To rule out a transport bug vs. a body bug, I added temporary instrumentation in
src/agents/provider-transport-fetch.tsthat wrote every outgoing request body to/tmp/openclaw-bad-request-latest.jsonalong with status, headers, and the response body. After triggering one bad turn from QQ, that file held the exact JSON body the gateway had POSTed.I then replayed it as-is:
Then I ran a binary search over the message list / fields:
messages[2](the assistant turn with the echoed reasoning) → 200.messages[2]but removereasoning_detailsonly → 200.messages[2]but coercereasoning_detailsfrom astringback to its originalarrayshape → 200.reasoning_details(string) and a minimal user turn → 500.That isolated
reasoning_details(as a stringified echo) as the sole trigger, and was the basis for the strip list (reasoning_details,reasoning_content,reasoning). The instrumentation was reverted; only the sanitizer ships.Regression Test Plan (if applicable)
src/agents/openai-transport-stream.test.ts— newdescribe("buildOpenAICompletionsParams strips response-only reasoning fields")block (2 cases).reasoning_details/reasoning_content/reasoning(any of the three) must produce a request whose corresponding message has none of those fields.usermessage must be left alone (they are user input, not response echo).buildOpenAICompletionsParamsexercises it deterministically without needing a network or a live provider. Higher-level tests would only flake or require provider mocks.User-visible / Behavior Changes
QQ-bot (and any other channel that hits the OpenAI Completions transport) no longer fails with
⚠️ Something went wrong …from the second turn onward. No config or default change.Diagram (if applicable)
Security Impact (required)
Yes, explain risk + mitigation: N/ARepro + Verification
Environment
strictMessageKeysoverrideSteps
Expected
Actual
Evidence
prompt.submittedafter).Human Verification (required)
curlreplay of the captured request body before vs. after strippingreasoning_details(500 → 200).stringandarrayshapes ofreasoning_detailsare handled (we delete the field outright).reasoning_detailsas input (none known).Review Conversations
Compatibility / Migration
Risks and Mitigations
reasoning_detailson input as part of a thought-replay protocol; stripping would silently lose context.strictMessageKeysis wired). Documented in the inline comment above the strip function.COMPLETIONS_RESPONSE_ONLY_MESSAGE_FIELDS; adding a name is a one-line change. The proper long-term fix is to make pi-ai'sconvertMessagesdrop these at the serializer level.