Skip to content

fix(agents): preserve reasoning replay from model metadata#88617

Merged
steipete merged 1 commit into
mainfrom
fix/reasoning-metadata-replay
May 31, 2026
Merged

fix(agents): preserve reasoning replay from model metadata#88617
steipete merged 1 commit into
mainfrom
fix/reasoning-metadata-replay

Conversation

@steipete

Copy link
Copy Markdown
Contributor

Summary

  • Preserve OpenAI-compatible replay reasoning when the selected model metadata already has reasoning: true.
  • Keep stock OpenAI, Gemma 4, and OpenRouter Anthropic/xAI replay safeguards from receiving unsupported reasoning fields.
  • Document the transcript hygiene behavior for custom/self-hosted reasoning models.

Fixes #88068.
Replaces #88071 with a no-new-config fix.

Verification

  • node scripts/run-vitest.mjs src/agents/transcript-policy.test.ts src/agents/embedded-agent-runner.sanitize-session-history.test.ts src/agents/openai-transport-stream.test.ts
  • pnpm exec oxfmt --check --threads=1 src/agents/transcript-policy.ts src/agents/transcript-policy.test.ts src/agents/embedded-agent-runner.sanitize-session-history.test.ts src/agents/openai-transport-stream.ts src/agents/openai-transport-stream.test.ts
  • git diff --check
  • .agents/skills/autoreview/scripts/autoreview --mode local

Real behavior proof

Behavior addressed: custom or self-hosted OpenAI-compatible reasoning models no longer lose replayed reasoning_content solely because their model ID is not in a core whitelist.
Real environment tested: local focused Vitest harness for transcript policy, session-history sanitization, and OpenAI-compatible request serialization.
Exact steps or command run after this patch: node scripts/run-vitest.mjs src/agents/transcript-policy.test.ts src/agents/embedded-agent-runner.sanitize-session-history.test.ts src/agents/openai-transport-stream.test.ts
Evidence after fix: the regression tests preserve prior thinking blocks through session sanitization and preserve outbound reasoning_content for a custom Qwen-style openai-completions model with reasoning: true.
Observed result after fix: 3 test files passed, 330 tests passed.
What was not tested: live llama.cpp or vLLM server acceptance; this is covered with deterministic serialization and replay-policy tests rather than a live provider endpoint.

@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation agents Agent runtime and tooling size: S maintainer Maintainer-authored PR labels May 31, 2026
@steipete steipete merged commit cf315dd into main May 31, 2026
163 of 171 checks passed
@steipete steipete deleted the fix/reasoning-metadata-replay branch May 31, 2026 12:41

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9d5adf94bc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +3530 to +3534
const provider = model.provider.trim().toLowerCase();
if (provider === "openai") {
return false;
}
return shouldPreserveOpenRouterReasoningReplay(model);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Gate metadata replay to compatible providers

For direct non-OpenAI providers with catalog reasoning: true but no reasoning_content replay contract, this now preserves assistant reasoning_content solely from metadata. The xAI plugin is one affected bundled path: its catalog marks reasoning Grok models as reasoning: true, while the provider still uses OPENAI_COMPATIBLE_REPLAY_HOOKS (extensions/xai/index.ts:180) to drop OpenAI-compatible reasoning history, and dropReasoningFromHistory intentionally preserves the current tool-call turn (src/agents/embedded-agent-runner/thinking.ts:331). After a Grok reasoning response that calls a tool, the follow-up tool-result request can therefore include assistant.reasoning_content, undoing the xAI/OpenRouter xAI replay safeguard and risking provider 400s on tool continuations; the metadata shortcut should be limited to the self-hosted/proxy routes that actually require pass-back.

Useful? React with 👍 / 👎.

@syncword

Copy link
Copy Markdown

Excellent, thanks Pete! Confirmed working now in the beta; Qwen3.6 sees its own thinking blocks. Coding working better and fewer KV cache flushes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling docs Improvements or additions to documentation maintainer Maintainer-authored PR size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: No config key to override dropReasoningFromHistory for openai-completions providers

2 participants