fix(agents): preserve reasoning replay from model metadata#88617
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9d5adf94bc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const provider = model.provider.trim().toLowerCase(); | ||
| if (provider === "openai") { | ||
| return false; | ||
| } | ||
| return shouldPreserveOpenRouterReasoningReplay(model); |
There was a problem hiding this comment.
Gate metadata replay to compatible providers
For direct non-OpenAI providers with catalog reasoning: true but no reasoning_content replay contract, this now preserves assistant reasoning_content solely from metadata. The xAI plugin is one affected bundled path: its catalog marks reasoning Grok models as reasoning: true, while the provider still uses OPENAI_COMPATIBLE_REPLAY_HOOKS (extensions/xai/index.ts:180) to drop OpenAI-compatible reasoning history, and dropReasoningFromHistory intentionally preserves the current tool-call turn (src/agents/embedded-agent-runner/thinking.ts:331). After a Grok reasoning response that calls a tool, the follow-up tool-result request can therefore include assistant.reasoning_content, undoing the xAI/OpenRouter xAI replay safeguard and risking provider 400s on tool continuations; the metadata shortcut should be limited to the self-hosted/proxy routes that actually require pass-back.
Useful? React with 👍 / 👎.
|
Excellent, thanks Pete! Confirmed working now in the beta; Qwen3.6 sees its own thinking blocks. Coding working better and fewer KV cache flushes. |
Summary
reasoning: true.Fixes #88068.
Replaces #88071 with a no-new-config fix.
Verification
node scripts/run-vitest.mjs src/agents/transcript-policy.test.ts src/agents/embedded-agent-runner.sanitize-session-history.test.ts src/agents/openai-transport-stream.test.tspnpm exec oxfmt --check --threads=1 src/agents/transcript-policy.ts src/agents/transcript-policy.test.ts src/agents/embedded-agent-runner.sanitize-session-history.test.ts src/agents/openai-transport-stream.ts src/agents/openai-transport-stream.test.tsgit diff --check.agents/skills/autoreview/scripts/autoreview --mode localReal behavior proof
Behavior addressed: custom or self-hosted OpenAI-compatible reasoning models no longer lose replayed
reasoning_contentsolely because their model ID is not in a core whitelist.Real environment tested: local focused Vitest harness for transcript policy, session-history sanitization, and OpenAI-compatible request serialization.
Exact steps or command run after this patch:
node scripts/run-vitest.mjs src/agents/transcript-policy.test.ts src/agents/embedded-agent-runner.sanitize-session-history.test.ts src/agents/openai-transport-stream.test.tsEvidence after fix: the regression tests preserve prior thinking blocks through session sanitization and preserve outbound
reasoning_contentfor a custom Qwen-styleopenai-completionsmodel withreasoning: true.Observed result after fix: 3 test files passed, 330 tests passed.
What was not tested: live llama.cpp or vLLM server acceptance; this is covered with deterministic serialization and replay-policy tests rather than a live provider endpoint.