fix(agent): preserve reasoning_content replay on DeepSeek v4 + Kimi/Moonshot thinking#18045
Merged
Conversation
Builds on #16855 (@lsdsjy) which fixed DeepSeek v4 reasoning_content replay via model_extra fallback + capturing tool_calls at method entry. Kimi / Moonshot thinking mode enforces the same echo-back contract and hits the same 400 when a tool-call turn is persisted without reasoning_content. - _build_assistant_message: pad branch now uses _needs_thinking_reasoning_pad() (DeepSeek OR Kimi) instead of _needs_deepseek_tool_reasoning() alone. - Extract _needs_thinking_reasoning_pad() and reuse it in _copy_reasoning_content_for_api so both sites share one predicate. - tests/run_agent/test_deepseek_reasoning_content_echo.py: add TestBuildAssistantMessagePadsStrictProviders parametrized over DeepSeek (attr=None, attr-absent), Kimi (attr=None), Moonshot (via base_url), and an OpenRouter negative control that must NOT pad. Proven to fail 2/5 cases on Kimi/Moonshot without this change. - scripts/release.py: add AUTHOR_MAP entries for lsdsjy and season179. Refs #17400. Co-authored-by: season179 <season.saw@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
DeepSeek v4 thinking mode (and Kimi / Moonshot thinking) stop 400'ing on multi-turn tool-call replays with "The reasoning_content in the thinking mode must be passed back to the API." Fixes #17400.
Root cause
run_agent.py::_build_assistant_messagehad a pad branch guarded bymsg.get("tool_calls"), which was always falsy becausetool_callswere assigned ~60 lines later in the same method. When DeepSeek returnedreasoning_content=Noneon a tool-call turn and streaming captured no thinking text, the turn was persisted bare; the next replay hit the 400. Same enforcement exists on Kimi / Moonshot, reachable through the same code path. A secondary hole: when the OpenAI SDK doesn't know a provider's schema (aggregator passthrough like OpenCode Go → DeepSeek),reasoning_contentlands inmodel.model_extrainstead of a typed attribute and the builder never sees it.Changes
Salvages two open PRs:
assistant_tool_callsat method entry so the pad check reads the SDK source of truth, falls back tomodel.model_extra["reasoning_content"]when the typed attr is absent (covers aggregator paths like OpenCode Go), and mirrors themodel_extrafallback in thechat_completionstransport normalizer. Usesreasoning_text or ""so captured streaming reasoning is preserved when padding._needs_thinking_reasoning_pad()helper that's reused in_copy_reasoning_content_for_api(dedupes thedeepseek or kimipredicate across both sites).Follow-ups added here:
scripts/release.py: AUTHOR_MAP entries forlsdsjyandseason179._ATTR_ABSENT,_EXPECT_NOT_PRESENT,_sdk_tool_call,_build_sdk_message) from fix(agent): pad reasoning_content on DeepSeek/Kimi tool-call turns #17489 added alongside fix(deepseek): preserve chat completions reasoning_content replay #16855'sTestBuildAssistantMessageDeepSeekReasoningContent.Closes #16855. Closes #17489. Closes #17400.
Validation
test_deepseek_reasoning_content_echo.pytest_deepseek_reasoning_content_echo.pytest_deepseek_reasoning_content_echo.py+test_chat_completions.pytests/run_agent/ tests/agent/transports/The targeted empirical check (stash + rerun) proves the new Kimi/Moonshot cases exercise the extension on top of #16855, not trivially pass. The 3 DeepSeek parametrized cases pass in both scenarios because they were already fixed by the #16855 cherry-pick.
Credits
Co-authored-by: lsdsjy luwinyang@deepseek.com
Co-authored-by: season179 season.saw@gmail.com