fix: support DeepSeek v4-flash/v4-pro thinking mode toggle and fix reasoning_content 400 error#15228
fix: support DeepSeek v4-flash/v4-pro thinking mode toggle and fix reasoning_content 400 error#15228ruxme wants to merge 1 commit into
Conversation
|
Great fix, especially the thinking mode toggle and removing the tool_calls condition for DeepSeek — that is exactly the right approach. We independently hit the same issue and opened #15237 with a few complementary fixes you might want to consider adding:
Feel free to cherry-pick whatever makes sense. Happy to close #15237 if this covers the ground. |
…nt from NousResearch#15228 - _copy_reasoning_content_for_api: split DeepSeek into separate block, use setdefault instead of tool_calls-gated assignment, so empty reasoning_content is set on ALL assistant messages (not just tool-call) - chat_completions.py build_kwargs: add is_deepseek flag + thinking mode toggle (thinking: {type: enabled|disabled}) for native DeepSeek API - _build_api_kwargs: plumb _is_deepseek through to transport params - Keep our complementary fixes: isinstance checks in _extract_reasoning, is not None in chat_completions transport, deepseek- prefix
…asoning_content 400 error
Two fixes for DeepSeek direct API (api.deepseek.com) users:
1. Thinking mode toggle: DeepSeek v4-flash/v4-pro uses
`thinking: {"type": "enabled|disabled"}` which is different from
OpenRouter's `reasoning` extra_body format. Add `is_deepseek` flag
and handler to send the correct thinking parameter, allowing users
to control thinking mode via `reasoning_effort` config.
2. Fix reasoning_content 400 error: When thinking mode is enabled,
DeepSeek requires `reasoning_content` on EVERY assistant message
in the conversation, including plain-text final responses (not just
tool-call messages). Previously only tool-call messages were
patched, causing HTTP 400 errors on multi-turn conversations.
3. Complementary fixes suggested by duouoduo:
- _extract_reasoning: use isinstance(r, str) instead of truthy
checks to ensure empty strings are captured
- chat_completions normalization: change if reasoning_content:
to if reasoning_content is not None
- reasoning_model_prefixes: add deepseek- prefix
fc3f5c2 to
e276754
Compare
|
Thanks for the detailed review and suggestions! I've applied all three complementary fixes:
Updated commit pushed to the same branch. Appreciate the help! 🙏 |
Review feedback (multi-specialist + adversarial):
1. **Tighten DeepSeek detection.** Broad `"deepseek" in model.lower()` match
wrongly fires on Bedrock (`deepseek.v3.2`), NVIDIA NIM
(`deepseek-ai/deepseek-v3.2`), HuggingFace, and third-party fine-tunes.
These providers have NOT been validated to accept the `reasoning_content`
field — injecting empty placeholders there risks regressions on paths
that were previously fine. Switch to provider+host detection mirroring
the Kimi branch shape: OpenRouter-routed `deepseek/*` models OR direct
`api.deepseek.com`. No behavior change for OpenRouter or direct users;
untested hosts are left alone.
2. **Harden reasoning_config predicate.** Adjacent code at line 7342 uses
`isinstance(self.reasoning_config, dict)` before `.get()`. The new branch
was missing that guard — would have AttributeErrored on non-dict shapes
(string, Mock). Invert to an explicit "is False" disable check and guard
the isinstance.
3. **Add trailing `return` after the DeepSeek injection.** Matches the
pattern every prior branch in this function uses. Prevents a future
branch inserted below from unconditionally firing after DeepSeek.
4. **Expand test coverage** from 6 → 12 cases:
- `test_deepseek_injects_on_tool_call_message` — the entire point of
dropping the tool_calls gate per NousResearch#15228, was untested.
- `test_deepseek_reasoning_config_without_enabled_key` — common shape
`{"effort": "high"}` with no `enabled` field.
- `test_deepseek_reasoning_config_non_dict_is_safe` — pins the
defensive isinstance guard.
- `test_direct_deepseek_api_injects` — pins `api.deepseek.com` path.
- `test_bedrock_deepseek_skips_injection` — regression: Bedrock
`deepseek.v3.2` must NOT get injection.
- `test_nvidia_nim_deepseek_skips_injection` — same for NIM.
Full suite: 994 passed, 0 failed.
|
Superseded by PR #15407 (#15354's approach, which matched the existing Kimi Thanks @ruxme for the parallel work and for independently diagnosing the same issue. A couple of notes on why we went with the other approach:
Your |
…nt from NousResearch#15228 - _copy_reasoning_content_for_api: split DeepSeek into separate block, use setdefault instead of tool_calls-gated assignment, so empty reasoning_content is set on ALL assistant messages (not just tool-call) - chat_completions.py build_kwargs: add is_deepseek flag + thinking mode toggle (thinking: {type: enabled|disabled}) for native DeepSeek API - _build_api_kwargs: plumb _is_deepseek through to transport params - Keep our complementary fixes: isinstance checks in _extract_reasoning, is not None in chat_completions transport, deepseek- prefix
Summary
Two fixes for DeepSeek direct API (
api.deepseek.com) users who usedeepseek-v4-flash/deepseek-v4-prowith thinking mode.Changes
1. DeepSeek thinking mode toggle (
is_deepseekflag +thinkingextra_body)DeepSeek v4-flash/v4-pro uses a different API format than OpenRouter for controlling thinking mode. While OpenRouter uses
extra_body["reasoning"] = {"enabled": true, "effort": "medium"}, DeepSeek's native API expects:{"thinking": {"type": "enabled"}} {"thinking": {"type": "disabled"}}This PR adds an
is_deepseekflag threaded throughrun_agent.py->chat_completions.py, which injects the correctthinkingparameter. The thinking mode is controlled by the existingreasoning_effortconfig:reasoning_effort''(empty) /'medium'thinking: {"type": "enabled"}'none'thinking: {"type": "disabled"}2. Fix
reasoning_contentHTTP 400 error on multi-turn conversationsWhen thinking mode is enabled, DeepSeek requires
reasoning_contenton every assistant message, including plain-text responses. The existing_copy_reasoning_content_for_api()only patched tool-call messages (source_msg.get("tool_calls")), causing HTTP 400 errors on multi-turn conversations where the final assistant response has no tool calls.Fix: Remove the
and source_msg.get("tool_calls")condition for the DeepSeek branch, soreasoning_contentis set (defaulting to empty string) on all assistant messages.Files changed
run_agent.py—_is_deepseekflag, passis_deepseekto transport, fix_copy_reasoning_content_for_apifor all assistant messagesagent/transports/chat_completions.py— acceptis_deepseekparam, injectthinkingextra_bodyTesting
thinking: {"type": "enabled"}→ returnsreasoning_content✅thinking: {"type": "disabled"}→ noreasoning_content✅