feat: add DeepSeek-V4 thinking mode via unified thinking_mode parameter #15577
heming-gmh wants to merge 1 commit into
Add support for DeepSeek V4-Pro/V4-Flash thinking mode when using
api.deepseek.com directly.
Design intent:
Introduce a single string-valued thinking_mode parameter to replace
the pattern of adding one boolean flag per provider. Each provider
uses exactly one branch in the transport's build_kwargs:
  thinking_mode='kimi'     → reasoning_effort + extra_body.thinking
  thinking_mode='deepseek' → reasoning_effort + extra_body.thinking
                             (different effort values: high/max)
  thinking_mode=None       → falls back to is_kimi / supports_reasoning
                             for backward compatibility
For DeepSeek:
- Sets top-level reasoning_effort: "max" (default) or "high"
- Sets extra_body: {thinking: {type: "enabled"}}
- Automatically removes temperature when thinking is active
  (DeepSeek docs: temperature/top_p unsupported with thinking)
- Can be disabled via reasoning_config: {enabled: false}
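
The DeepSeek behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name apply_deepseek_thinking and the "effort" key inside reasoning_config are hypothetical, and what the real transport sends when thinking is disabled is not stated here, so the sketch simply leaves the request untouched in that case.

```python
from typing import Any, Optional


def apply_deepseek_thinking(
    kwargs: dict[str, Any],
    reasoning_config: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Return a copy of the request kwargs with DeepSeek thinking fields applied.

    Hypothetical helper: the name and the "effort" config key are assumptions.
    """
    out = dict(kwargs)
    config = reasoning_config or {}

    # reasoning_config: {enabled: false} turns thinking off.
    # Assumption: a disabled request is passed through unchanged.
    if config.get("enabled") is False:
        return out

    # Top-level reasoning_effort: "max" by default, "high" as the alternative.
    effort = config.get("effort", "max")
    if effort not in ("high", "max"):
        effort = "max"
    out["reasoning_effort"] = effort

    # Merge into any existing extra_body instead of overwriting it.
    extra_body = dict(out.get("extra_body") or {})
    extra_body["thinking"] = {"type": "enabled"}
    out["extra_body"] = extra_body

    # temperature is dropped while thinking is active.
    out.pop("temperature", None)
    return out
```

Returning a copy keeps the caller's kwargs intact, which makes the temperature removal easy to test in isolation.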
Testing:
- 49 tests pass (41 existing + 8 new for DeepSeek + unified mode)
- Real API call verified against api.deepseek.com with the
  deepseek-v4-pro model; confirmed reasoning_content returned
Timing note: DeepSeek V4-Pro and V4-Flash were released on April 24/25, 2026, just yesterday. This PR adds first-class thinking mode support for the DeepSeek direct API (api.deepseek.com), which requires the […]. Without this change, users of […]. Tests verified against the live API ✅ (4282 tests passed, 2 pre-existing failures unrelated). Happy to make any adjustments the maintainers suggest!
Thanks for the overview, @alt-glitch! I'd be happy to collaborate with @Tranquil-Flow and the others to merge these overlapping implementations into a cohesive solution. My PR introduces a unified […]. I see that #15446 has additional valuable coverage (1M context windows, reasoning replay fixes, max-iterations path, more extensive tests) that mine doesn't address yet. I'd love to work with Tranquil-Flow to integrate the unified […]
@alt-glitch Thanks for the review and for flagging the competing PRs. I see #16448 takes the minimal approach: it's clean and solves the immediate problem. My PR (#15577) aims for the same functional goal but introduces a unified thinking_mode string parameter to prevent future provider-specific boolean flag proliferation. Each new thinking-capable provider (Kimi, DeepSeek, and any future ones) requires exactly one elif branch, with no new boolean flags and no scattered detection. The two approaches are complementary: if #16448 lands first as the hotfix, I'm happy to rebase mine on top of it as the long-term architectural abstraction. Let me know which direction you'd prefer.
Summary
Add support for DeepSeek V4-Pro / V4-Flash thinking mode when using the DeepSeek direct API (api.deepseek.com).
Design Intent
Introduce a single string-valued thinking_mode parameter to replace the pattern of adding one boolean flag per provider. Each provider uses exactly one branch in the transport's build_kwargs.
Adding a new thinking-capable provider requires only one elif in run_agent.py and one elif in chat_completions.py, with no flag proliferation.
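
To make the one-branch-per-provider idea concrete, here is a small sketch of how a dispatcher on thinking_mode might look, including the fallback to the legacy is_kimi / supports_reasoning flags when thinking_mode is None. The function name resolve_thinking_branch, the returned branch labels, and the choice to raise on an unknown mode are all assumptions for illustration; the real build_kwargs may be structured differently.

```python
from typing import Optional


def resolve_thinking_branch(
    thinking_mode: Optional[str],
    is_kimi: bool = False,
    supports_reasoning: bool = False,
) -> str:
    """Pick which provider branch should build the thinking kwargs.

    Hypothetical helper: names and return labels are illustrative only.
    """
    if thinking_mode == "kimi":
        return "kimi"
    elif thinking_mode == "deepseek":
        return "deepseek"
    # A future thinking-capable provider would add one more elif here.
    elif thinking_mode is None:
        # Backward compatibility: fall back to the older boolean flags.
        if is_kimi:
            return "kimi"
        if supports_reasoning:
            return "generic_reasoning"
        return "none"
    # Assumption: unknown modes fail loudly instead of being ignored.
    raise ValueError(f"unknown thinking_mode: {thinking_mode!r}")
```

Callers that still pass only the legacy flags keep their current behavior, while new providers are selected purely by the string value.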
DeepSeek-specific behavior
Files changed
Testing