feat(deepseek): plumb reasoning_effort and thinking toggle to V4 API #14958
Closed
zkl2333 wants to merge 1 commit into
DeepSeek V4 exposes thinking mode as an explicit API contract rather
than via the legacy deepseek-chat / deepseek-reasoner model alias trick:
- extra_body.thinking = {"type": "enabled" | "disabled"} (default enabled)
- top-level reasoning_effort = "high" | "max"
(low/medium map to high, xhigh maps to max)
- Thinking mode forbids temperature, top_p, presence_penalty,
frequency_penalty.
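The contract above can be sketched as a small request-shaping step. This is a minimal illustration, not the code from this PR: the helper name `apply_thinking_contract` and the constant `FORBIDDEN_IN_THINKING` are hypothetical, and only the field names and values come from the list above.

```python
from typing import Any

# Sampling parameters that thinking mode forbids, per the contract above.
FORBIDDEN_IN_THINKING = (
    "temperature",
    "top_p",
    "presence_penalty",
    "frequency_penalty",
)


def apply_thinking_contract(
    kwargs: dict[str, Any], thinking_enabled: bool = True
) -> dict[str, Any]:
    """Hypothetical helper: return a copy of kwargs shaped for the V4 contract."""
    out = dict(kwargs)
    # Preserve any extra_body entries the caller already set.
    extra_body = dict(out.get("extra_body") or {})
    extra_body["thinking"] = {"type": "enabled" if thinking_enabled else "disabled"}
    out["extra_body"] = extra_body
    if thinking_enabled:
        # Drop the sampling params only when thinking is on.
        for name in FORBIDDEN_IN_THINKING:
            out.pop(name, None)
    return out
```

With thinking disabled the sampling parameters pass through untouched, which is the behaviour the description expects for requests such as `temperature=0.5` on a non-thinking call.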
Add an is_deepseek branch in the chat_completions transport that mirrors
the existing Kimi pattern but applies the DeepSeek-specific effort
mapping and strips the incompatible sampling params when thinking is
enabled. Wire detection in run_agent.py via the api.deepseek.com hostname.
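Hostname-based detection could look roughly like the following. This is a sketch under assumptions: the function name `is_deepseek_base_url` is hypothetical and the actual check in run_agent.py may be structured differently; only the hostname `api.deepseek.com` comes from the description above.

```python
from urllib.parse import urlparse


def is_deepseek_base_url(base_url: str) -> bool:
    """Hypothetical helper: True when base_url points at api.deepseek.com."""
    # urlparse lowercases the hostname; it is None when the URL has no scheme.
    host = urlparse(base_url).hostname or ""
    return host == "api.deepseek.com"
```

Comparing the parsed hostname exactly, rather than searching the URL for a substring, avoids matching lookalike hosts such as `api.deepseek.com.example.org`.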
Without this plumbing, users setting reasoning_effort on DeepSeek had no
effect: the value never reached the request, and thinking mode was only
selectable by routing through the soon-to-be-deprecated "deepseek-reasoner"
alias.
Reference: https://api-docs.deepseek.com/guides/thinking_mode
Contributor
Author
Superseded by #24130: same contract rebuilt on the …
Summary
Follow-up to #14934 / #14946 / #14952. With V4 ids resolvable end-to-end, the remaining gap is that reasoning_effort is a no-op on the native DeepSeek provider: the value never reached the request, and thinking mode was only selectable via the soon-to-be-deprecated deepseek-reasoner alias trick.

Official V4 contract
DeepSeek V4 exposes thinking mode as an explicit API contract (https://api-docs.deepseek.com/guides/thinking_mode):

| Field | Values | Default |
| --- | --- | --- |
| extra_body.thinking | {"type": "enabled"} / {"type": "disabled"} | enabled |
| reasoning_effort | "high" / "max" | "high" (auto "max" for complex agents) |

Thinking mode forbids temperature, top_p, presence_penalty, frequency_penalty; these must be stripped when enabled.

Hermes → DeepSeek effort mapping
Hermes uses a 5-level vocabulary (low / medium / high / xhigh + implicit max). DeepSeek only accepts two values, so this PR maps:

| Hermes reasoning_effort | DeepSeek reasoning_effort |
| --- | --- |
| low | high |
| medium | high |
| high | high |
| xhigh | max |
| max | max |

This matches the "compatibility mapping" DeepSeek already documents for external clients.
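The mapping can be expressed as a small lookup. This is an illustrative sketch: the names `_DEEPSEEK_EFFORT_MAP` and `map_deepseek_effort` are hypothetical, and falling back to "high" for unknown inputs is an assumption (the PR mentions an unknown-effort fallback but this description does not say what it resolves to).

```python
from typing import Optional

# Hermes effort level -> DeepSeek reasoning_effort, per the mapping above.
_DEEPSEEK_EFFORT_MAP = {
    "low": "high",
    "medium": "high",
    "high": "high",
    "xhigh": "max",
    "max": "max",
}


def map_deepseek_effort(effort: Optional[str], default: str = "high") -> str:
    """Hypothetical helper: normalise a Hermes effort value for DeepSeek."""
    if not isinstance(effort, str):
        return default
    # Tolerate case and surrounding whitespace, e.g. " XHigh ".
    # Assumption: unknown values fall back to `default`.
    return _DEEPSEEK_EFFORT_MAP.get(effort.strip().lower(), default)
```

For example, `map_deepseek_effort(" XHigh ")` resolves to `"max"`, while an unrecognised value such as `"turbo"` resolves to the assumed default.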
Changes
- is_deepseek branch, effort mapping, incompatible-param strip. Structure mirrors the existing Kimi branch one screen up.
- _is_deepseek detection (hostname api.deepseek.com) + plumb through build_kwargs, symmetric with Kimi / OpenRouter / Nous / NVIDIA NIM detection.
- TestChatCompletionsDeepSeek covering default-enabled thinking, explicit disable, effort mapping (all 5 inputs + case/whitespace tolerance), unknown-effort fallback, sampling-param stripping, temperature preservation when thinking disabled, and non-is_deepseek isolation.

Test plan
- pytest tests/agent/transports/test_chat_completions.py: 54 passed (13 new + 41 pre-existing)
- api.deepseek.com:
  - deepseek-v4-pro + reasoning_effort=xhigh (→ max) + thinking enabled → HTTP 200, reasoning_content populated (167 chars, thinking mode actually ran), temperature correctly omitted
  - deepseek-v4-flash + reasoning.enabled=False → HTTP 200, reasoning_content empty (thinking off), temperature=0.5 preserved and accepted by the server