fix(deepseek): subclass ProviderProfile so reasoning_effort + thinking reach the API #25301
Closed
Unveiling9559 wants to merge 1 commit into
The DeepSeek plugin shipped a bare ProviderProfile with no overrides for
build_api_kwargs_extras or build_extra_body. Once register_provider() is
called, ChatCompletionsTransport.build_kwargs takes the profile path
(chat_completions.py: `if _profile: return ...` short-circuit) and never
reaches the legacy is_kimi-style flag handling that lives below it. So
with the profile returning ({}, {}) from build_api_kwargs_extras, neither
top-level reasoning_effort nor extra_body.thinking is emitted.
Net effect: `agent.reasoning_effort` is dead air for any user with
`provider: deepseek`. DeepSeek's API still applies its server-side
defaults (effort=high, thinking=enabled) so requests succeed, but the
config value is inert.
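A minimal sketch of the short-circuit described above, under stated assumptions: `BareProfile`, `build_kwargs`, and the simplified legacy branch are hypothetical stand-ins, not the actual code in chat_completions.py. It only illustrates why a profile that returns `({}, {})` causes the configured effort to vanish.

```python
from typing import Any, Optional


class BareProfile:
    """Hypothetical stand-in for a ProviderProfile with no overrides."""

    def build_api_kwargs_extras(
        self, reasoning_config: Optional[dict]
    ) -> tuple[dict, dict]:
        # No-op base behaviour: nothing for extra_body, nothing top-level.
        return {}, {}


def build_kwargs(profile: Optional[Any], reasoning_config: Optional[dict]) -> dict:
    """Simplified illustration of the transport's two code paths."""
    kwargs: dict = {"model": "example-model"}  # placeholder model name

    if profile:
        # Profile path: whatever the profile returns is all that gets added.
        extra_body, top_level = profile.build_api_kwargs_extras(reasoning_config)
        kwargs.update(top_level)
        if extra_body:
            kwargs["extra_body"] = extra_body
        return kwargs  # short-circuit: the legacy handling below never runs

    # Legacy path (assumed shape): flag-style handling that forwards the effort.
    if reasoning_config and reasoning_config.get("effort"):
        kwargs["reasoning_effort"] = reasoning_config["effort"]
    return kwargs


with_profile = build_kwargs(BareProfile(), {"enabled": True, "effort": "xhigh"})
without_profile = build_kwargs(None, {"enabled": True, "effort": "xhigh"})
```

With the bare profile registered, `with_profile` carries no `reasoning_effort` key, while the legacy path in `without_profile` still forwards it.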
Subclass ProviderProfile with a real build_api_kwargs_extras that emits
both fields the DeepSeek native API documents at
https://api-docs.deepseek.com/guides/thinking_mode:
- top-level reasoning_effort (passed verbatim; DeepSeek's server maps
low/medium -> high and xhigh -> max per the docs footnote)
- extra_body.thinking = {"type": "enabled"|"disabled"}
The hermes "minimal" effort is remapped to "low" because DeepSeek's API
rejects "minimal" with HTTP 400 ("unknown variant `minimal`, expected
one of `high`, `low`, `medium`, `max`, `xhigh`"). When reasoning_config
is None, both fields are omitted so users without `agent.reasoning_effort`
set get DeepSeek's API defaults unchanged. When thinking is disabled,
reasoning_effort is omitted to avoid the documented 400 conflict
("thinking options type cannot be disabled when reasoning_effort is
set").
Stale `aliases=("deepseek-chat",)` and `fallback_models=("deepseek-chat",
"deepseek-reasoner")` are also updated — the real models DeepSeek
currently serves are deepseek-v4-pro / deepseek-v4-flash. Verified via
GET https://api.deepseek.com/v1/models on 2026-05-13.
Live API verification (2026-05-13):
- reasoning_effort=xhigh + extra_body.thinking={"type":"enabled"}
  -> 121 reasoning tokens (vs ~50 at server defaults)
- 3-sub-turn tool-call flow (get_date -> get_weather -> answer):
  all 200 OK, reasoning_content round-tripped via existing
  _copy_reasoning_content_for_api logic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Automated hermes-sweeper review: this DeepSeek profile-path fix is already implemented on current main by the canonical maintainer PR #26648. Evidence:
Thanks for the clear diagnosis here; the same root fix has landed on main.
Summary
- The DeepSeek plugin ships a bare `ProviderProfile`, so `build_api_kwargs_extras` returns `({}, {})` and `agent.reasoning_effort` never reaches the API.
- Subclass it to emit `reasoning_effort` (top-level) and `extra_body.thinking` per DeepSeek's documented thinking-mode contract.
- Update the stale `aliases` / `fallback_models` (`deepseek-chat` / `deepseek-reasoner` → `deepseek-v4-pro` / `deepseek-v4-flash`).
Why
The transport's profile path (`agent/transports/chat_completions.py`, the `if _profile: return _build_kwargs_from_profile(...)` short-circuit) means registered providers bypass the legacy `is_kimi` / `is_tokenhub`-style flag handling. Since DeepSeek is registered as a bare profile, it inherits the no-op base `build_api_kwargs_extras`, so an `agent.reasoning_effort: xhigh` config is silently dropped for `provider: deepseek`. DeepSeek's API still answers (server-side default of `effort=high, thinking=enabled`), so reasoning appears to work while the config value is inert.
How
Match the OpenRouter / Kimi-coding / Qwen-OAuth pattern: subclass `ProviderProfile` and override `build_api_kwargs_extras` to return `(extra_body_additions, top_level_kwargs)`.
Per the live API:
- `reasoning_effort` enum = `{low, medium, high, max, xhigh}`: pass through verbatim (server maps `low|medium` → `high`, `xhigh` → `max` per the docs footnote).
- `thinking.type` enum = `{adaptive, enabled, disabled}`.
- `thinking.type=disabled` + `reasoning_effort` set → 400 conflict, so omit `reasoning_effort` when thinking is disabled.
- `minimal` effort is not in DeepSeek's enum → remap to `low`.
Test plan
- `reasoning_config={"enabled": True, "effort": "xhigh"}` → emitted body has `reasoning_effort=xhigh` + `extra_body={"thinking":{"type":"enabled"}}` → 121 reasoning tokens (vs ~50 default).
- 3-sub-turn tool-call flow (`get_date` → `get_weather` → final answer), all 200 OK.
- `{"enabled": False}` → only `extra_body.thinking={"type":"disabled"}`, no `reasoning_effort` (avoids 400).
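The mapping rules and test-plan cases above can be sketched as a standalone function. This is an illustration under stated assumptions, not the PR's actual diff: `deepseek_extras` is a hypothetical name, the real override lives on a `ProviderProfile` subclass whose exact signature is not shown here, and treating a config without an `enabled` key as enabled is an assumption.

```python
from typing import Any, Optional


def deepseek_extras(
    reasoning_config: Optional[dict[str, Any]],
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Return (extra_body_additions, top_level_kwargs) for a DeepSeek request.

    Hypothetical free-function version of a build_api_kwargs_extras override.
    """
    extra_body: dict[str, Any] = {}
    top_level: dict[str, Any] = {}

    # No reasoning config: emit nothing so DeepSeek's server defaults apply.
    if reasoning_config is None:
        return extra_body, top_level

    # Assumption: a config without an explicit "enabled" key counts as enabled.
    if not reasoning_config.get("enabled", True):
        # Thinking disabled: omit reasoning_effort to avoid the 400 conflict.
        extra_body["thinking"] = {"type": "disabled"}
        return extra_body, top_level

    extra_body["thinking"] = {"type": "enabled"}

    effort = reasoning_config.get("effort")
    if effort:
        # "minimal" is not in DeepSeek's enum, so remap it to "low";
        # every other value is passed through verbatim.
        top_level["reasoning_effort"] = "low" if effort == "minimal" else effort

    return extra_body, top_level


xhigh_case = deepseek_extras({"enabled": True, "effort": "xhigh"})
disabled_case = deepseek_extras({"enabled": False, "effort": "high"})
minimal_case = deepseek_extras({"enabled": True, "effort": "minimal"})
unset_case = deepseek_extras(None)
```

Returning the pair in `(extra_body_additions, top_level_kwargs)` order mirrors the contract stated in the How section; the caller is assumed to merge the first dict into `extra_body` and the second into the top-level request kwargs.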