feat(deepseek): plumb reasoning_effort and thinking toggle to V4 API by zkl2333 · Pull Request #14958 · NousResearch/hermes-agent

zkl2333 · 2026-04-24T07:08:24Z

Summary

Follow-up to #14934 / #14946 / #14952. With V4 ids resolvable end-to-end, the remaining gap is that reasoning_effort is a no-op on the native DeepSeek provider: the value never reaches the request, and thinking mode is only selectable via the soon-to-be-deprecated deepseek-reasoner alias trick.

Official V4 contract

DeepSeek V4 exposes thinking mode as an explicit API contract (https://api-docs.deepseek.com/guides/thinking_mode):

| Signal | Location | Values | Default |
| --- | --- | --- | --- |
| Thinking toggle | `extra_body.thinking` | `{"type": "enabled"}` / `{"type": "disabled"}` | `enabled` |
| Effort level | top-level `reasoning_effort` | `"high"` / `"max"` | `"high"` (auto `"max"` for complex agents) |

Thinking mode forbids temperature, top_p, presence_penalty, frequency_penalty — these must be stripped when enabled.
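
A minimal sketch of how the toggle and the parameter strip could be applied to a set of request kwargs. The helper name `apply_thinking_contract` and the constant `THINKING_INCOMPATIBLE_PARAMS` are hypothetical and are not taken from the PR diff; only the field names and the list of forbidden parameters come from the contract above.

```python
from typing import Any

# Sampling parameters the contract above lists as forbidden in thinking mode.
THINKING_INCOMPATIBLE_PARAMS = (
    "temperature",
    "top_p",
    "presence_penalty",
    "frequency_penalty",
)


def apply_thinking_contract(
    kwargs: dict[str, Any], thinking_enabled: bool = True
) -> dict[str, Any]:
    """Return a copy of `kwargs` with the thinking toggle applied.

    Hypothetical helper for illustration; not the PR's actual code.
    """
    out = dict(kwargs)
    # Preserve any extra_body entries the caller already set.
    extra_body = dict(out.get("extra_body") or {})
    extra_body["thinking"] = {"type": "enabled" if thinking_enabled else "disabled"}
    out["extra_body"] = extra_body
    if thinking_enabled:
        # Strip the sampling params only when thinking mode is on.
        for name in THINKING_INCOMPATIBLE_PARAMS:
            out.pop(name, None)
    return out
```

With thinking disabled, the sampling parameters pass through unchanged, which matches the "temperature preservation when thinking disabled" case in the test list.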

Hermes → DeepSeek effort mapping

Hermes uses a 5-level vocabulary (low/medium/high/xhigh + implicit max). DeepSeek only accepts two values, so this PR maps:

| Hermes `reasoning_effort` | → DeepSeek `reasoning_effort` |
| --- | --- |
| `low` | `high` |
| `medium` | `high` |
| `high` | `high` |
| `xhigh` | `max` |
| `max` | `max` |
| (thinking disabled) | omitted |

This matches the "compatibility mapping" DeepSeek already documents for external clients.
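
The mapping can be sketched as a small lookup with input normalization. The function name `map_effort` is hypothetical. Two behaviors here are assumptions rather than facts from the PR: an unrecognized value falls back to `"high"` (the PR mentions an unknown-effort fallback but does not say what it falls back to), and a missing value returns `None` so the server default applies.

```python
from typing import Optional

# Hermes effort level -> DeepSeek effort level, per the table above.
_EFFORT_MAP = {
    "low": "high",
    "medium": "high",
    "high": "high",
    "xhigh": "max",
    "max": "max",
}


def map_effort(
    hermes_effort: Optional[str],
    thinking_enabled: bool = True,
    fallback: str = "high",  # assumption: the PR does not state the fallback value
) -> Optional[str]:
    """Return the DeepSeek reasoning_effort, or None to omit the field."""
    if not thinking_enabled:
        return None  # omitted when thinking is disabled
    if hermes_effort is None:
        return None  # assumption: leave unset and let the server default apply
    # Tolerate case and surrounding whitespace, as the test list describes.
    key = hermes_effort.strip().lower()
    return _EFFORT_MAP.get(key, fallback)
```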

Changes

- agent/transports/chat_completions.py: new is_deepseek branch, effort mapping, incompatible-param strip. Structure mirrors the existing Kimi branch one screen up.
- run_agent.py: _is_deepseek detection (hostname api.deepseek.com) + plumb through build_kwargs, symmetric with Kimi / OpenRouter / Nous / NVIDIA NIM detection.
- tests/agent/transports/test_chat_completions.py: new TestChatCompletionsDeepSeek covering default-enabled thinking, explicit disable, effort mapping (all 5 inputs + case/whitespace tolerance), unknown-effort fallback, sampling-param stripping, temperature preservation when thinking disabled, and non-is_deepseek isolation.
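
Hostname-based detection of the kind described for run_agent.py might look roughly like the sketch below. The function name `is_deepseek_base_url` is hypothetical, and the exact-match comparison is an assumption; the real `_is_deepseek` check may be looser (for example, a substring test).

```python
from urllib.parse import urlparse


def is_deepseek_base_url(base_url: str) -> bool:
    """True when the base URL's host is exactly api.deepseek.com.

    Hypothetical helper for illustration; not the PR's actual code.
    """
    # urlparse(...).hostname is lowercased, and None when no host is present.
    host = urlparse(base_url).hostname or ""
    return host == "api.deepseek.com"
```

Comparing the parsed hostname instead of searching the raw URL string avoids false positives from lookalike hosts such as `api.deepseek.com.example.org`.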

Test plan

- pytest tests/agent/transports/test_chat_completions.py: 54 passed (13 new + 41 pre-existing)
- Live against api.deepseek.com:
  - deepseek-v4-pro + reasoning_effort=xhigh (→ max) + thinking enabled → HTTP 200, reasoning_content populated (167 chars, thinking mode actually ran), temperature correctly omitted
  - deepseek-v4-flash + reasoning.enabled=False → HTTP 200, reasoning_content empty (thinking off), temperature=0.5 preserved and accepted by the server

DeepSeek V4 exposes thinking mode as an explicit API contract rather than via the legacy deepseek-chat / deepseek-reasoner model alias trick:

- extra_body.thinking = {"type": "enabled" | "disabled"} (default enabled)
- top-level reasoning_effort = "high" | "max" (low/medium map to high, xhigh maps to max)
- Thinking mode forbids temperature, top_p, presence_penalty, frequency_penalty.

Add an is_deepseek branch in the chat_completions transport that mirrors the existing Kimi pattern but applies the DeepSeek-specific effort mapping and strips the incompatible sampling params when thinking is enabled. Wire detection in run_agent.py via the api.deepseek.com hostname.

Without this plumbing, users setting reasoning_effort on DeepSeek had no effect: the value never reached the request, and thinking mode was only selectable by routing through the soon-to-be-deprecated "deepseek-reasoner" alias.

Reference: https://api-docs.deepseek.com/guides/thinking_mode

zkl2333 · 2026-05-12T02:55:55Z

Superseded by #24130 — same contract rebuilt on the provider_profile architecture (build_api_kwargs_extras override on DeepSeekProfile, mirroring KimiProfile). The bug is still real on main — the registered DeepSeek profile inherits the no-op default and reasoning_effort is silently dropped.

alt-glitch added labels type/feature (New feature or request), P1 (High: major feature broken, no workaround), comp/agent (Core agent loop, run_agent.py, prompt builder), provider/deepseek (DeepSeek API) · Apr 24, 2026

Tranquil-Flow mentioned this pull request Apr 25, 2026

fix(agent): comprehensive DeepSeek V4 support — context windows, thinking mode, reasoning replay #15446

Closed

6 tasks

This was referenced Apr 25, 2026

feat: add DeepSeek-V4 thinking mode via unified thinking_mode parameter #15577

Open

feat(agent): pass reasoning_effort + thinking to DeepSeek API #16448

Open

Skyline10124 mentioned this pull request Apr 27, 2026

fix(transport): emit deepseek-v4 thinking.type and reasoning_effort on non-OpenRouter routes #16614

Closed

12 tasks

This was referenced May 4, 2026

[Feature]: Native reasoning_effort support for NVIDIA NIM (integrate.api.nvidia.com) #19883

Open

feat(transport): plumb reasoning_effort to NVIDIA NIM #19888

Open

alt-glitch mentioned this pull request May 8, 2026

feat(opencode-go): add top-level reasoning_effort support for DeepSeek models #21577

Closed

zkl2333 mentioned this pull request May 12, 2026

feat(deepseek): plumb V4 thinking + reasoning_effort via provider profile #24130

Closed

2 tasks

zkl2333 closed this May 12, 2026

zkl2333 wants to merge 1 commit into NousResearch:main from zkl2333:feat/deepseek-v4-reasoning-effort
