fix: support DeepSeek v4-flash/v4-pro thinking mode toggle and fix reasoning_content 400 error by ruxme · Pull Request #15228 · NousResearch/hermes-agent

ruxme · 2026-04-24T16:02:26Z

Summary

Two fixes for DeepSeek direct API (api.deepseek.com) users who use deepseek-v4-flash / deepseek-v4-pro with thinking mode.

Changes

1. DeepSeek thinking mode toggle (`is_deepseek` flag + `thinking` extra_body)

DeepSeek v4-flash/v4-pro uses a different API format than OpenRouter for controlling thinking mode. While OpenRouter uses extra_body["reasoning"] = {"enabled": true, "effort": "medium"}, DeepSeek's native API expects:

{"thinking": {"type": "enabled"}}
{"thinking": {"type": "disabled"}}

This PR adds an is_deepseek flag threaded through run_agent.py -> chat_completions.py, which injects the correct thinking parameter. The thinking mode is controlled by the existing reasoning_effort config:

`reasoning_effort`	Result
`''` (empty) / `'medium'`	`thinking: {"type": "enabled"}`
`'none'`	`thinking: {"type": "disabled"}`

2. Fix `reasoning_content` HTTP 400 error on multi-turn conversations

When thinking mode is enabled, DeepSeek requires reasoning_content on every assistant message, including plain-text responses. The existing _copy_reasoning_content_for_api() only patched tool-call messages (source_msg.get("tool_calls")), causing HTTP 400 errors on multi-turn conversations where the final assistant response has no tool calls.

Fix: Remove the and source_msg.get("tool_calls") condition for the DeepSeek branch, so reasoning_content is set (defaulting to empty string) on all assistant messages.

Files changed

run_agent.py — _is_deepseek flag, pass is_deepseek to transport, fix _copy_reasoning_content_for_api for all assistant messages
agent/transports/chat_completions.py — accept is_deepseek param, inject thinking extra_body

Testing

Verified DeepSeek API accepts thinking: {"type": "enabled"} → returns reasoning_content ✅
Verified DeepSeek API accepts thinking: {"type": "disabled"} → no reasoning_content ✅
Verified multi-turn conversations no longer get 400 errors ✅

alt-glitch · 2026-04-24T16:10:29Z

Related to #14958 (plumbs reasoning_effort/thinking toggle to V4 API) and #15213 (reasoning_content broken in auxiliary paths). This PR overlaps significantly with #14958 — maintainers should decide which to merge or combine.

duouoduo · 2026-04-24T16:25:50Z

Great fix, especially the thinking mode toggle and removing the tool_calls condition for DeepSeek — that is exactly the right approach.

We independently hit the same issue and opened #15237 with a few complementary fixes you might want to consider adding:

_extract_reasoning — use isinstance(r, str) instead of truthy check and assistant_message.reasoning. This ensures empty strings from DeepSeek thinking mode are captured rather than silently skipped.
chat_completions.py transport normalization — change if reasoning_content: to if reasoning_content is not None (same for reasoning_details). Empty strings "" pass the is not None check but fail a truthy check, and DeepSeek treats them as semantically significant in thinking mode.
reasoning_model_prefixes — add "deepseek-" so models like deepseek-v4-flash and deepseek-v4-pro are recognized as reasoning models for prefilling.

Feel free to cherry-pick whatever makes sense. Happy to close #15237 if this covers the ground.

…nt from NousResearch#15228 - _copy_reasoning_content_for_api: split DeepSeek into separate block, use setdefault instead of tool_calls-gated assignment, so empty reasoning_content is set on ALL assistant messages (not just tool-call) - chat_completions.py build_kwargs: add is_deepseek flag + thinking mode toggle (thinking: {type: enabled|disabled}) for native DeepSeek API - _build_api_kwargs: plumb _is_deepseek through to transport params - Keep our complementary fixes: isinstance checks in _extract_reasoning, is not None in chat_completions transport, deepseek- prefix

…asoning_content 400 error Two fixes for DeepSeek direct API (api.deepseek.com) users: 1. Thinking mode toggle: DeepSeek v4-flash/v4-pro uses `thinking: {"type": "enabled|disabled"}` which is different from OpenRouter's `reasoning` extra_body format. Add `is_deepseek` flag and handler to send the correct thinking parameter, allowing users to control thinking mode via `reasoning_effort` config. 2. Fix reasoning_content 400 error: When thinking mode is enabled, DeepSeek requires `reasoning_content` on EVERY assistant message in the conversation, including plain-text final responses (not just tool-call messages). Previously only tool-call messages were patched, causing HTTP 400 errors on multi-turn conversations. 3. Complementary fixes suggested by duouoduo: - _extract_reasoning: use isinstance(r, str) instead of truthy checks to ensure empty strings are captured - chat_completions normalization: change if reasoning_content: to if reasoning_content is not None - reasoning_model_prefixes: add deepseek- prefix

ruxme · 2026-04-24T17:39:43Z

Thanks for the detailed review and suggestions! I've applied all three complementary fixes:

_extract_reasoning: switched to isinstance(r, str) for reasoning, reasoning_content, and reasoning_details
chat_completions.py normalization: if reasoning_content: → if reasoning_content is not None:, same for reasoning_details
reasoning_model_prefixes: added "deepseek-" prefix

Updated commit pushed to the same branch. Appreciate the help! 🙏

Review feedback (multi-specialist + adversarial): 1. **Tighten DeepSeek detection.** Broad `"deepseek" in model.lower()` match wrongly fires on Bedrock (`deepseek.v3.2`), NVIDIA NIM (`deepseek-ai/deepseek-v3.2`), HuggingFace, and third-party fine-tunes. These providers have NOT been validated to accept the `reasoning_content` field — injecting empty placeholders there risks regressions on paths that were previously fine. Switch to provider+host detection mirroring the Kimi branch shape: OpenRouter-routed `deepseek/*` models OR direct `api.deepseek.com`. No behavior change for OpenRouter or direct users; untested hosts are left alone. 2. **Harden reasoning_config predicate.** Adjacent code at line 7342 uses `isinstance(self.reasoning_config, dict)` before `.get()`. The new branch was missing that guard — would have AttributeErrored on non-dict shapes (string, Mock). Invert to an explicit "is False" disable check and guard the isinstance. 3. **Add trailing `return` after the DeepSeek injection.** Matches the pattern every prior branch in this function uses. Prevents a future branch inserted below from unconditionally firing after DeepSeek. 4. **Expand test coverage** from 6 → 12 cases: - `test_deepseek_injects_on_tool_call_message` — the entire point of dropping the tool_calls gate per NousResearch#15228, was untested. - `test_deepseek_reasoning_config_without_enabled_key` — common shape `{"effort": "high"}` with no `enabled` field. - `test_deepseek_reasoning_config_non_dict_is_safe` — pins the defensive isinstance guard. - `test_direct_deepseek_api_injects` — pins `api.deepseek.com` path. - `test_bedrock_deepseek_skips_injection` — regression: Bedrock `deepseek.v3.2` must NOT get injection. - `test_nvidia_nim_deepseek_skips_injection` — same for NIM. Full suite: 994 passed, 0 failed.

teknium1 · 2026-04-24T23:38:49Z

Superseded by PR #15407 (#15354's approach, which matched the existing Kimi needs_tool_reasoning_echo pattern — gated on tool_calls rather than padding every assistant turn).

Thanks @ruxme for the parallel work and for independently diagnosing the same issue. A couple of notes on why we went with the other approach:

Gating on source_msg.get("tool_calls") matches the Kimi precedent and the DeepSeek docs (reasoning_content is only required across a tool-call boundary between user messages).
Three detection signals (provider / model / host) rather than host-only, so custom providers pointing at DeepSeek are covered.

Your thinking: {type: "enabled|disabled"} extra_body toggle is a separate feature worth doing — it lets users explicitly disable thinking mode on DeepSeek. If you'd like to open a follow-up PR just for that toggle (without the reasoning_content echo, which is now on main), we'd happily take it.

…nt from NousResearch#15228 - _copy_reasoning_content_for_api: split DeepSeek into separate block, use setdefault instead of tool_calls-gated assignment, so empty reasoning_content is set on ALL assistant messages (not just tool-call) - chat_completions.py build_kwargs: add is_deepseek flag + thinking mode toggle (thinking: {type: enabled|disabled}) for native DeepSeek API - _build_api_kwargs: plumb _is_deepseek through to transport params - Keep our complementary fixes: isinstance checks in _extract_reasoning, is not None in chat_completions transport, deepseek- prefix

alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder provider/deepseek DeepSeek API labels Apr 24, 2026

alt-glitch mentioned this pull request Apr 24, 2026

fix: preserve empty reasoning_content for DeepSeek thinking mode across turns #15237

Closed

4 tasks

ruxme force-pushed the fix/deepseek-thinking-mode branch from fc3f5c2 to e276754 Compare April 24, 2026 17:39

asin76-svg mentioned this pull request Apr 24, 2026

fix(kilo): workaround DeepSeek thinking-mode 400 on Kilo gateway (v4-pro/v4-flash/reasoner) #15323

Closed

ukint-vs mentioned this pull request Apr 24, 2026

fix(deepseek): inject empty reasoning_content on replay for OpenRouter DeepSeek #15325

Closed

teknium1 mentioned this pull request Apr 24, 2026

fix: DeepSeek V4 thinking mode reasoning_content echo on tool-call messages #15407

Merged

teknium1 closed this Apr 24, 2026

Tranquil-Flow mentioned this pull request Apr 25, 2026

fix(agent): comprehensive DeepSeek V4 support — context windows, thinking mode, reasoning replay #15446

Closed

6 tasks

alt-glitch mentioned this pull request Apr 25, 2026

[Bug]: HTTP 400 "reasoning_content must be passed back" with deepseek-v4-pro in cron/auxiliary path (thinking mode works in main loop, breaks elsewhere) #15213

Closed

1 task

alt-glitch mentioned this pull request Apr 25, 2026

DeepSeek API: Missing 'thinking: disabled' parameter causes 400 error #15700

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: support DeepSeek v4-flash/v4-pro thinking mode toggle and fix reasoning_content 400 error#15228

fix: support DeepSeek v4-flash/v4-pro thinking mode toggle and fix reasoning_content 400 error#15228
ruxme wants to merge 1 commit into
NousResearch:mainfrom
ruxme:fix/deepseek-thinking-mode

ruxme commented Apr 24, 2026

Uh oh!

alt-glitch commented Apr 24, 2026

Uh oh!

duouoduo commented Apr 24, 2026

Uh oh!

ruxme commented Apr 24, 2026

Uh oh!

teknium1 commented Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

ruxme commented Apr 24, 2026

Summary

Changes

1. DeepSeek thinking mode toggle (is_deepseek flag + thinking extra_body)

2. Fix reasoning_content HTTP 400 error on multi-turn conversations

Files changed

Testing

Uh oh!

alt-glitch commented Apr 24, 2026

Uh oh!

duouoduo commented Apr 24, 2026

Uh oh!

ruxme commented Apr 24, 2026

Uh oh!

teknium1 commented Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

1. DeepSeek thinking mode toggle (`is_deepseek` flag + `thinking` extra_body)

2. Fix `reasoning_content` HTTP 400 error on multi-turn conversations