Skip to content

fix: support DeepSeek v4-flash/v4-pro thinking mode toggle and fix reasoning_content 400 error#15228

Closed
ruxme wants to merge 1 commit into
NousResearch:mainfrom
ruxme:fix/deepseek-thinking-mode
Closed

fix: support DeepSeek v4-flash/v4-pro thinking mode toggle and fix reasoning_content 400 error#15228
ruxme wants to merge 1 commit into
NousResearch:mainfrom
ruxme:fix/deepseek-thinking-mode

Conversation

@ruxme

@ruxme ruxme commented Apr 24, 2026

Copy link
Copy Markdown

Summary

Two fixes for DeepSeek direct API (api.deepseek.com) users who use deepseek-v4-flash / deepseek-v4-pro with thinking mode.

Changes

1. DeepSeek thinking mode toggle (is_deepseek flag + thinking extra_body)

DeepSeek v4-flash/v4-pro uses a different API format than OpenRouter for controlling thinking mode. While OpenRouter uses extra_body["reasoning"] = {"enabled": true, "effort": "medium"}, DeepSeek's native API expects:

{"thinking": {"type": "enabled"}}
{"thinking": {"type": "disabled"}}

This PR adds an is_deepseek flag threaded through run_agent.py -> chat_completions.py, which injects the correct thinking parameter. The thinking mode is controlled by the existing reasoning_effort config:

reasoning_effort Result
'' (empty) / 'medium' thinking: {"type": "enabled"}
'none' thinking: {"type": "disabled"}

2. Fix reasoning_content HTTP 400 error on multi-turn conversations

When thinking mode is enabled, DeepSeek requires reasoning_content on every assistant message, including plain-text responses. The existing _copy_reasoning_content_for_api() only patched tool-call messages (source_msg.get("tool_calls")), causing HTTP 400 errors on multi-turn conversations where the final assistant response has no tool calls.

Fix: Remove the and source_msg.get("tool_calls") condition for the DeepSeek branch, so reasoning_content is set (defaulting to empty string) on all assistant messages.

Files changed

  • run_agent.py_is_deepseek flag, pass is_deepseek to transport, fix _copy_reasoning_content_for_api for all assistant messages
  • agent/transports/chat_completions.py — accept is_deepseek param, inject thinking extra_body

Testing

  • Verified DeepSeek API accepts thinking: {"type": "enabled"} → returns reasoning_content
  • Verified DeepSeek API accepts thinking: {"type": "disabled"} → no reasoning_content
  • Verified multi-turn conversations no longer get 400 errors ✅

@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder provider/deepseek DeepSeek API labels Apr 24, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Related to #14958 (plumbs reasoning_effort/thinking toggle to V4 API) and #15213 (reasoning_content broken in auxiliary paths). This PR overlaps significantly with #14958 — maintainers should decide which to merge or combine.

@duouoduo

Copy link
Copy Markdown

Great fix, especially the thinking mode toggle and removing the tool_calls condition for DeepSeek — that is exactly the right approach.

We independently hit the same issue and opened #15237 with a few complementary fixes you might want to consider adding:

  1. _extract_reasoning — use isinstance(r, str) instead of truthy check and assistant_message.reasoning. This ensures empty strings from DeepSeek thinking mode are captured rather than silently skipped.

  2. chat_completions.py transport normalization — change if reasoning_content: to if reasoning_content is not None (same for reasoning_details). Empty strings "" pass the is not None check but fail a truthy check, and DeepSeek treats them as semantically significant in thinking mode.

  3. reasoning_model_prefixes — add "deepseek-" so models like deepseek-v4-flash and deepseek-v4-pro are recognized as reasoning models for prefilling.

Feel free to cherry-pick whatever makes sense. Happy to close #15237 if this covers the ground.

duouoduo pushed a commit to duouoduo/hermes-agent that referenced this pull request Apr 24, 2026
…nt from NousResearch#15228

- _copy_reasoning_content_for_api: split DeepSeek into separate block,
  use setdefault instead of tool_calls-gated assignment, so empty
  reasoning_content is set on ALL assistant messages (not just tool-call)
- chat_completions.py build_kwargs: add is_deepseek flag + thinking
  mode toggle (thinking: {type: enabled|disabled}) for native DeepSeek API
- _build_api_kwargs: plumb _is_deepseek through to transport params
- Keep our complementary fixes: isinstance checks in _extract_reasoning,
  is not None in chat_completions transport, deepseek- prefix
…asoning_content 400 error

Two fixes for DeepSeek direct API (api.deepseek.com) users:

1. Thinking mode toggle: DeepSeek v4-flash/v4-pro uses
   `thinking: {"type": "enabled|disabled"}` which is different from
   OpenRouter's `reasoning` extra_body format. Add `is_deepseek` flag
   and handler to send the correct thinking parameter, allowing users
   to control thinking mode via `reasoning_effort` config.

2. Fix reasoning_content 400 error: When thinking mode is enabled,
   DeepSeek requires `reasoning_content` on EVERY assistant message
   in the conversation, including plain-text final responses (not just
   tool-call messages). Previously only tool-call messages were
   patched, causing HTTP 400 errors on multi-turn conversations.

3. Complementary fixes suggested by duouoduo:
   - _extract_reasoning: use isinstance(r, str) instead of truthy
     checks to ensure empty strings are captured
   - chat_completions normalization: change if reasoning_content:
     to if reasoning_content is not None
   - reasoning_model_prefixes: add deepseek- prefix
@ruxme ruxme force-pushed the fix/deepseek-thinking-mode branch from fc3f5c2 to e276754 Compare April 24, 2026 17:39
@ruxme

ruxme commented Apr 24, 2026

Copy link
Copy Markdown
Author

Thanks for the detailed review and suggestions! I've applied all three complementary fixes:

  • _extract_reasoning: switched to isinstance(r, str) for reasoning, reasoning_content, and reasoning_details
  • chat_completions.py normalization: if reasoning_content:if reasoning_content is not None:, same for reasoning_details
  • reasoning_model_prefixes: added "deepseek-" prefix

Updated commit pushed to the same branch. Appreciate the help! 🙏

ukint-vs added a commit to ukint-vs/hermes-agent that referenced this pull request Apr 24, 2026
Review feedback (multi-specialist + adversarial):

1. **Tighten DeepSeek detection.** Broad `"deepseek" in model.lower()` match
   wrongly fires on Bedrock (`deepseek.v3.2`), NVIDIA NIM
   (`deepseek-ai/deepseek-v3.2`), HuggingFace, and third-party fine-tunes.
   These providers have NOT been validated to accept the `reasoning_content`
   field — injecting empty placeholders there risks regressions on paths
   that were previously fine. Switch to provider+host detection mirroring
   the Kimi branch shape: OpenRouter-routed `deepseek/*` models OR direct
   `api.deepseek.com`. No behavior change for OpenRouter or direct users;
   untested hosts are left alone.

2. **Harden reasoning_config predicate.** Adjacent code at line 7342 uses
   `isinstance(self.reasoning_config, dict)` before `.get()`. The new branch
   was missing that guard — would have AttributeErrored on non-dict shapes
   (string, Mock). Invert to an explicit "is False" disable check and guard
   the isinstance.

3. **Add trailing `return` after the DeepSeek injection.** Matches the
   pattern every prior branch in this function uses. Prevents a future
   branch inserted below from unconditionally firing after DeepSeek.

4. **Expand test coverage** from 6 → 12 cases:
   - `test_deepseek_injects_on_tool_call_message` — the entire point of
     dropping the tool_calls gate per NousResearch#15228, was untested.
   - `test_deepseek_reasoning_config_without_enabled_key` — common shape
     `{"effort": "high"}` with no `enabled` field.
   - `test_deepseek_reasoning_config_non_dict_is_safe` — pins the
     defensive isinstance guard.
   - `test_direct_deepseek_api_injects` — pins `api.deepseek.com` path.
   - `test_bedrock_deepseek_skips_injection` — regression: Bedrock
     `deepseek.v3.2` must NOT get injection.
   - `test_nvidia_nim_deepseek_skips_injection` — same for NIM.

Full suite: 994 passed, 0 failed.
@teknium1

Copy link
Copy Markdown
Contributor

Superseded by PR #15407 (#15354's approach, which matched the existing Kimi needs_tool_reasoning_echo pattern — gated on tool_calls rather than padding every assistant turn).

Thanks @ruxme for the parallel work and for independently diagnosing the same issue. A couple of notes on why we went with the other approach:

  1. Gating on source_msg.get("tool_calls") matches the Kimi precedent and the DeepSeek docs (reasoning_content is only required across a tool-call boundary between user messages).
  2. Three detection signals (provider / model / host) rather than host-only, so custom providers pointing at DeepSeek are covered.

Your thinking: {type: "enabled|disabled"} extra_body toggle is a separate feature worth doing — it lets users explicitly disable thinking mode on DeepSeek. If you'd like to open a follow-up PR just for that toggle (without the reasoning_content echo, which is now on main), we'd happily take it.

@teknium1 teknium1 closed this Apr 24, 2026
duouoduo pushed a commit to duouoduo/hermes-agent that referenced this pull request Apr 25, 2026
…nt from NousResearch#15228

- _copy_reasoning_content_for_api: split DeepSeek into separate block,
  use setdefault instead of tool_calls-gated assignment, so empty
  reasoning_content is set on ALL assistant messages (not just tool-call)
- chat_completions.py build_kwargs: add is_deepseek flag + thinking
  mode toggle (thinking: {type: enabled|disabled}) for native DeepSeek API
- _build_api_kwargs: plumb _is_deepseek through to transport params
- Keep our complementary fixes: isinstance checks in _extract_reasoning,
  is not None in chat_completions transport, deepseek- prefix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround provider/deepseek DeepSeek API type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants