Skip to content

fix(deepseek): wire thinking-mode via DeepSeekProfile (closes #15700, #17212, #17825)#26648

Merged
teknium1 merged 2 commits into
mainfrom
hermes/hermes-c6fea4b1
May 16, 2026
Merged

fix(deepseek): wire thinking-mode via DeepSeekProfile (closes #15700, #17212, #17825)#26648
teknium1 merged 2 commits into
mainfrom
hermes/hermes-c6fea4b1

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

DeepSeek V4 / deepseek-reasoner requests now go out with the explicit extra_body.thinking + reasoning_effort parameters they require, instead of silently defaulting to thinking-on and 400ing on the next turn.

Root cause. DeepSeek has a registered ProviderProfile, so requests go through _build_kwargs_from_profile. The base profile's build_api_kwargs_extras returns ({}, {}) — so every request to api.deepseek.com went out with no thinking parameter and no reasoning_effort. DeepSeek defaults to thinking=enabled, returns reasoning_content, then 400s on subsequent turns. Confirmed empirically by instantiating the live profile.

@tw2818's PR #15251 correctly diagnosed this but placed the fix in build_kwargs's legacy fallback path, which DeepSeek never reaches. Cherry-picked their commit to preserve authorship, then pivoted the fix to the profile path in the follow-up commit.

Changes

  • plugins/model-providers/deepseek/__init__.py: new DeepSeekProfile(ProviderProfile) mirroring KimiProfile. Emits extra_body.thinking = {type: enabled|disabled} + top-level reasoning_effort. Model gating: V4 family + deepseek-reasoner only; deepseek-chat (V3) untouched.
  • scripts/release.py: AUTHOR_MAP entry for @tw2818.
  • tests/plugins/model_providers/test_deepseek_profile.py: 26 unit tests pinning wire shape, model gating, effort mapping, and full-kwargs integration through the transport.

Validation

Before After
deepseek-v4-pro wire format {} (no thinking, no effort) {reasoning_effort, extra_body.thinking}
Wire format matches live test no yes (matches tests/run_agent/test_deepseek_v4_thinking_live.py::_thinking_kwargs)
New tests n/a 26/26 passing
Transport + provider profile suites 321 passing 321 passing (no regressions)

Closes #15700.
Closes #17212.
Closes #17825.

Co-authored-by: tw2818 twebefy@gmail.com

twebefy28 and others added 2 commits May 15, 2026 16:39
…Seek API

DeepSeek's thinking mode requires both:
- extra_body.thinking.type: "enabled" to activate thinking mode
- top-level reasoning_effort: "max" or "high" to control depth

Previously, the ChatCompletionsTransport only handled Kimi's thinking
mode — DeepSeek was left unmapped, so reasoning_effort config was
silently dropped.

This patch:
1. Adds is_deepseek: bool to the Params dataclass, detected by
   base_url matching api.deepseek.com
2. Maps Hermes effort levels (xhigh/max → "max", low/medium/high →
   themselves) to the top-level reasoning_effort parameter
3. Sets extra_body.thinking.type alongside the effort
4. Strips reasoning_content from assistant messages sent back to
   DeepSeek, preventing 400 errors when thinking was enabled
…lback

The cherry-picked PR #15251 from @tw2818 correctly identified the
DeepSeek 400 root cause but placed the fix in the legacy fallback path
of `build_kwargs`, which DeepSeek never reaches — DeepSeek has a
registered ProviderProfile and goes through `_build_kwargs_from_profile`
instead. The legacy-path block was therefore dead code.

This commit pivots the fix to where it actually fires:

- New `DeepSeekProfile` in `plugins/model-providers/deepseek/__init__.py`
  overrides `build_api_kwargs_extras` to emit DeepSeek's expected wire
  format (mirrors `KimiProfile`):

      {"reasoning_effort": "<low|medium|high|max>",
       "extra_body": {"thinking": {"type": "enabled" | "disabled"}}}

- Model gating: only `deepseek-v4-*` and `deepseek-reasoner` emit
  thinking control. `deepseek-chat` (V3) is untouched — current behavior.

- Effort mapping: low/medium/high passthrough, xhigh/max → max, unset →
  omitted (DeepSeek server applies its own default).

- Revert the legacy-path additions from PR #15251 — they were dead code,
  and the `_copy_reasoning_content_for_api` strip block specifically
  would have nullified the existing reasoning_content padding machinery
  (`_needs_deepseek_tool_reasoning` → space-pad on replay) that the
  active provider already relies on for replay correctness.

- Unit tests pin the wire-shape contract and the model gating rules
  (26 tests, all passing). Existing transport + provider profile suites
  (321 tests) continue to pass.

- AUTHOR_MAP: map twebefy@gmail.com → tw2818 for release notes credit.

Closes #15700, #17212, #17825.
Co-authored-by: tw2818 <twebefy@gmail.com>
@github-actions

Copy link
Copy Markdown
Contributor

🔎 Lint report: hermes/hermes-c6fea4b1 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8302 on HEAD, 8301 on base (🆕 +1)

🆕 New issues (4):

Rule Count
invalid-argument-type 3
unresolved-import 1
First entries
run_agent.py:7614: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:13897: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
tests/plugins/model_providers/test_deepseek_profile.py:15: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
run_agent.py:13900: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown, Unknown] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`

✅ Fixed issues (3):

Rule Count
invalid-argument-type 3
First entries
run_agent.py:13900: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
run_agent.py:13897: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:7614: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`

Unchanged: 4333 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@alt-glitch alt-glitch added type/bug Something isn't working P3 Low — cosmetic, nice to have comp/plugins Plugin system and bundled plugins labels May 15, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Supersedes #22218, #21668, #24130, #25301 — all competing PRs for the same DeepSeek thinking-mode/reasoning_effort fix. This PR (by teknium1) is the canonical version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/plugins Plugin system and bundled plugins P3 Low — cosmetic, nice to have type/bug Something isn't working

Projects

None yet

3 participants