Skip to content

fix(deepseek): inject empty reasoning_content on replay for OpenRouter DeepSeek#15325

Closed
ukint-vs wants to merge 3 commits into
NousResearch:mainfrom
ukint-vs:fix/deepseek-reasoning-content-openrouter
Closed

fix(deepseek): inject empty reasoning_content on replay for OpenRouter DeepSeek#15325
ukint-vs wants to merge 3 commits into
NousResearch:mainfrom
ukint-vs:fix/deepseek-reasoning-content-openrouter

Conversation

@ukint-vs

@ukint-vs ukint-vs commented Apr 24, 2026

Copy link
Copy Markdown

DeepSeek v4 in thinking mode 400s on multi-turn replay when any assistant message in history lacks reasoning_content:

HTTP 400: The reasoning_content in the thinking mode must be passed back to the API.

Hits three paths:

  1. Compressed histories (compressor synthesizes assistant messages without the field).
  2. Sessions persisted before thinking-mode support.
  3. The _handle_max_iterations summary turn, which builds its own api_messages and was missing the _copy_reasoning_content_for_api call.

This PR covers OpenRouter-routed DeepSeek (deepseek/*) — #15228 handles direct api.deepseek.com; they co-exist.

Fix

  • run_agent.py::_copy_reasoning_content_for_api — add a DeepSeek branch that injects reasoning_content="" as a placeholder. Detection mirrors the Kimi branch's shape: host is openrouter.ai with a deepseek/ model slug, OR host is api.deepseek.com. Deliberately scoped — a bare "deepseek" in model substring would wrongly fire on Bedrock, NIM, and third-party hosts we haven't validated.
  • run_agent.py::_handle_max_iterations — missing _copy_reasoning_content_for_api call in the summary-turn api_messages builder.
  • isinstance(dict) guard on reasoning_config mirrors line 7342.

Tests

New file tests/run_agent/test_deepseek_reasoning_content.py, 12 cases:

Full suite: 994 passed.

Refs

Tested on

macOS 15 · Python 3.11.15 · openai 2.30.0 · pydantic 2.12.5 · live OpenRouter DeepSeek V4 Flash traffic, no 400s after patch.


Three commits on the branch for reviewability — squash at merge is fine.

…r DeepSeek

DeepSeek v4/v4-flash/v4-pro in thinking mode rejects multi-turn conversations
when replayed assistant messages lack `reasoning_content`, failing with:

    HTTP 400: The reasoning_content in the thinking mode must be passed back.

`_copy_reasoning_content_for_api` already injects an empty placeholder for
Kimi/Moonshot on tool-call turns. Extend the same fallback to DeepSeek models,
matching the direction of NousResearch#15228 but detecting by model-name substring so it
covers OpenRouter-routed DeepSeek (`deepseek/deepseek-v4-flash`) as well as
direct `api.deepseek.com` usage.

The `tool_calls` gate from the Kimi branch is dropped here on purpose — per
NousResearch#15228, DeepSeek requires `reasoning_content` on every assistant message in
history, not only tool-call turns. Real reasoning is still preserved by the
earlier `explicit_reasoning` / `normalized_reasoning` branches; the empty
string only fills in when no reasoning was captured upstream (compressed
summaries and replays of sessions persisted before this fix).

Why not patch `normalize_response` instead (c.f. NousResearch#14973): on current openai
SDK (2.30.0) + pydantic 2.x with `extra='allow'`, `getattr(msg, 'reasoning_content')`
already reaches `model_extra` via `__getattr__`, so the capture path works on
modern SDKs. The real leak is the round-trip — covered here.

Refs:
- Fixes symptom described in NousResearch#15213 for OpenRouter-routed DeepSeek
- Complements NousResearch#15228 (direct api.deepseek.com path)
- Supersedes NousResearch#14973 for modern SDK versions

Tests: tests/run_agent/test_deepseek_reasoning_content.py covers six cases —
enabled-injects, disabled-skips, non-deepseek-skips, preserves-real-reasoning,
reasoning_config-None-defaults-enabled, preserves-explicit-empty.
…path

`_handle_max_iterations` builds its own `api_messages` list when requesting
the "please summarize" final turn after max_iterations. It stripped internal
fields (`reasoning`, `finish_reason`, `_thinking_prefill`) but never called
`_copy_reasoning_content_for_api`. That call is the remap step: it copies
`source_msg["reasoning"]` (trajectory-only field) into `api_msg["reasoning_content"]`
(the field DeepSeek requires on every assistant message in thinking mode).

Without the call, DeepSeek 400s on the summary request, surfacing as:

    WARNING root: Failed to get summary response: Error code: 400 ...
    The `reasoning_content` in the thinking mode must be passed back to the API.

The main loop (~line 9395) and flush_memories path (~line 7628) already made
this call. `_handle_max_iterations` was the missing third site.

Refs:
- Same bug class as NousResearch#15213 flagged for auxiliary paths
- Extends the fix introduced in the prior commit on this branch
Review feedback (multi-specialist + adversarial):

1. **Tighten DeepSeek detection.** Broad `"deepseek" in model.lower()` match
   wrongly fires on Bedrock (`deepseek.v3.2`), NVIDIA NIM
   (`deepseek-ai/deepseek-v3.2`), HuggingFace, and third-party fine-tunes.
   These providers have NOT been validated to accept the `reasoning_content`
   field — injecting empty placeholders there risks regressions on paths
   that were previously fine. Switch to provider+host detection mirroring
   the Kimi branch shape: OpenRouter-routed `deepseek/*` models OR direct
   `api.deepseek.com`. No behavior change for OpenRouter or direct users;
   untested hosts are left alone.

2. **Harden reasoning_config predicate.** Adjacent code at line 7342 uses
   `isinstance(self.reasoning_config, dict)` before `.get()`. The new branch
   was missing that guard — would have AttributeErrored on non-dict shapes
   (string, Mock). Invert to an explicit "is False" disable check and guard
   the isinstance.

3. **Add trailing `return` after the DeepSeek injection.** Matches the
   pattern every prior branch in this function uses. Prevents a future
   branch inserted below from unconditionally firing after DeepSeek.

4. **Expand test coverage** from 6 → 12 cases:
   - `test_deepseek_injects_on_tool_call_message` — the entire point of
     dropping the tool_calls gate per NousResearch#15228, was untested.
   - `test_deepseek_reasoning_config_without_enabled_key` — common shape
     `{"effort": "high"}` with no `enabled` field.
   - `test_deepseek_reasoning_config_non_dict_is_safe` — pins the
     defensive isinstance guard.
   - `test_direct_deepseek_api_injects` — pins `api.deepseek.com` path.
   - `test_bedrock_deepseek_skips_injection` — regression: Bedrock
     `deepseek.v3.2` must NOT get injection.
   - `test_nvidia_nim_deepseek_skips_injection` — same for NIM.

Full suite: 994 passed, 0 failed.
@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder provider/deepseek DeepSeek API provider/openrouter OpenRouter aggregator labels Apr 24, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Related to #15228 (direct DeepSeek API fix) and #15323 (Kilo gateway variant). This PR covers the OpenRouter-routed path specifically. Also addresses symptoms in #15250.

@ukint-vs

Copy link
Copy Markdown
Author

Closing — superseded by #15446, which unifies this fix with related DeepSeek V4 work (thinking-mode toggle, context windows, normalize_response empty-string preservation) into a single coherent PR. Same detection scope (OpenRouter deepseek/ + api.deepseek.com), same enabled: false respect, plus the _handle_max_iterations call I had as a second commit here. Happy to help test or review #15446 if useful.

@ukint-vs ukint-vs closed this Apr 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround provider/deepseek DeepSeek API provider/openrouter OpenRouter aggregator type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants