Skip to content

fix(deepseek): V4 reasoning_content consistency backfill#15476

Closed
whtoo wants to merge 1 commit into
NousResearch:mainfrom
whtoo:fix/deepseek-v4-reasoning-backfill
Closed

fix(deepseek): V4 reasoning_content consistency backfill#15476
whtoo wants to merge 1 commit into
NousResearch:mainfrom
whtoo:fix/deepseek-v4-reasoning-backfill

Conversation

@whtoo

@whtoo whtoo commented Apr 25, 2026

Copy link
Copy Markdown

Problem

DeepSeek V4 models (deepseek-v4-pro, deepseek-v4-flash) require all assistant messages in the conversation history to have a reasoning_content field once any turn has produced reasoning. Mixing assistant messages with and without reasoning_content triggers HTTP 400 from the DeepSeek API.

Relation to Recent Upstream Fixes

Commits 93a2d6b3 and d58b305a (merged to main Apr 24–25) added _copy_reasoning_content_for_api() and _needs_deepseek_tool_reasoning() to protect tool-call assistant messages (#15250). This is excellent progress.

However, the upstream fix only covers assistant messages that contain tool_calls. Plain assistant turns (e.g., a simple acknowledgment like "Okay, let me check that") without tool_calls are not backfilled. In a long conversation where:

  1. Turn 2: assistant produces reasoning + tool calls → has reasoning_content
  2. Turn 4: assistant replies without reasoning or tool calls → no reasoning_content
  3. The API request contains a mix

DeepSeek V4 still rejects with HTTP 400.

This PR

This PR complements the upstream tool-call fixes by extending protection to all assistant messages:

  • Add _needs_reasoning_backfill(): detects when any assistant message in the history has reasoning (provider-scoped, excludes legacy deepseek-reasoner)
  • In three API message build paths, backfill empty "" reasoning_content for assistant messages that lack reasoning when backfill is needed:
    • Main conversation loop
    • Memory flush path
    • _handle_max_iterations summary path
  • Each backfill is applied after _copy_reasoning_content_for_api() so we don't interfere with existing Kimi/DeepSeek tool-call logic

Tests

10 regression tests in tests/agent/test_deepseek_v4_reasoning.py:

  • 5 tests for _needs_reasoning_backfill detection
  • 5 tests for reasoning_content injection across all three code paths

All passing on latest main.

@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder provider/deepseek DeepSeek API labels Apr 25, 2026
DeepSeek V4 requires all assistant messages to have reasoning_content
once any turn has produced reasoning. Mixing (some with, some without)
triggers HTTP 400 on DeepSeek reasoning models.

Upstream recently added _copy_reasoning_content_for_api() and
_needs_deepseek_tool_reasoning() for tool-call messages (NousResearch#15250).
This commit extends that protection to ALL assistant messages:

- Add _needs_reasoning_backfill(): detects when conversation history
  contains any assistant message with reasoning
- Three API message build paths now backfill empty reasoning_content
  for assistant messages that lack reasoning when backfill is needed:
  - Main conversation loop
  - Memory flush path
  - _handle_max_iterations summary path
- Complements upstream tool-call fixes by covering plain assistant
  acknowledgments and other non-tool turns
- Legacy deepseek-reasoner and non-DeepSeek providers are unaffected
- tests/agent/test_deepseek_v4_reasoning.py: 10 regression tests
@whtoo

whtoo commented Apr 26, 2026

Copy link
Copy Markdown
Author

This PR has been superseded by subsequent commits on main.

The reasoning_content consistency issue for DeepSeek V4 was resolved through a series of later PRs that landed on main:

  • ad0ac894 — DeepSeek/Kimi thinking mode requires reasoning_content on ALL assistant messages
  • 5ae60815 — Remove has_reasoning guard, inject empty reasoning_content unconditionally
  • 9daa0620 — Ordering fix in _copy_reasoning_content_for_api + cross-provider reasoning isolation

These commits implemented a more general solution than the conditional backfill approach proposed here:

Aspect PR #15476 Main (current)
Detection only , substring, and host match
Trigger conditional (history must contain reasoning) unconditional for all DeepSeek/Kimi assistant messages
Call sites 3 explicit backfill blocks 1 centralized in
included method removed in #15696 () — structural conflict root cause

Additionally, main already carries (213 lines) which covers the same scenarios plus Kimi and custom-provider base_url edge cases.

Because of the structural divergence ( removal + refactor), rebasing this PR would only introduce redundant code. Closing as obsolete.

@whtoo whtoo closed this Apr 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround provider/deepseek DeepSeek API type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants