fix(agent): always persist reasoning_content on assistant turns by Sanjays2402 · Pull Request #16884 · NousResearch/hermes-agent

Sanjays2402 · 2026-04-28T07:49:16Z

What

run_agent.py persists assistant turns with the chain of thought under the internal field reasoning and only writes the protocol-standard reasoning_content when:

the upstream SDK exposes assistant_message.reasoning_content as a top-level attribute, or
the current provider is DeepSeek and the turn has tool_calls (existing narrow guard).

Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims, etc.) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as an attribute, so the persisted message is missing reasoning_content.

The bug is silent until the user later replays that history through a DeepSeek‑v4 / Kimi thinking model, which strictly requires reasoning_content on every replayed assistant turn:

The reasoning_content in the thinking mode must be passed back to the API.

Read-side patches (#15213, #15741, #15748, #15353) each fix one build path, but every new path that reads history from disk is a fresh place the same 400 can resurface.

Fix

Normalize at write time. At the point where the assistant message dict is built (around run_agent.py:8085), always populate reasoning_content:

prefer the SDK-supplied assistant_message.reasoning_content when present (may carry structured data);
otherwise fall back to the already-sanitized reasoning_text that was accumulated from streaming deltas;
finally default to "" so non-thinking providers ignore it harmlessly while DeepSeek/Kimi see a valid (empty) value.

The internal reasoning alias is preserved for backward compatibility with existing read paths and downstream consumers — this PR is purely additive on the wire format.

This makes the four landed read-side fixes redundant safety nets rather than mandatory promotion paths.

Diff

run_agent.py | 27 +++++++++++++++++----------
1 file changed, 17 insertions(+), 10 deletions(-)

The change is local to the assistant-message persistence block; no behavioural change for currently-working DeepSeek paths (those already get reasoning_content from the SDK or from the existing tool-call guard).

Not in scope

Migration of existing poisoned session files under ~/.hermes/sessions/** — happy to follow up in a separate PR if maintainers want it; the original issue author offered one as well.
Removing the read-side _copy_reasoning_content_for_api promotion logic — leaving it in place as defense in depth.

Refs #16844

run_agent.py persisted assistant turns with the chain of thought under the internal field 'reasoning' and only wrote the protocol-standard 'reasoning_content' when (a) the upstream SDK happened to expose it as a top-level attribute, or (b) the current provider was DeepSeek and the turn had tool_calls. Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shim, etc.) accumulate reasoning from delta.reasoning_content into a local string but never set the SDK attribute, so the persisted message is missing reasoning_content. The bug is silent until the user later replays that history through a DeepSeek-v4 / Kimi thinking model, which requires reasoning_content on every replayed assistant turn: The reasoning_content in the thinking mode must be passed back to the API. Read-side patches (NousResearch#15213, NousResearch#15741, NousResearch#15748, NousResearch#15353) each fix one build path, but every new path that reads history from disk is a fresh place the same 400 can resurface. Normalize at write time: - prefer the SDK-supplied reasoning_content when present (may carry structured data); - otherwise fall back to the already-sanitized reasoning_text that was accumulated from streaming deltas; - finally default to "" so non-thinking providers ignore the field harmlessly while DeepSeek/Kimi see a valid (empty) value. The internal 'reasoning' alias is preserved for backward compatibility with existing read paths and downstream consumers. Refs NousResearch#16844

@Sanjays2402

…6844) (#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in #16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs #16844, #16884, #15250, #15353, #15748.

teknium1 · 2026-04-28T08:19:37Z

Superseded by #16892 (merged as d63abbc).

Thank you for the careful diagnosis and the write-side forensic data in #16844 — 4,031 poisoned messages across 1,101 files was the detail that made the bug's scope undeniable. The rework in #16892 keeps your core insight (promote streamed reasoning to reasoning_content at write time) but layers it on top of the existing branch rather than replacing it, so:

The Anthropic adapter at anthropic_adapter.py:1280 doesn't start inserting empty {"type":"thinking"} blocks on every replayed assistant turn.
Strict OpenAI-compat providers (Mistral, Fireworks, stock OpenAI) don't receive reasoning_content: "" on every replay message.
_copy_reasoning_content_for_api's cross-provider reasoning-leak guard ([Bug] _copy_reasoning_content_for_api: cross-provider reasoning promotion leaks stale content to DeepSeek/Kimi #15748) stays live instead of being short-circuited by step 1 always matching.

Live E2E confirmed the fix is load-bearing — before/after diff at scenario 1 goes from reasoning_content absent to populated from the streamed reasoning text, exactly matching the #16844 bug profile.

Credit preserved via the merged commit message. Thanks again!

@Sanjays2402

…6844) (#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in #16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs #16844, #16884, #15250, #15353, #15748.

@Sanjays2402

…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.

@Sanjays2402

…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.

@Sanjays2402

…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.

@Sanjays2402

…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.

@Sanjays2402

…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.

@Sanjays2402

…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.

alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder provider/deepseek DeepSeek API provider/kimi Kimi / Moonshot labels Apr 28, 2026

teknium1 mentioned this pull request Apr 28, 2026

fix(agent): persist streamed reasoning_content on assistant turns (#16844) #16892

Merged

teknium1 closed this Apr 28, 2026

teknium1 mentioned this pull request Apr 28, 2026

[Bug]: persisted assistant messages store reasoning in 'reasoning' (internal) instead of 'reasoning_content', leaving sessions silently poisoned for any future DeepSeek/Kimi thinking-mode replay #16844

Closed

donovan-yohan mentioned this pull request Apr 29, 2026

chore: upstream sync 2026-04-29 donovan-yohan/hermes-agent#14

Merged

5 tasks

saved-j mentioned this pull request May 17, 2026

feat: make reasoning_content echo-back detection dynamic (protocol-level, not provider-name-based) #27297

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agent): always persist reasoning_content on assistant turns#16884

fix(agent): always persist reasoning_content on assistant turns#16884
Sanjays2402 wants to merge 1 commit into
NousResearch:mainfrom
Sanjays2402:fix/issue-16844

Sanjays2402 commented Apr 28, 2026

Uh oh!

teknium1 commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Sanjays2402 commented Apr 28, 2026

What

Fix

Diff

Not in scope

Uh oh!

teknium1 commented Apr 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants