fix(agent): always persist reasoning_content on assistant turns#16884
Closed
Sanjays2402 wants to merge 1 commit into
Closed
fix(agent): always persist reasoning_content on assistant turns#16884Sanjays2402 wants to merge 1 commit into
Sanjays2402 wants to merge 1 commit into
Conversation
run_agent.py persisted assistant turns with the chain of thought under
the internal field 'reasoning' and only wrote the protocol-standard
'reasoning_content' when (a) the upstream SDK happened to expose it as
a top-level attribute, or (b) the current provider was DeepSeek and the
turn had tool_calls. Streaming-only providers (glm, MiniMax, gpt-5.x via
aigw, Anthropic via openai-compat shim, etc.) accumulate reasoning from
delta.reasoning_content into a local string but never set the SDK
attribute, so the persisted message is missing reasoning_content.
The bug is silent until the user later replays that history through a
DeepSeek-v4 / Kimi thinking model, which requires reasoning_content on
every replayed assistant turn:
The reasoning_content in the thinking mode must be passed back to
the API.
Read-side patches (NousResearch#15213, NousResearch#15741, NousResearch#15748, NousResearch#15353) each fix one build
path, but every new path that reads history from disk is a fresh place
the same 400 can resurface.
Normalize at write time:
- prefer the SDK-supplied reasoning_content when present (may carry
structured data);
- otherwise fall back to the already-sanitized reasoning_text that was
accumulated from streaming deltas;
- finally default to "" so non-thinking providers ignore the field
harmlessly while DeepSeek/Kimi see a valid (empty) value.
The internal 'reasoning' alias is preserved for backward compatibility
with existing read paths and downstream consumers.
Refs NousResearch#16844
teknium1
added a commit
that referenced
this pull request
Apr 28, 2026
…6844) (#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in #16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs #16844, #16884, #15250, #15353, #15748.
Contributor
|
Superseded by #16892 (merged as d63abbc). Thank you for the careful diagnosis and the write-side forensic data in #16844 — 4,031 poisoned messages across 1,101 files was the detail that made the bug's scope undeniable. The rework in #16892 keeps your core insight (promote streamed reasoning to
Live E2E confirmed the fix is load-bearing — before/after diff at scenario 1 goes from Credit preserved via the merged commit message. Thanks again! |
cluricaun28
referenced
this pull request
in cluricaun28/Logos
Apr 28, 2026
…6844) (#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in #16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs #16844, #16884, #15250, #15353, #15748.
5 tasks
ulasbilgen
pushed a commit
to ulasbilgen/hermes-adhd-agent
that referenced
this pull request
May 1, 2026
…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
donald131
pushed a commit
to donald131/hermes-agent
that referenced
this pull request
May 2, 2026
…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
02356abc
pushed a commit
to 02356abc/hermes-agent
that referenced
this pull request
May 14, 2026
…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
dannyJ848
pushed a commit
to dannyJ848/hermes-agent
that referenced
this pull request
May 17, 2026
…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
gweeteve
pushed a commit
to gweeteve/hermes-agent
that referenced
this pull request
Jun 2, 2026
…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
Egavasyug
pushed a commit
to Egavasyug/hermes-agent
that referenced
this pull request
Jun 10, 2026
…usResearch#16844) (NousResearch#16892) Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) emit reasoning through delta.reasoning_content chunks that get accumulated into the local reasoning_text string — but never land on the assistant message object as a top-level attribute. The prior guard at _build_assistant_message only wrote reasoning_content when the SDK exposed hasattr(msg, 'reasoning_content'), so these providers persisted the chain-of-thought under the internal 'reasoning' key and omitted the protocol-standard field. The poison was silent until the user later switched to a DeepSeek-v4 or Kimi thinking model, at which point replay failed with HTTP 400: 'The reasoning_content in the thinking mode must be passed back to the API.' One reported session store accumulated 4,031 poisoned messages across 1,101 files (NousResearch#16844). Fix: add an additive fallback that promotes the already-sanitized reasoning_text to reasoning_content when no earlier branch wrote it AND reasoning text was actually captured. Layered on top of the existing SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them, so every existing behavior is preserved: - SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still wins. - DeepSeek tool-call ''-pad still fires when the SDK exposes the attr but the value is None. - Non-thinking turns with no reasoning leave the field absent, so _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748), promote-from-'reasoning' tier, and thinking-pad tier remain live at replay time. - No empty '' gets eagerly written on every assistant turn (which would have bypassed the read-side ladder and triggered empty thinking-block insertion in the Anthropic adapter). Tests: three new TestBuildAssistantMessage cases covering the streaming promotion path, SDK precedence, and field-absent-when-no-reasoning invariant. Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884; this is a scoped rework that preserves the existing read-side compensation code as defense in depth. Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
run_agent.pypersists assistant turns with the chain of thought under the internal fieldreasoningand only writes the protocol-standardreasoning_contentwhen:assistant_message.reasoning_contentas a top-level attribute, ortool_calls(existing narrow guard).Streaming-only providers (
glm,MiniMax,gpt-5.xvia aigw, Anthropic via openai-compat shims, etc.) emit reasoning throughdelta.reasoning_contentchunks that get accumulated into the localreasoning_textstring — but never land on the assistant message object as an attribute, so the persisted message is missingreasoning_content.The bug is silent until the user later replays that history through a DeepSeek‑v4 / Kimi thinking model, which strictly requires
reasoning_contenton every replayed assistant turn:Read-side patches (#15213, #15741, #15748, #15353) each fix one build path, but every new path that reads history from disk is a fresh place the same 400 can resurface.
Fix
Normalize at write time. At the point where the assistant message dict is built (around
run_agent.py:8085), always populatereasoning_content:assistant_message.reasoning_contentwhen present (may carry structured data);reasoning_textthat was accumulated from streaming deltas;""so non-thinking providers ignore it harmlessly while DeepSeek/Kimi see a valid (empty) value.The internal
reasoningalias is preserved for backward compatibility with existing read paths and downstream consumers — this PR is purely additive on the wire format.This makes the four landed read-side fixes redundant safety nets rather than mandatory promotion paths.
Diff
The change is local to the assistant-message persistence block; no behavioural change for currently-working DeepSeek paths (those already get
reasoning_contentfrom the SDK or from the existing tool-call guard).Not in scope
~/.hermes/sessions/**— happy to follow up in a separate PR if maintainers want it; the original issue author offered one as well._copy_reasoning_content_for_apipromotion logic — leaving it in place as defense in depth.Refs #16844