Skip to content

fix(agent): persist streamed reasoning_content on assistant turns (#16844)#16892

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-908ed086
Apr 28, 2026
Merged

fix(agent): persist streamed reasoning_content on assistant turns (#16844)#16892
teknium1 merged 1 commit into
mainfrom
hermes/hermes-908ed086

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Supersedes #16884 with a scoped rework that preserves existing read-side compensation.

Summary

Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via openai-compat shims) accumulate reasoning through delta.reasoning_content chunks but never expose it as a top-level attribute on the finalized SDK message. The existing hasattr-guarded block at _build_assistant_message therefore never wrote reasoning_content for those providers, so the chain-of-thought was persisted only under the internal reasoning key.

The poison is silent until the user later switches to a DeepSeek-v4 or Kimi thinking model, at which point replay 400s with "The reasoning_content in the thinking mode must be passed back to the API." Issue #16844 reports 4,031 poisoned messages across 1,101 session files on one install.

Changes

Why not #16884 as-written

#16884 replaced the conditional with msg["reasoning_content"] = "" as a universal fallback, which would have:

  1. Triggered the Anthropic adapter's isinstance(reasoning_content, str) branch to prepend an empty {"type":"thinking","thinking":""} block on every replayed assistant turn.
  2. Sent reasoning_content: "" to every strict OpenAI-compatible provider (Mistral, Fireworks, stock OpenAI, GitHub Models).
  3. Short-circuited _copy_reasoning_content_for_api's step-1 string check on every turn, making tiers 2–4 dead code — including [Bug] _copy_reasoning_content_for_api: cross-provider reasoning promotion leaks stale content to DeepSeek/Kimi #15748's cross-provider reasoning-leak guard for DeepSeek/Kimi.

The layered approach here solves the write-side bug without changing the field's presence semantics for non-thinking sessions. Existing read-side ladder (cross-provider leak guard #15748, promote-from-reasoning, DeepSeek/Kimi thinking-pad) stays live as defense in depth.

Validation

Before After
Streaming reasoning persisted (glm, MiniMax, gpt-5.x via aigw) missing reasoning_content → 400 on DeepSeek/Kimi replay promoted at write time
SDK-exposed reasoning_content preserved preserved
DeepSeek tool-call ""-pad (#15250) fires fires
Non-thinking turns (no reasoning) field absent field absent
_copy_reasoning_content_for_api tiers 2–4 live live
#15748 cross-provider leak guard live live
Targeted tests: TestBuildAssistantMessage + test_deepseek_reasoning_content_echo n/a 15 + 23 passing

Credit @Sanjays2402 for the original diagnosis in #16884.

Refs #16844, #16884, #15250, #15353, #15748.

…6844)

Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via
openai-compat shims) emit reasoning through delta.reasoning_content
chunks that get accumulated into the local reasoning_text string — but
never land on the assistant message object as a top-level attribute. The
prior guard at _build_assistant_message only wrote reasoning_content
when the SDK exposed hasattr(msg, 'reasoning_content'), so these
providers persisted the chain-of-thought under the internal 'reasoning'
key and omitted the protocol-standard field.

The poison was silent until the user later switched to a DeepSeek-v4 or
Kimi thinking model, at which point replay failed with HTTP 400:
'The reasoning_content in the thinking mode must be passed back to the
API.' One reported session store accumulated 4,031 poisoned messages
across 1,101 files (#16844).

Fix: add an additive fallback that promotes the already-sanitized
reasoning_text to reasoning_content when no earlier branch wrote it AND
reasoning text was actually captured. Layered on top of the existing
SDK-attr branch and DeepSeek ''-pad (#15250) rather than replacing them,
so every existing behavior is preserved:

- SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still
  wins.
- DeepSeek tool-call ''-pad still fires when the SDK exposes the attr
  but the value is None.
- Non-thinking turns with no reasoning leave the field absent, so
  _copy_reasoning_content_for_api's cross-provider leak guard (#15748),
  promote-from-'reasoning' tier, and thinking-pad tier remain live at
  replay time.
- No empty '' gets eagerly written on every assistant turn (which would
  have bypassed the read-side ladder and triggered empty thinking-block
  insertion in the Anthropic adapter).

Tests: three new TestBuildAssistantMessage cases covering the streaming
promotion path, SDK precedence, and field-absent-when-no-reasoning
invariant.

Credit @Sanjays2402 for the original diagnosis and patch in #16884;
this is a scoped rework that preserves the existing read-side
compensation code as defense in depth.

Refs #16844, #16884, #15250, #15353, #15748.
@teknium1 teknium1 merged commit d63abbc into main Apr 28, 2026
11 of 12 checks passed
@teknium1 teknium1 deleted the hermes/hermes-908ed086 branch April 28, 2026 08:19
@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder labels Apr 28, 2026
cluricaun28 referenced this pull request in cluricaun28/Logos Apr 28, 2026
…6844) (#16892)

Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via
openai-compat shims) emit reasoning through delta.reasoning_content
chunks that get accumulated into the local reasoning_text string — but
never land on the assistant message object as a top-level attribute. The
prior guard at _build_assistant_message only wrote reasoning_content
when the SDK exposed hasattr(msg, 'reasoning_content'), so these
providers persisted the chain-of-thought under the internal 'reasoning'
key and omitted the protocol-standard field.

The poison was silent until the user later switched to a DeepSeek-v4 or
Kimi thinking model, at which point replay failed with HTTP 400:
'The reasoning_content in the thinking mode must be passed back to the
API.' One reported session store accumulated 4,031 poisoned messages
across 1,101 files (#16844).

Fix: add an additive fallback that promotes the already-sanitized
reasoning_text to reasoning_content when no earlier branch wrote it AND
reasoning text was actually captured. Layered on top of the existing
SDK-attr branch and DeepSeek ''-pad (#15250) rather than replacing them,
so every existing behavior is preserved:

- SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still
  wins.
- DeepSeek tool-call ''-pad still fires when the SDK exposes the attr
  but the value is None.
- Non-thinking turns with no reasoning leave the field absent, so
  _copy_reasoning_content_for_api's cross-provider leak guard (#15748),
  promote-from-'reasoning' tier, and thinking-pad tier remain live at
  replay time.
- No empty '' gets eagerly written on every assistant turn (which would
  have bypassed the read-side ladder and triggered empty thinking-block
  insertion in the Anthropic adapter).

Tests: three new TestBuildAssistantMessage cases covering the streaming
promotion path, SDK precedence, and field-absent-when-no-reasoning
invariant.

Credit @Sanjays2402 for the original diagnosis and patch in #16884;
this is a scoped rework that preserves the existing read-side
compensation code as defense in depth.

Refs #16844, #16884, #15250, #15353, #15748.
ulasbilgen pushed a commit to ulasbilgen/hermes-adhd-agent that referenced this pull request May 1, 2026
…usResearch#16844) (NousResearch#16892)

Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via
openai-compat shims) emit reasoning through delta.reasoning_content
chunks that get accumulated into the local reasoning_text string — but
never land on the assistant message object as a top-level attribute. The
prior guard at _build_assistant_message only wrote reasoning_content
when the SDK exposed hasattr(msg, 'reasoning_content'), so these
providers persisted the chain-of-thought under the internal 'reasoning'
key and omitted the protocol-standard field.

The poison was silent until the user later switched to a DeepSeek-v4 or
Kimi thinking model, at which point replay failed with HTTP 400:
'The reasoning_content in the thinking mode must be passed back to the
API.' One reported session store accumulated 4,031 poisoned messages
across 1,101 files (NousResearch#16844).

Fix: add an additive fallback that promotes the already-sanitized
reasoning_text to reasoning_content when no earlier branch wrote it AND
reasoning text was actually captured. Layered on top of the existing
SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them,
so every existing behavior is preserved:

- SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still
  wins.
- DeepSeek tool-call ''-pad still fires when the SDK exposes the attr
  but the value is None.
- Non-thinking turns with no reasoning leave the field absent, so
  _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748),
  promote-from-'reasoning' tier, and thinking-pad tier remain live at
  replay time.
- No empty '' gets eagerly written on every assistant turn (which would
  have bypassed the read-side ladder and triggered empty thinking-block
  insertion in the Anthropic adapter).

Tests: three new TestBuildAssistantMessage cases covering the streaming
promotion path, SDK precedence, and field-absent-when-no-reasoning
invariant.

Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884;
this is a scoped rework that preserves the existing read-side
compensation code as defense in depth.

Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
donald131 pushed a commit to donald131/hermes-agent that referenced this pull request May 2, 2026
…usResearch#16844) (NousResearch#16892)

Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via
openai-compat shims) emit reasoning through delta.reasoning_content
chunks that get accumulated into the local reasoning_text string — but
never land on the assistant message object as a top-level attribute. The
prior guard at _build_assistant_message only wrote reasoning_content
when the SDK exposed hasattr(msg, 'reasoning_content'), so these
providers persisted the chain-of-thought under the internal 'reasoning'
key and omitted the protocol-standard field.

The poison was silent until the user later switched to a DeepSeek-v4 or
Kimi thinking model, at which point replay failed with HTTP 400:
'The reasoning_content in the thinking mode must be passed back to the
API.' One reported session store accumulated 4,031 poisoned messages
across 1,101 files (NousResearch#16844).

Fix: add an additive fallback that promotes the already-sanitized
reasoning_text to reasoning_content when no earlier branch wrote it AND
reasoning text was actually captured. Layered on top of the existing
SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them,
so every existing behavior is preserved:

- SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still
  wins.
- DeepSeek tool-call ''-pad still fires when the SDK exposes the attr
  but the value is None.
- Non-thinking turns with no reasoning leave the field absent, so
  _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748),
  promote-from-'reasoning' tier, and thinking-pad tier remain live at
  replay time.
- No empty '' gets eagerly written on every assistant turn (which would
  have bypassed the read-side ladder and triggered empty thinking-block
  insertion in the Anthropic adapter).

Tests: three new TestBuildAssistantMessage cases covering the streaming
promotion path, SDK precedence, and field-absent-when-no-reasoning
invariant.

Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884;
this is a scoped rework that preserves the existing read-side
compensation code as defense in depth.

Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…usResearch#16844) (NousResearch#16892)

Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via
openai-compat shims) emit reasoning through delta.reasoning_content
chunks that get accumulated into the local reasoning_text string — but
never land on the assistant message object as a top-level attribute. The
prior guard at _build_assistant_message only wrote reasoning_content
when the SDK exposed hasattr(msg, 'reasoning_content'), so these
providers persisted the chain-of-thought under the internal 'reasoning'
key and omitted the protocol-standard field.

The poison was silent until the user later switched to a DeepSeek-v4 or
Kimi thinking model, at which point replay failed with HTTP 400:
'The reasoning_content in the thinking mode must be passed back to the
API.' One reported session store accumulated 4,031 poisoned messages
across 1,101 files (NousResearch#16844).

Fix: add an additive fallback that promotes the already-sanitized
reasoning_text to reasoning_content when no earlier branch wrote it AND
reasoning text was actually captured. Layered on top of the existing
SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them,
so every existing behavior is preserved:

- SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still
  wins.
- DeepSeek tool-call ''-pad still fires when the SDK exposes the attr
  but the value is None.
- Non-thinking turns with no reasoning leave the field absent, so
  _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748),
  promote-from-'reasoning' tier, and thinking-pad tier remain live at
  replay time.
- No empty '' gets eagerly written on every assistant turn (which would
  have bypassed the read-side ladder and triggered empty thinking-block
  insertion in the Anthropic adapter).

Tests: three new TestBuildAssistantMessage cases covering the streaming
promotion path, SDK precedence, and field-absent-when-no-reasoning
invariant.

Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884;
this is a scoped rework that preserves the existing read-side
compensation code as defense in depth.

Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
…usResearch#16844) (NousResearch#16892)

Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via
openai-compat shims) emit reasoning through delta.reasoning_content
chunks that get accumulated into the local reasoning_text string — but
never land on the assistant message object as a top-level attribute. The
prior guard at _build_assistant_message only wrote reasoning_content
when the SDK exposed hasattr(msg, 'reasoning_content'), so these
providers persisted the chain-of-thought under the internal 'reasoning'
key and omitted the protocol-standard field.

The poison was silent until the user later switched to a DeepSeek-v4 or
Kimi thinking model, at which point replay failed with HTTP 400:
'The reasoning_content in the thinking mode must be passed back to the
API.' One reported session store accumulated 4,031 poisoned messages
across 1,101 files (NousResearch#16844).

Fix: add an additive fallback that promotes the already-sanitized
reasoning_text to reasoning_content when no earlier branch wrote it AND
reasoning text was actually captured. Layered on top of the existing
SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them,
so every existing behavior is preserved:

- SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still
  wins.
- DeepSeek tool-call ''-pad still fires when the SDK exposes the attr
  but the value is None.
- Non-thinking turns with no reasoning leave the field absent, so
  _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748),
  promote-from-'reasoning' tier, and thinking-pad tier remain live at
  replay time.
- No empty '' gets eagerly written on every assistant turn (which would
  have bypassed the read-side ladder and triggered empty thinking-block
  insertion in the Anthropic adapter).

Tests: three new TestBuildAssistantMessage cases covering the streaming
promotion path, SDK precedence, and field-absent-when-no-reasoning
invariant.

Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884;
this is a scoped rework that preserves the existing read-side
compensation code as defense in depth.

Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…usResearch#16844) (NousResearch#16892)

Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via
openai-compat shims) emit reasoning through delta.reasoning_content
chunks that get accumulated into the local reasoning_text string — but
never land on the assistant message object as a top-level attribute. The
prior guard at _build_assistant_message only wrote reasoning_content
when the SDK exposed hasattr(msg, 'reasoning_content'), so these
providers persisted the chain-of-thought under the internal 'reasoning'
key and omitted the protocol-standard field.

The poison was silent until the user later switched to a DeepSeek-v4 or
Kimi thinking model, at which point replay failed with HTTP 400:
'The reasoning_content in the thinking mode must be passed back to the
API.' One reported session store accumulated 4,031 poisoned messages
across 1,101 files (NousResearch#16844).

Fix: add an additive fallback that promotes the already-sanitized
reasoning_text to reasoning_content when no earlier branch wrote it AND
reasoning text was actually captured. Layered on top of the existing
SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them,
so every existing behavior is preserved:

- SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still
  wins.
- DeepSeek tool-call ''-pad still fires when the SDK exposes the attr
  but the value is None.
- Non-thinking turns with no reasoning leave the field absent, so
  _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748),
  promote-from-'reasoning' tier, and thinking-pad tier remain live at
  replay time.
- No empty '' gets eagerly written on every assistant turn (which would
  have bypassed the read-side ladder and triggered empty thinking-block
  insertion in the Anthropic adapter).

Tests: three new TestBuildAssistantMessage cases covering the streaming
promotion path, SDK precedence, and field-absent-when-no-reasoning
invariant.

Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884;
this is a scoped rework that preserves the existing read-side
compensation code as defense in depth.

Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…usResearch#16844) (NousResearch#16892)

Streaming-only providers (glm, MiniMax, gpt-5.x via aigw, Anthropic via
openai-compat shims) emit reasoning through delta.reasoning_content
chunks that get accumulated into the local reasoning_text string — but
never land on the assistant message object as a top-level attribute. The
prior guard at _build_assistant_message only wrote reasoning_content
when the SDK exposed hasattr(msg, 'reasoning_content'), so these
providers persisted the chain-of-thought under the internal 'reasoning'
key and omitted the protocol-standard field.

The poison was silent until the user later switched to a DeepSeek-v4 or
Kimi thinking model, at which point replay failed with HTTP 400:
'The reasoning_content in the thinking mode must be passed back to the
API.' One reported session store accumulated 4,031 poisoned messages
across 1,101 files (NousResearch#16844).

Fix: add an additive fallback that promotes the already-sanitized
reasoning_text to reasoning_content when no earlier branch wrote it AND
reasoning text was actually captured. Layered on top of the existing
SDK-attr branch and DeepSeek ''-pad (NousResearch#15250) rather than replacing them,
so every existing behavior is preserved:

- SDK-exposed reasoning_content (OpenAI/Moonshot/DeepSeek SDK) still
  wins.
- DeepSeek tool-call ''-pad still fires when the SDK exposes the attr
  but the value is None.
- Non-thinking turns with no reasoning leave the field absent, so
  _copy_reasoning_content_for_api's cross-provider leak guard (NousResearch#15748),
  promote-from-'reasoning' tier, and thinking-pad tier remain live at
  replay time.
- No empty '' gets eagerly written on every assistant turn (which would
  have bypassed the read-side ladder and triggered empty thinking-block
  insertion in the Anthropic adapter).

Tests: three new TestBuildAssistantMessage cases covering the streaming
promotion path, SDK precedence, and field-absent-when-no-reasoning
invariant.

Credit @Sanjays2402 for the original diagnosis and patch in NousResearch#16884;
this is a scoped rework that preserves the existing read-side
compensation code as defense in depth.

Refs NousResearch#16844, NousResearch#16884, NousResearch#15250, NousResearch#15353, NousResearch#15748.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants