[Bug]: persisted assistant messages store reasoning in reasoning (internal) instead of reasoning_content, leaving sessions silently poisoned for any future DeepSeek/Kimi thinking-mode replay
Summary
run_agent.py writes assistant turns to disk with the chain-of-thought stored under the internal field name reasoning, not the protocol-standard reasoning_content. The standard field is only persisted when the upstream SDK object happens to expose assistant_message.reasoning_content, which is provider-dependent. For most non-DeepSeek providers (GLM, MiniMax, GPT‑5.x via aigw / OpenAI Chat Completions wrappers) the field never gets written.
This means every assistant tool-call turn produced by those providers is silently poisoned at write time. The poison is invisible until the user later switches to a DeepSeek‑v4 / Kimi thinking model — which strictly requires reasoning_content on every replayed assistant turn — at which point HTTP 400 fires:
The reasoning_content in the thinking mode must be passed back to the API.
The recently merged read-side guards (#15213, #15741, #15748, #15353) all attempt to compensate at request-build time. They each fix one build path. But the underlying schema mismatch on disk means every new build path is a candidate for the same 400, and any session created by another provider becomes a latent bomb the moment the user switches model.
This issue is about the write side, not the read side. The proposal is to normalize the field name at persistence time so the read-side compensation code is unnecessary.
Why this is distinct from #15213 / #15741 / #15748 / #15353
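Those issues all describe a single read path that fails to copy or inject reasoning_content when building the next API request. Each fix patches one path; #15748, for example, fixed an ordering bug in _copy_reasoning_content_for_api (a cross-provider leak).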
This issue identifies the upstream cause: assistant messages are persisted with the wrong field name, so every read path has to reinvent a "promote reasoning → reasoning_content (or inject "")" dance. Any code path that omits the dance — present or future — will fail.
The cumulative evidence below shows this is not theoretical: a single user's session store accumulated 4 031 poisoned messages across 1 101 session files, every one of which would 400 on DeepSeek replay despite all four landed fixes being present in tree.
Forensic data from a real install
Hermes Agent v0.11.0 (2026.4.23). After encountering the 400 with provider=custom, model=deepseek-v4-pro against https://aigw.netease.com/v1, I scanned the full session store at ~/.hermes/sessions/ and ~/.hermes/profiles/*/sessions/:
Scanned files : 1 497
Files with poison : 1 101 (assistant + tool_calls + missing reasoning_content)
Poisoned msgs total : 4 031
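For anyone who wants to check their own store, a scan along these lines should produce comparable counts. This is a minimal sketch, not the script used for the numbers above: the function names are hypothetical, and it assumes each session file is a JSON object with a top-level "messages" list, which may not match the real on-disk layout.

```python
import json
from pathlib import Path


def is_poisoned(msg) -> bool:
    """Assistant turn with tool_calls but no reasoning_content key."""
    return (
        isinstance(msg, dict)
        and msg.get("role") == "assistant"
        and bool(msg.get("tool_calls"))
        and "reasoning_content" not in msg
    )


def scan_sessions(roots) -> dict:
    """Count session files and poisoned messages under the given directories."""
    files = poisoned_files = poisoned_msgs = 0
    for root in roots:
        base = Path(root).expanduser()
        if not base.is_dir():
            continue
        for path in sorted(base.rglob("*.json")):
            try:
                data = json.loads(path.read_text(encoding="utf-8"))
            except (OSError, ValueError):
                continue  # unreadable or non-JSON file: skip it
            files += 1
            # ASSUMPTION: messages live under a top-level "messages" key.
            messages = data.get("messages") if isinstance(data, dict) else None
            if not isinstance(messages, list):
                continue
            hits = sum(1 for m in messages if is_poisoned(m))
            if hits:
                poisoned_files += 1
                poisoned_msgs += hits
    return {
        "files": files,
        "poisoned_files": poisoned_files,
        "poisoned_msgs": poisoned_msgs,
    }


if __name__ == "__main__":
    hermes = Path("~/.hermes").expanduser()
    roots = [hermes / "sessions", *hermes.glob("profiles/*/sessions")]
    print(scan_sessions(roots))
```

The predicate mirrors the definition used in the report (assistant + tool_calls + missing reasoning_content); an empty-string reasoning_content counts as healthy.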
Breakdown of the 4 031 poisoned messages:
By session.model (top entries):
| count | model |
|---|---|
| 3 651 | glm-5.1 |
| 272 | MiniMax-M2.7 |
| 74 | gpt-5.4 |
| 21 | MiniMax-M2.7-highspeed |
| 11 | claude-opus-4-6 |
| 2 | deepseek-v4-pro |
By message structure:
| signal | value | meaning |
|---|---|---|
| has internal reasoning field, non-empty string | 3 603 / 4 031 (89%) | hermes captured the chain of thought, just under the wrong key |
| no reasoning at all | 428 / 4 031 (11%) | message stored without any reasoning info |
| finish_reason == "tool_calls" | 3 501 / 4 031 (87%) | classic tool-call termination |
| empty content | 3 027 / 4 031 (75%) | pure function-call turns |
Sample poisoned message (from a cron job that ran 2026-04-26 under glm-5.1):
    {
      "role": "assistant",
      "content": "",
      "reasoning": "Let me analyze the health check output:\n\n- CRIT: 0\n- WARN: 1 - gateway_state hasn't been updated for over 27 hours (pid=75659)\n\nI need to investigate this warning about the gateway process. Let me che…",
      "finish_reason": "tool_calls",
      "tool_calls": [ … 2 calls … ]
    }
Note: the chain of thought was captured (267 chars under reasoning). It just isn't written under the name DeepSeek requires.
Root cause in code
run_agent.py (around line 7755):
    msg = {
        ...
        "reasoning": reasoning_text,  # internal canonical name, always written
        "finish_reason": finish_reason,
    }
    if hasattr(assistant_message, "reasoning_content"):
        raw = getattr(assistant_message, "reasoning_content", None)
        if raw is not None:
            msg["reasoning_content"] = _sanitize_surrogates(raw)  # only when SDK exposed it
    elif msg.get("tool_calls") and self._needs_deepseek_tool_reasoning():
        msg["reasoning_content"] = ""  # narrow guard, only when current provider is DeepSeek at write time
Two failure modes:
- The non-DeepSeek SDK object often doesn't expose reasoning_content as a top-level attribute (the data lives under delta.reasoning_content in streaming chunks, accumulated into the local variable reasoning_text, and then written only to the internal "reasoning" key). The standard field never lands on disk.
- The _needs_deepseek_tool_reasoning() guard only fires when the current provider is DeepSeek. If the message is being written under glm/minimax/gpt and the user later switches to DeepSeek, the guard never ran when it would have helped.
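The two failure modes can be illustrated with a standalone simulation of the branch logic quoted above. This is a sketch only: persist_like_current_code is a hypothetical stand-in that mirrors the excerpt, SimpleNamespace objects stand in for SDK message objects, the provider check is reduced to a boolean, and the surrogate sanitization step is omitted.

```python
from types import SimpleNamespace


def persist_like_current_code(assistant_message, reasoning_text, tool_calls,
                              provider_is_deepseek):
    """Hypothetical mirror of the write-path branches quoted in the excerpt."""
    msg = {
        "role": "assistant",
        "content": "",
        "reasoning": reasoning_text,  # internal name, always written
        "finish_reason": "tool_calls" if tool_calls else "stop",
    }
    if tool_calls:
        msg["tool_calls"] = tool_calls
    if hasattr(assistant_message, "reasoning_content"):
        raw = getattr(assistant_message, "reasoning_content", None)
        if raw is not None:
            msg["reasoning_content"] = raw  # real code also sanitizes surrogates
    elif msg.get("tool_calls") and provider_is_deepseek:
        msg["reasoning_content"] = ""
    return msg


calls = [{"id": "call_1", "type": "function",
          "function": {"name": "noop", "arguments": "{}"}}]

# Failure mode 1 + 2: the SDK object has no reasoning_content attribute and
# the current provider is not DeepSeek, so neither branch writes the field.
glm_like = SimpleNamespace(content="")
poisoned = persist_like_current_code(glm_like, "thinking...", calls,
                                     provider_is_deepseek=False)

# Same SDK object, but written while DeepSeek is the current provider:
# the narrow guard fires and an empty string is stored.
guarded = persist_like_current_code(glm_like, "thinking...", calls,
                                    provider_is_deepseek=True)

# SDK object that exposes the attribute: the field is persisted.
deepseek_like = SimpleNamespace(content="", reasoning_content="thinking...")
healthy = persist_like_current_code(deepseek_like, "thinking...", calls,
                                    provider_is_deepseek=False)

print("reasoning_content" in poisoned, "reasoning_content" in guarded,
      "reasoning_content" in healthy)
```

In the first case the chain of thought is still present under "reasoning", which matches the 89% figure in the forensic data.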
The read-side _copy_reasoning_content_for_api does have a path that promotes reasoning → reasoning_content, and after #15748's reordering it does the right thing on the main loop. But every new code path that builds an API request from history (cron, fallback switch, auxiliary clients, ACP adapter, gateway replay, transports/chat_completions, transports/bedrock) is a fresh place where the dance can be forgotten — and #15213 / #15741 are evidence that this happens.
Reproduction
- Hermes v0.11.0, any non-DeepSeek thinking model that emits reasoning via delta.reasoning_content in streaming (e.g. glm-5.1 over an aigw or zhipu endpoint).
- Have at least one tool-call turn in the conversation.
- Inspect the persisted session JSON: the assistant turn will have "reasoning": "…" but no "reasoning_content" key.
- Switch the same session (or a new run that loads accumulated context, e.g. cron with persistent session, or an a2a sub-agent) to deepseek-v4-pro / deepseek-v4-flash.
- The next API request that replays the message returns HTTP 400.
In my install this happened at message ~100 of a session that had been growing for a day under glm-5.1, the moment the fallback chain promoted DeepSeek to primary.
Suggested fix
Normalize at write time, not at read time.
In the persistence path that builds the assistant message dict, write the chain of thought to reasoning_content directly (which is the standard cross-provider name; the SDK ecosystem has effectively converged on this), and either drop the reasoning alias or keep both for one release for backward compat.
Concretely: at the point where reasoning_text is finalized for the message, write:
    msg["reasoning_content"] = _sanitize_surrogates(reasoning_text or "")
unconditionally for assistant turns. The empty string is the safest default — DeepSeek/Kimi accept it, every other provider ignores unknown empty fields, and the read side no longer needs to compensate.
This makes the four landed read-side fixes redundant safety nets rather than mandatory promotion paths, and prevents the same class of bug from recurring in future build paths.
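As a rough illustration of the proposed write-side normalization, the message builder could look something like the sketch below. The function name, its parameters, and the keep_legacy_alias flag are hypothetical, and sanitize_surrogates is a simplified stand-in for the project's _sanitize_surrogates, whose actual behavior is not shown in this issue.

```python
def sanitize_surrogates(text: str) -> str:
    """Simplified stand-in: replace characters that cannot be encoded as UTF-8."""
    return text.encode("utf-8", errors="replace").decode("utf-8")


def build_assistant_message(content, reasoning_text, finish_reason,
                            tool_calls=None, keep_legacy_alias=True):
    """Hypothetical builder that always writes reasoning_content."""
    msg = {
        "role": "assistant",
        "content": content or "",
        "finish_reason": finish_reason,
    }
    if tool_calls:
        msg["tool_calls"] = tool_calls
    # Written unconditionally, regardless of provider or SDK object shape.
    msg["reasoning_content"] = sanitize_surrogates(reasoning_text or "")
    if keep_legacy_alias:
        # Optional: keep the internal key for one release for backward compat.
        msg["reasoning"] = msg["reasoning_content"]
    return msg


with_thoughts = build_assistant_message("", "Let me check the gateway.",
                                        "tool_calls", tool_calls=[{"id": "call_1"}])
without_thoughts = build_assistant_message("done", None, "stop",
                                           keep_legacy_alias=False)
```

Because the field no longer depends on the current provider, a later switch to a model that requires reasoning_content finds it on every assistant turn.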
Defense-in-depth (optional)
A startup-time migration that scans ~/.hermes/sessions/**/*.json and adds reasoning_content: "" (or copies from reasoning) on any assistant turn missing it would clean the existing fleet. I wrote one for my install — happy to PR it if useful. It found and repaired the 4 031 messages above; total run time on 1 497 files was under 10 seconds.
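A repair pass of that kind might look roughly like the following. This is a sketch under assumptions, not the script referred to above: the function names are hypothetical, it assumes a top-level "messages" list in each session file, and it rewrites files through a temporary file plus os.replace so an interrupted run should not leave a half-written session.

```python
import json
import os
import tempfile
from pathlib import Path


def repair_message(msg) -> bool:
    """Add reasoning_content to an assistant turn that lacks it."""
    if not isinstance(msg, dict) or msg.get("role") != "assistant":
        return False
    if "reasoning_content" in msg:
        return False
    legacy = msg.get("reasoning")
    # Copy from the internal key when it holds text, else inject "".
    msg["reasoning_content"] = legacy if isinstance(legacy, str) else ""
    return True


def repair_file(path: Path) -> int:
    """Repair one session file in place; return the number of fixed messages."""
    data = json.loads(path.read_text(encoding="utf-8"))
    # ASSUMPTION: messages live under a top-level "messages" key.
    messages = data.get("messages") if isinstance(data, dict) else None
    if not isinstance(messages, list):
        return 0
    repaired = sum(repair_message(m) for m in messages)
    if repaired:
        fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as fh:
                json.dump(data, fh)  # default ensure_ascii keeps output ASCII-safe
            os.replace(tmp, path)  # swap the repaired file into place
        except BaseException:
            os.unlink(tmp)
            raise
    return repaired
```

Running it twice is harmless: the second pass finds reasoning_content already present and leaves the file untouched.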
Workaround for affected users
Until the write side is fixed, two things have to be done together:
1. hermes config set agent.reasoning_effort none (stops new poisoned writes when DeepSeek is primary)
2. Run a one-time repair across the session store to inject reasoning_content: "" on every poisoned message; otherwise switching to DeepSeek at any later date re-triggers the 400.
(1) alone is not enough. (2) alone gets re-poisoned the next time a non-DeepSeek provider is used.
Environment
- Hermes Agent v0.11.0 (2026.4.23)
- Python 3.14.3
- openai 2.26.0
- macOS 26.4.1 (Darwin 25.4)
- Provider: custom, base_url https://aigw.netease.com/v1
- Affected models observed: deepseek-v4-pro (failing), glm-5.1 / MiniMax-M2.7 / gpt-5.4 / claude-opus-4-6 (poisoning sources)
Related
The issues referenced above (#15213, #15741, #15748, #15353) are all read-side fixes. This issue proposes a write-side fix that makes them unnecessary going forward.