# [Bug]: persisted assistant messages store reasoning in `reasoning` (internal) instead of `reasoning_content`, leaving sessions silently poisoned for any future DeepSeek/Kimi thinking-mode replay

## Summary

`run_agent.py` writes assistant turns to disk with the chain-of-thought stored under the **internal** field name `reasoning`, not the protocol-standard `reasoning_content`. The standard field is only persisted when the upstream SDK object happens to expose `assistant_message.reasoning_content`, which is provider-dependent. For most non-DeepSeek providers (GLM, MiniMax, GPT‑5.x via aigw / OpenAI Chat Completions wrappers) the field never gets written.

This means every assistant tool-call turn produced by those providers is **silently poisoned at write time**. The poison is invisible until the user later switches to a DeepSeek‑v4 / Kimi thinking model — which strictly requires `reasoning_content` on every replayed assistant turn — at which point HTTP 400 fires:

> `The reasoning_content in the thinking mode must be passed back to the API.`

The recently merged read-side guards (#15213, #15741, #15748, #15353) all attempt to compensate at request-build time. Each fixes one build path, but **the underlying schema mismatch on disk means every new build path is a candidate for the same 400**, and any session created under another provider becomes a latent bomb the moment the user switches models.

This issue is about the **write side**, not the read side. The proposal is to normalize the field name at persistence time so the read-side compensation code is unnecessary.

## Why this is distinct from #15213 / #15741 / #15748 / #15353

Each of those issues describes a single read path that fails to copy or inject `reasoning_content` when building the next API request, and each fix patches one path:

- #15213 — main loop after auxiliary calls
- #15741 — cron path primary tool-result handoff
- #15748 — `_copy_reasoning_content_for_api` ordering bug (cross-provider leak)
- #15353 / #15250 / #15717 — earlier surface symptoms

This issue identifies the **upstream cause**: assistant messages are persisted with the wrong field name, so every read path has to reinvent a "promote `reasoning` → `reasoning_content` (or inject `""`)" dance. Any code path that omits the dance — present or future — will fail.

The cumulative evidence below shows this is not theoretical: a single user's session store accumulated **4 031 poisoned messages across 1 101 session files**, every one of which would 400 on DeepSeek replay despite all four landed fixes being present in tree.

## Forensic data from a real install

Hermes Agent v0.11.0 (2026.4.23). After encountering the 400 with `provider=custom, model=deepseek-v4-pro` against `https://aigw.netease.com/v1`, I scanned the full session store at `~/.hermes/sessions/` and `~/.hermes/profiles/*/sessions/`:

```
Scanned files       : 1 497
Files with poison   : 1 101   (assistant + tool_calls + missing reasoning_content)
Poisoned msgs total : 4 031
```
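A scan of this kind can be sketched as follows. This is a minimal illustration, not the script that produced the numbers above. It assumes each session file is a JSON object with a top-level `messages` list, which is an assumption about the on-disk layout; `is_poisoned` and `scan_sessions` are hypothetical names.

```python
import json
from pathlib import Path


def is_poisoned(msg) -> bool:
    """Assistant turn that carries tool calls but no reasoning_content key."""
    return (
        isinstance(msg, dict)
        and msg.get("role") == "assistant"
        and bool(msg.get("tool_calls"))
        and "reasoning_content" not in msg
    )


def scan_sessions(root: Path) -> dict:
    """Count poisoned messages in every parseable *.json file under root."""
    stats = {"files": 0, "poisoned_files": 0, "poisoned_msgs": 0}
    for path in sorted(root.rglob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or non-JSON file: skip it
        stats["files"] += 1
        # ASSUMPTION: messages live under a top-level "messages" key.
        messages = (data.get("messages") or []) if isinstance(data, dict) else []
        hits = sum(1 for m in messages if is_poisoned(m))
        if hits:
            stats["poisoned_files"] += 1
            stats["poisoned_msgs"] += hits
    return stats
```

Calling `scan_sessions(Path.home() / ".hermes" / "sessions")` would return the three counters for that directory tree; profile directories would need a separate call.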

Breakdown of the 4 031 poisoned messages:

**By session.model** (top entries):

| count | model |
|------:|---|
| 3 651 | `glm-5.1` |
| 272 | `MiniMax-M2.7` |
| 74 | `gpt-5.4` |
| 21 | `MiniMax-M2.7-highspeed` |
| 11 | `claude-opus-4-6` |
| 2 | `deepseek-v4-pro` |

**By message structure**:

| signal | value | meaning |
|---|---:|---|
| has internal `reasoning` field, non-empty string | 3 603 / 4 031 (89%) | hermes captured the chain of thought, just under the wrong key |
| no reasoning at all | 428 / 4 031 (11%) | message stored without any reasoning info |
| `finish_reason == "tool_calls"` | 3 501 / 4 031 (87%) | classic tool-call termination |
| empty `content` | 3 027 / 4 031 (75%) | pure function-call turns |

**Sample poisoned message** (from a cron job that ran 2026-04-26 under `glm-5.1`):

```json
{
  "role": "assistant",
  "content": "",
  "reasoning": "Let me analyze the health check output:\n\n- CRIT: 0\n- WARN: 1 - gateway_state hasn't been updated for over 27 hours (pid=75659)\n\nI need to investigate this warning about the gateway process. Let me che…",
  "finish_reason": "tool_calls",
  "tool_calls": [ … 2 calls … ]
}
```

Note: the chain of thought **was captured** (267 chars under `reasoning`). It just isn't written under the name DeepSeek requires.

## Root cause in code

`run_agent.py` (around line 7755):

```python
msg = {
    ...
    "reasoning": reasoning_text,        # internal canonical name — always written
    "finish_reason": finish_reason,
}
if hasattr(assistant_message, "reasoning_content"):
    raw = getattr(assistant_message, "reasoning_content", None)
    if raw is not None:
        msg["reasoning_content"] = _sanitize_surrogates(raw)   # only when SDK exposed it
    elif msg.get("tool_calls") and self._needs_deepseek_tool_reasoning():
        msg["reasoning_content"] = ""                           # narrow guard, only when current provider is DeepSeek at write time
```

Two failure modes:

1. The non-DeepSeek SDK object often doesn't expose `reasoning_content` as a top-level attribute (the data lives under `delta.reasoning_content` in streaming chunks, accumulated into the local variable `reasoning_text`, and then written only to the internal `"reasoning"` key). The standard field never lands on disk.
2. The `_needs_deepseek_tool_reasoning()` guard only fires when the **current** provider is DeepSeek. If the message is written under GLM, MiniMax, or GPT and the user later switches to DeepSeek, the guard never ran when it would have helped.
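
Both failure modes can be reproduced in isolation. The sketch below copies the branch structure of the snippet above using stand-ins: `build_msg` and the `needs_deepseek` flag are hypothetical substitutes for the real method and for `self._needs_deepseek_tool_reasoning()`, the SDK object is faked with `SimpleNamespace`, and surrogate sanitizing is omitted.

```python
from types import SimpleNamespace


def build_msg(assistant_message, reasoning_text, tool_calls, needs_deepseek):
    """Stand-in for the persistence branch quoted above (same if/elif shape)."""
    msg = {
        "role": "assistant",
        "content": "",
        "tool_calls": tool_calls,
        "reasoning": reasoning_text,  # internal name, always written
        "finish_reason": "tool_calls",
    }
    if hasattr(assistant_message, "reasoning_content"):
        raw = getattr(assistant_message, "reasoning_content", None)
        if raw is not None:
            msg["reasoning_content"] = raw
        elif msg.get("tool_calls") and needs_deepseek:
            msg["reasoning_content"] = ""
    return msg


calls = [{"id": "call_1", "type": "function"}]

# Failure mode 1: the SDK object has no top-level reasoning_content attribute.
no_attr = SimpleNamespace(content="")
m1 = build_msg(no_attr, "thinking...", calls, needs_deepseek=False)

# Failure mode 2: the attribute exists but is None, and the provider at
# write time is not DeepSeek, so the narrow guard does not fire.
none_attr = SimpleNamespace(content="", reasoning_content=None)
m2 = build_msg(none_attr, "thinking...", calls, needs_deepseek=False)

# Only when DeepSeek is the provider at write time does the guard add "".
m3 = build_msg(none_attr, "thinking...", calls, needs_deepseek=True)

print("reasoning_content" in m1, "reasoning_content" in m2, "reasoning_content" in m3)
```

In the first two cases the chain of thought is present under `reasoning` but the dict has no `reasoning_content` key at all.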

The read-side `_copy_reasoning_content_for_api` does have a path that promotes `reasoning` → `reasoning_content`, and after #15748's reordering it does the right thing on the main loop. But every new code path that builds an API request from history (cron, fallback switch, auxiliary clients, ACP adapter, gateway replay, transports/chat_completions, transports/bedrock) is a fresh place where the dance can be forgotten — and #15213 / #15741 are evidence that this happens.
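
For reference, the promotion step that each of those build paths has to repeat looks roughly like this. It is a hypothetical helper written for illustration, not the body of `_copy_reasoning_content_for_api`, and it does not cover stripping internal keys or any provider-specific filtering the real code may perform.

```python
def promote_reasoning_for_replay(history):
    """Return a copy of history in which every assistant turn has reasoning_content.

    Hypothetical helper: promotes the internal "reasoning" value when it is a
    string, otherwise injects an empty string.
    """
    out = []
    for msg in history:
        msg = dict(msg)  # shallow copy so the stored history is not mutated
        if msg.get("role") == "assistant" and "reasoning_content" not in msg:
            reasoning = msg.get("reasoning")
            msg["reasoning_content"] = reasoning if isinstance(reasoning, str) else ""
        out.append(msg)
    return out
```

Any request builder that forgets to call something equivalent sends the stored dict as is, which is the situation described above.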

## Reproduction

1. Hermes v0.11.0, any non-DeepSeek thinking model that emits reasoning via `delta.reasoning_content` in streaming (e.g. `glm-5.1` over an aigw or zhipu endpoint).
2. Have at least one tool-call turn in the conversation.
3. Inspect the persisted session JSON — the assistant turn will have `"reasoning": "…"` but no `"reasoning_content"` key.
4. Switch the same session (or a new run that loads accumulated context, e.g. cron with persistent session, or an a2a sub-agent) to `deepseek-v4-pro` / `deepseek-v4-flash`.
5. The next API request that replays the message returns HTTP 400.

In my install this happened at message ~100 of a session that had been growing for a day under glm-5.1, the moment the fallback chain promoted DeepSeek to primary.

## Suggested fix

**Normalize at write time, not at read time.**

In the persistence path that builds the assistant message dict, write the chain of thought directly to `reasoning_content`, the cross-provider name the SDK ecosystem has effectively converged on. Then either drop the `reasoning` alias or keep both for one release for backward compatibility.

Concretely: at the point where `reasoning_text` is finalized for the message, write:

```python
msg["reasoning_content"] = _sanitize_surrogates(reasoning_text or "")
```

unconditionally for assistant turns. The empty string is the safest default — DeepSeek/Kimi accept it, every other provider ignores unknown empty fields, and the read side no longer needs to compensate.

This makes the four landed read-side fixes redundant safety nets rather than mandatory promotion paths, and prevents the same class of bug from recurring in future build paths.
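
A slightly fuller sketch of the same idea, including the option to keep the alias for one release. `normalize_assistant_message` and its `keep_alias` parameter are hypothetical, and `sanitize_surrogates` is only a stand-in for Hermes' `_sanitize_surrogates`, whose exact behavior is not shown in this issue (the stand-in simply drops lone surrogates).

```python
def sanitize_surrogates(text: str) -> str:
    """Stand-in sanitizer (assumption): drop code points UTF-8 cannot encode."""
    return text.encode("utf-8", errors="ignore").decode("utf-8")


def normalize_assistant_message(msg: dict, reasoning_text, keep_alias: bool = True) -> dict:
    """Always write reasoning_content on assistant turns, in place.

    reasoning_text may be None; it is then stored as "".
    """
    if msg.get("role") != "assistant":
        return msg
    msg["reasoning_content"] = sanitize_surrogates(reasoning_text or "")
    if not keep_alias:
        msg.pop("reasoning", None)  # drop the internal alias once readers have migrated
    return msg
```

Called once at the point where the message dict is finalized, this would make the presence of `reasoning_content` independent of which provider was active at write time.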

### Defense-in-depth (optional)

A startup-time migration that scans `~/.hermes/sessions/**/*.json` and adds `reasoning_content: ""` (or copies from `reasoning`) on any assistant turn missing it would clean the existing fleet. I wrote one for my install — happy to PR it if useful. It found and repaired the 4 031 messages above; total run time on 1 497 files was under 10 seconds.
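
Such a migration could look roughly like the sketch below. This is not the script mentioned above: it assumes a top-level `messages` list in each session file, rewrites the file with `json.dumps` defaults (so formatting may change), and all function names are hypothetical.

```python
import json
import os
from pathlib import Path


def repair_message(msg) -> bool:
    """Add reasoning_content to an assistant turn that lacks it. True if changed."""
    if not isinstance(msg, dict) or msg.get("role") != "assistant":
        return False
    if "reasoning_content" in msg:
        return False
    reasoning = msg.get("reasoning")
    msg["reasoning_content"] = reasoning if isinstance(reasoning, str) else ""
    return True


def repair_file(path: Path) -> int:
    """Repair one session file; return the number of messages changed."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return 0  # unreadable or non-JSON: leave untouched
    # ASSUMPTION: messages live under a top-level "messages" key.
    messages = data.get("messages") if isinstance(data, dict) else None
    if not isinstance(messages, list):
        return 0
    fixed = sum(repair_message(m) for m in messages)
    if fixed:
        # Write to a sibling temp file, then swap it in with os.replace.
        tmp = path.with_name(path.name + ".tmp")
        tmp.write_text(json.dumps(data, indent=2), encoding="utf-8")
        os.replace(tmp, path)
    return fixed


def repair_store(root: Path) -> int:
    """Repair every *.json file under root; return total messages changed."""
    return sum(repair_file(p) for p in sorted(root.rglob("*.json")))
```

The repair is idempotent: a second run finds `reasoning_content` already present and changes nothing, so running it at every startup would be safe under these assumptions.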

## Workaround for affected users

Until the write side is fixed, two things have to be done together:

1. `hermes config set agent.reasoning_effort none` (stops new poisoned writes when DeepSeek is primary)
2. Run a one-time repair across the session store to inject `reasoning_content: ""` on every poisoned message — otherwise switching to DeepSeek at any later date re-triggers the 400.

(1) alone is not enough. (2) alone gets re-poisoned the next time a non-DeepSeek provider is used.

## Environment

- Hermes Agent v0.11.0 (2026.4.23)
- Python 3.14.3
- openai 2.26.0
- macOS 26.4.1 (Darwin 25.4)
- Provider: custom, base_url `https://aigw.netease.com/v1`
- Affected models observed: `deepseek-v4-pro` (failing), `glm-5.1` / `MiniMax-M2.7` / `gpt-5.4` / `claude-opus-4-6` (poisoning sources)

## Related

- #15213 — main-loop reasoning_content guard (closed)
- #15250 — initial DeepSeek V4 Flash session-poisoning report (closed)
- #15353 — DeepSeek V4 thinking-mode tool-call 400 (closed)
- #15407 — merged
- #15478 — open
- #15717 — generic 400 surface bug (closed)
- #15741 — cron-path follow-up after #15213 closure (closed)
- #15748 — `_copy_reasoning_content_for_api` ordering bug (closed)

All of the above are read-side reports or fixes. This issue proposes a write-side fix that makes them unnecessary going forward.

