[Bug]: extended-thinking + interrupted parallel tool batch → non-retryable HTTP 400 crash-loop (stale thinking signature)

### Bug Description

Extended-thinking Claude models (4.6+, e.g. Opus 4.8) crash-loop the gateway with a **non-retryable HTTP 400** when a parallel tool batch is interrupted before every tool result returns.

These models emit a **signed `thinking` block** on assistant turns that also fire `tool_use` blocks. Anthropic signs that block against the *full, original* turn content, and on replay the block must be passed back byte-for-byte. `agent/anthropic_adapter.py::_strip_orphaned_tool_blocks()` legitimately strips a `tool_use` whose matching `tool_result` never arrived (parallel batch interrupted, context compression, or session truncation), but in doing so it **mutates the latest assistant turn**. `_manage_thinking_signatures()` then replays the signed thinking block verbatim even though its signature is now stale, and Anthropic rejects the request:

```
messages.N.content.M: `thinking` or `redacted_thinking` blocks in the latest
assistant message cannot be modified. These blocks must remain as they were
in the original response.
```

The 400 is classified as non-retryable, and the gateway rebuilds history from the persisted store on **every turn**, so it reloads the same poisoned transcript each time: an infinite crash-loop with no self-recovery. A soft session reset does not clear it, for the same reason. The `content.M` index in the error drifts between attempts only because the number of stripped `tool_use` blocks changes across rebuilds.
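
To make the loop concrete, here is a toy model. It is not Hermes code: `STORE`, `send`, and `handle_turn` are hypothetical stand-ins. It only illustrates why a deterministic rejection combined with a rebuild-from-store step cannot recover on its own.

```python
class NonRetryableError(Exception):
    """Stand-in for an HTTP 400 that the gateway will not retry."""


# Hypothetical persisted store: the poisoned transcript lives here.
STORE = {"session-1": [{"role": "assistant", "stale_signature": True}]}


def send(transcript):
    """Fake provider call: rejects any transcript with a stale signature."""
    if any(m.get("stale_signature") for m in transcript):
        raise NonRetryableError("thinking blocks cannot be modified")
    return "ok"


def handle_turn(session_id, user_text):
    """Rebuild history from the store, append the new message, and send."""
    transcript = list(STORE[session_id]) + [{"role": "user", "content": user_text}]
    try:
        return send(transcript)
    except NonRetryableError as exc:
        return f"error: {exc}"


# Every turn fails the same way because the stored history never changes.
outcomes = [handle_turn("session-1", f"message {i}") for i in range(3)]
```

Nothing in `handle_turn` ever rewrites `STORE`, so clearing in-memory state between turns would not change the outcome.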

### Steps to Reproduce

1. Run the gateway with an extended-thinking model (e.g. `claude-opus-4-8` on the native Anthropic endpoint, extended thinking ON).
2. Send a message whose first turn fires a **large parallel tool batch** with thinking enabled.
3. Have the batch interrupted before every `tool_result` comes back (e.g. `/stop`, an error on one tool, or compression mid-flight).
4. On the next turn the orphaned `tool_use` is stripped, the signed thinking block's signature no longer matches, and Anthropic 400s.
5. The gateway loops on the same error indefinitely.

Minimal unit reproduction (no network):

```python
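from agent.anthropic_adapter import convert_messages_to_anthropic

# One assistant turn with a signed thinking block and two parallel tool calls.
# Only `tc_kept` gets a tool result, so `tc_orphan` is stripped as an orphan.
messages = [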
    {"role": "assistant", "content": "",
     "tool_calls": [
         {"id": "tc_kept",   "function": {"name": "a", "arguments": "{}"}},
         {"id": "tc_orphan", "function": {"name": "b", "arguments": "{}"}},
     ],
     "reasoning_details": [
         {"type": "thinking", "thinking": "plan", "signature": "sig"},
     ]},
    {"role": "tool", "tool_call_id": "tc_kept", "content": "result A"},
]
_, result = convert_messages_to_anthropic(messages)
# Before the fix: the latest assistant turn still carries the signed `thinking`
# block whose signature was computed over the original (un-stripped) 3-block turn
# → Anthropic 400 on replay.
```
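
One way to turn this reproduction into an assertion is to scan the converted output for opaque thinking blocks on the last assistant turn. The helper below is a sketch: `thinking_blocks_in_latest_assistant` is a hypothetical name, and it assumes `result` is a list of Anthropic-style messages whose `content` is a list of block dicts, which this report does not spell out.

```python
def thinking_blocks_in_latest_assistant(messages):
    """Return signed/redacted thinking blocks on the last assistant message.

    Assumes Anthropic-style messages: dicts with "role" and a "content"
    list of block dicts. String content is treated as having no blocks.
    """
    for message in reversed(messages):
        if message.get("role") != "assistant":
            continue
        content = message.get("content")
        if not isinstance(content, list):
            return []
        return [
            block for block in content
            if isinstance(block, dict)
            and (
                (block.get("type") == "thinking" and block.get("signature"))
                or block.get("type") == "redacted_thinking"
            )
        ]
    return []


# Hand-built sample in the assumed output shape (not produced by Hermes).
sample = [
    {"role": "assistant", "content": [
        {"type": "thinking", "thinking": "plan", "signature": "sig"},
        {"type": "tool_use", "id": "tc_kept", "name": "a", "input": {}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "tc_kept", "content": "result A"},
    ]},
]
stale = thinking_blocks_in_latest_assistant(sample)
```

In a regression test, asserting that this helper returns an empty list for the reproduction's `result` would capture the fixed behavior, under the stated assumption about the output shape.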

### Expected Behavior

When a structural mutation (orphan-strip / merge / truncation) invalidates a thinking-block signature on the latest assistant turn, replaying the block verbatim is no longer an option, because the turn it was signed over has changed. Hermes should instead **demote it to plain text** so the turn replays cleanly and the model can re-plan. The gateway must not enter a non-retryable crash-loop.

### Actual Behavior

The stale signed `thinking` block is replayed verbatim, Anthropic returns a non-retryable HTTP 400 ("blocks in the latest assistant message cannot be modified"), and the gateway crash-loops because the poisoned transcript is rebuilt from the store on every turn. No self-recovery; a soft session reset does not clear it.

### Affected Component

Agent Core (conversation loop, context compression, memory)

### Messaging Platform (if gateway-related)

N/A (CLI only)

### Operating System

macOS 26.4 (Apple Silicon)

### Python Version

3.11.15

### Hermes Version

v0.15.1 (2026.5.29)

### Root Cause Analysis

`agent/anthropic_adapter.py`:

- `_strip_orphaned_tool_blocks()` removes orphaned `tool_use` blocks from assistant turns (correct), but does not account for the fact that the Anthropic signature on a co-located `thinking`/`redacted_thinking` block was computed over the *original* turn content and is therefore no longer valid.
- The latest-assistant branch of `_manage_thinking_signatures()` then replays any block with a `signature` verbatim, including the now-invalid one, which produces the 400.

The two functions are individually correct but compose into the bug: strip mutates, signature-management trusts the mutated turn's signature.
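
The composition can be shown with two toy stand-ins. These are not the real `_strip_orphaned_tool_blocks()` or `_manage_thinking_signatures()`: `strip_orphans` and `replay_signed` are hypothetical simplifications over Anthropic-style block dicts, and the `answered` set stands in for the `tool_result` IDs present in the transcript.

```python
def strip_orphans(blocks, answered):
    """Drop tool_use blocks whose id has no matching tool_result."""
    return [
        b for b in blocks
        if b["type"] != "tool_use" or b["id"] in answered
    ]


def replay_signed(blocks):
    """Keep every signed thinking block as-is; drop only unsigned ones."""
    return [b for b in blocks if b["type"] != "thinking" or b.get("signature")]


original = [
    {"type": "thinking", "thinking": "plan", "signature": "sig"},
    {"type": "tool_use", "id": "tc_kept", "name": "a", "input": {}},
    {"type": "tool_use", "id": "tc_orphan", "name": "b", "input": {}},
]

replayed = replay_signed(strip_orphans(original, answered={"tc_kept"}))

# The turn changed shape, yet the thinking block and its signature did not.
turn_was_mutated = len(replayed) != len(original)
signature_unchanged = replayed[0] == original[0]
```

Each stand-in does what its docstring says. The problem is that `replay_signed` has no way to know the signature was issued for a turn that no longer exists.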

### Proposed Fix

1. In `_strip_orphaned_tool_blocks()`, flag the turn when stripping a `tool_use` from a turn that also holds a thinking block.
2. Propagate that flag through `_merge_consecutive_roles()`.
3. In `_manage_thinking_signatures()`, demote all thinking blocks on a flagged latest turn to text (preserving the reasoning) instead of replaying a dead signature.

Intact turns are unaffected.
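
A minimal sketch of that flow, under assumptions: the flag name `_thinking_invalidated`, the function names, and the message shape are illustrative and not taken from the PR. `redacted_thinking` blocks carry no readable text, so this sketch simply drops them on a flagged turn; how the actual change treats them is not stated here.

```python
FLAG = "_thinking_invalidated"  # hypothetical marker key
THINKING_TYPES = ("thinking", "redacted_thinking")


def strip_orphans(message, answered):
    """Remove unanswered tool_use blocks; flag the turn if thinking is present."""
    kept = [
        b for b in message["content"]
        if b["type"] != "tool_use" or b["id"] in answered
    ]
    stripped = len(kept) != len(message["content"])
    has_thinking = any(b["type"] in THINKING_TYPES for b in kept)
    out = {**message, "content": kept}
    if stripped and has_thinking:
        out[FLAG] = True
    return out


def merge_consecutive(messages):
    """Merge adjacent same-role messages, carrying the flag forward."""
    merged = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            prev = merged[-1]
            prev["content"] = prev["content"] + msg["content"]
            if msg.get(FLAG):
                prev[FLAG] = True
        else:
            merged.append({**msg})
    return merged


def demote_if_flagged(message):
    """Turn thinking blocks on a flagged turn into plain text blocks."""
    if not message.get(FLAG):
        return message
    content = []
    for b in message["content"]:
        if b["type"] == "thinking":
            content.append({"type": "text", "text": b["thinking"]})
        elif b["type"] != "redacted_thinking":  # assumption: drop redacted blocks
            content.append(b)
    out = {k: v for k, v in message.items() if k != FLAG}
    out["content"] = content
    return out


turn = {"role": "assistant", "content": [
    {"type": "thinking", "thinking": "plan", "signature": "sig"},
    {"type": "tool_use", "id": "tc_kept", "name": "a", "input": {}},
    {"type": "tool_use", "id": "tc_orphan", "name": "b", "input": {}},
]}

# Interrupted batch: one tool_use is stripped, so the thinking block is demoted.
flagged = strip_orphans(turn, answered={"tc_kept"})
fixed = demote_if_flagged(merge_consecutive([flagged])[-1])

# Complete batch: nothing is stripped, so the signed block passes through.
intact = demote_if_flagged(strip_orphans(turn, answered={"tc_kept", "tc_orphan"}))
```

Keeping the flag on the message itself, rather than in a side table, means it survives any intermediate step that copies or merges messages, and removing it during demotion keeps the marker out of the outgoing request.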

**PR ready:** #35846 (with two regression tests).

### Are you willing to submit a PR for this?

- [x] I'd like to fix this myself and submit a PR

