Skip to content

fix(anthropic): demote dead thinking signature when orphan-strip mutates the latest turn#35859

Merged
teknium1 merged 2 commits into
mainfrom
hermes/hermes-b941b493
May 31, 2026
Merged

fix(anthropic): demote dead thinking signature when orphan-strip mutates the latest turn#35859
teknium1 merged 2 commits into
mainfrom
hermes/hermes-b941b493

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

Extended-thinking Claude models (Opus 4.8 etc.) no longer crash-loop the gateway with a non-retryable HTTP 400 when a parallel tool batch is interrupted before every tool result returns.

Root cause: _strip_orphaned_tool_blocks() legitimately removes a tool_use whose matching tool_result never arrived, but that mutates the latest assistant turn. _manage_thinking_signatures() then replays the signed thinking block verbatim — its Anthropic signature was computed over the original (un-stripped) turn, so Anthropic rejects it: thinking blocks in the latest assistant message cannot be modified. The 400 is non-retryable and the transcript is rebuilt from the store every turn → infinite loop.

Salvage of #35846 by @fesalfayed onto current main (cherry-picked, authorship preserved). Fixes #35847.

Changes

  • agent/anthropic_adapter.py: flag a turn _thinking_signature_invalidated when orphan-strip mutates a thinking-bearing turn; propagate the flag through _merge_consecutive_roles; in _manage_thinking_signatures demote the dead signed thinking block to a plain text block (reasoning preserved) instead of replaying it. Intact turns keep their signed thinking verbatim. Internal flag stripped before payload send.
  • tests/agent/test_anthropic_adapter.py: regression + control test.
  • scripts/release.py: AUTHOR_MAP entry for the contributor email (our follow-up commit).

Validation

Before After
Orphaned parallel tool_use stripped, latest turn signed thinking replayed → HTTP 400 crash-loop demoted to text, reasoning preserved, answered tool_use survives
Intact latest turn signed thinking verbatim signed thinking verbatim (fix does not over-fire)

Targeted suite: 23 passed (-k "thinking or signature or orphan or merge or preserved or redacted"). E2E on real convert_messages_to_anthropic (no mocks) confirms both cases.

Infographic

dead-thinking-signature-demote

fesalfayed and others added 2 commits May 31, 2026 05:36
…tes the latest turn

Extended-thinking Claude models (4.6+, e.g. Opus 4.8) emit a signed `thinking`
block on assistant turns that also carry parallel `tool_use` blocks. Anthropic
signs that block against the full, original turn content.

When a parallel tool batch is interrupted before every `tool_result` returns,
`_strip_orphaned_tool_blocks` removes the unanswered `tool_use` on replay — which
mutates the turn. The latest-assistant branch of `_manage_thinking_signatures`
then replays the now-stale signed thinking block verbatim, and Anthropic rejects
the request with a non-retryable HTTP 400:

    messages.N.content.M: `thinking` or `redacted_thinking` blocks in the latest
    assistant message cannot be modified. These blocks must remain as they were
    in the original response.

Because the poisoned turn is rebuilt from the persisted store every turn, the
gateway crash-loops with no self-recovery (a soft session reset does not clear
it). The drifting content index in the error is the changing count of stripped
`tool_use` blocks across rebuilds.

Fix: when orphan-stripping removes a `tool_use` from a turn that also holds a
thinking/redacted_thinking block, flag the turn. `_manage_thinking_signatures`
then demotes every thinking block on that latest turn to a plain text block
(preserving the reasoning text) instead of replaying a signature that can no
longer validate. An intact turn is unaffected — its signed thinking is still
replayed verbatim. The internal flag is stripped before the payload is sent.

Adds two regression tests:
- demotion when an orphaned parallel tool_use is stripped
- control: signed thinking preserved verbatim when nothing is stripped
@github-actions

Copy link
Copy Markdown
Contributor

🔎 Lint report: hermes/hermes-b941b493 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9542 on HEAD, 9542 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4950 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround provider/anthropic Anthropic native Messages API type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: extended-thinking + interrupted parallel tool batch → non-retryable HTTP 400 crash-loop (stale thinking signature)

3 participants