You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Extended-thinking Claude models (Opus 4.8+) sign each thinking block against its original position in the response. When Hermes separated blocks into reasoning_details and tool_calls and replayed them as [all thinking][text][all tool_use], the positions no longer matched the signatures, causing a non-retryable HTTP 400 on every subsequent turn -- an unrecoverable crash-loop.
normalize_response() in agent/transports/anthropic.py now records the original interleaved block order in provider_data['_anthropic_interleaved_order'] (only when there are multiple thinking blocks alongside tool_use blocks, i.e. when reordering would actually invalidate signatures).
_convert_assistant_message() in agent/anthropic_adapter.py uses this order when present to reconstruct the exact original sequence (e.g. [thinking, tool_use, thinking, tool_use, ...]), with a safe fallback to the original behaviour for messages that lack the field.
Files changed
agent/transports/anthropic.py -- track interleaved order during normalize_response()
agent/transports/types.py -- expose anthropic_interleaved_order property on NormalizedResponse
agent/chat_completion_helpers.py -- propagate _anthropic_interleaved_order into the stored message dict
agent/anthropic_adapter.py -- use interleaved order in _convert_assistant_message() to reconstruct faithful block sequence
Test plan
Multi-turn conversation with an extended-thinking Anthropic model (Opus 4.8+) that produces both thinking blocks and tool_use blocks in the same response -- subsequent turns should not receive HTTP 400
Single thinking block with no tool calls -- falls back to original path, no regression
Tool calls with no thinking blocks -- falls back to original path, no regression
The should_defer_preflight_to_real_usage() guard reads self.awaiting_real_usage_after_compression (line ~717), but this attribute is only assigned in compress() (line ~2087). It is never initialized in __init__.
On a fresh agent session where should_defer_preflight_to_real_usage() is called before the first compress(), this will raise:
AttributeError: 'ContextCompressor' object has no attribute 'awaiting_real_usage_after_compression'
The tests don't catch this because they explicitly set compressor.awaiting_real_usage_after_compression = True before the assertion.
Fix: add to __init__:
self.awaiting_real_usage_after_compression=False
The interleaved thinking+tool_use preservation logic (anthropic adapter, transport, types) looks correct and well-tested. The _anthropic_interleaved_order propagation through the full replay chain is solid.
Spurious files: tools/approval.py, tui_gateway/server.py, context_compressor.py and tests were accidentally included (same root cause as fix(security): strip shell escapes in command denylist; fail-closed on missing approval module (#36846 #36847) #40370 — branch was off local HEAD not upstream/main). Reset branch to upstream/main and cherry-picked only the thinking-interleaved commit. PR now contains only the 4 intended files: agent/anthropic_adapter.py, agent/chat_completion_helpers.py, agent/transports/anthropic.py, agent/transports/types.py.
Missing init init: moot now that the spurious context_compressor.py changes are removed. upstream/main already carries self.awaiting_real_usage_after_compression = False in init so no gap exists in the cleaned PR.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
comp/agentCore agent loop, run_agent.py, prompt builderP1High — major feature broken, no workaroundprovider/anthropicAnthropic native Messages APItype/bugSomething isn't working
3 participants
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Files changed
Test plan
Fixes #35975.