
fix(agent): preserve interleaved thinking+tool_use order to prevent HTTP 400 crash-loop #40373

Closed
ashishpatel26 wants to merge 0 commits into NousResearch:main from ashishpatel26:fix/thinking-interleaved-order-35975

ashishpatel26 commented Jun 6, 2026

Contributor

Summary

  • Extended-thinking Claude models (Opus 4.8+) sign each thinking block against its original position in the response. When Hermes separated blocks into reasoning_details and tool_calls and replayed them as [all thinking][text][all tool_use], the positions no longer matched the signatures, causing a non-retryable HTTP 400 on every subsequent turn -- an unrecoverable crash-loop.
  • normalize_response() in agent/transports/anthropic.py now records the original interleaved block order in provider_data['_anthropic_interleaved_order'] (only when there are multiple thinking blocks alongside tool_use blocks, i.e. when reordering would actually invalidate signatures).
  • _convert_assistant_message() in agent/anthropic_adapter.py uses this order when present to reconstruct the exact original sequence (e.g. [thinking, tool_use, thinking, tool_use, ...]), with a safe fallback to the original behaviour for messages that lack the field.
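The record-then-replay idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the helper names `record_interleaved_order` and `rebuild_blocks`, the `(kind, index)` tuple format, and the block dictionaries are all hypothetical. Only the behaviour (record the order when several thinking blocks sit alongside tool_use blocks, otherwise fall back to the grouped layout) is taken from the summary.

```python
from typing import Any, Dict, List, Optional, Tuple

Block = Dict[str, Any]
Order = List[Tuple[str, int]]


def record_interleaved_order(content_blocks: List[Block]) -> Optional[Order]:
    """Hypothetical helper: capture the original block order.

    Returns None unless there are multiple thinking blocks alongside
    at least one tool_use block, mirroring the condition in the summary.
    """
    n_thinking = sum(1 for b in content_blocks if b["type"] == "thinking")
    n_tool_use = sum(1 for b in content_blocks if b["type"] == "tool_use")
    if n_thinking < 2 or n_tool_use == 0:
        return None

    counters = {"thinking": 0, "text": 0, "tool_use": 0}
    order: Order = []
    for block in content_blocks:
        kind = block["type"]
        if kind in counters:  # other block types are ignored in this sketch
            order.append((kind, counters[kind]))
            counters[kind] += 1
    return order


def rebuild_blocks(
    thinking: List[Block],
    texts: List[Block],
    tool_uses: List[Block],
    order: Optional[Order],
) -> List[Block]:
    """Hypothetical helper: replay blocks in their recorded order.

    Falls back to the grouped [thinking][text][tool_use] layout when no
    order was recorded or when the recorded order does not match the pools.
    """
    grouped = [*thinking, *texts, *tool_uses]
    if not order:
        return grouped
    pools = {"thinking": thinking, "text": texts, "tool_use": tool_uses}
    try:
        return [pools[kind][index] for kind, index in order]
    except (KeyError, IndexError):
        return grouped


# Example response with two thinking blocks interleaved with two tool calls.
original = [
    {"type": "thinking", "thinking": "plan step 1", "signature": "sig-1"},
    {"type": "tool_use", "id": "call_a", "name": "search", "input": {}},
    {"type": "thinking", "thinking": "plan step 2", "signature": "sig-2"},
    {"type": "tool_use", "id": "call_b", "name": "read", "input": {}},
]
order = record_interleaved_order(original)
thinking_pool = [b for b in original if b["type"] == "thinking"]
tool_pool = [b for b in original if b["type"] == "tool_use"]
replayed = rebuild_blocks(thinking_pool, [], tool_pool, order)
```

Storing only kinds and per-kind indices, rather than the blocks themselves, keeps the recorded order small and lets the existing separated storage remain the source of truth for block contents.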

Files changed

  • agent/transports/anthropic.py -- track interleaved order during normalize_response()
  • agent/transports/types.py -- expose anthropic_interleaved_order property on NormalizedResponse
  • agent/chat_completion_helpers.py -- propagate _anthropic_interleaved_order into the stored message dict
  • agent/anthropic_adapter.py -- use interleaved order in _convert_assistant_message() to reconstruct faithful block sequence

Test plan

  • Multi-turn conversation with an extended-thinking Anthropic model (Opus 4.8+) that produces both thinking blocks and tool_use blocks in the same response -- subsequent turns should not receive HTTP 400
  • Single thinking block with no tool calls -- falls back to original path, no regression
  • Tool calls with no thinking blocks -- falls back to original path, no regression
  • Kimi/DeepSeek reasoning_content injection path -- guard condition (_already_has_thinking) still prevents duplicate thinking blocks

Fixes #35975.

@liuhao1024

Contributor

Review: Missing attribute initialization — awaiting_real_usage_after_compression

The should_defer_preflight_to_real_usage() guard reads self.awaiting_real_usage_after_compression (line ~717), but this attribute is only assigned in compress() (line ~2087). It is never initialized in __init__.

On a fresh agent session where should_defer_preflight_to_real_usage() is called before the first compress(), this will raise:

AttributeError: 'ContextCompressor' object has no attribute 'awaiting_real_usage_after_compression'

The tests don't catch this because they explicitly set compressor.awaiting_real_usage_after_compression = True before the assertion.

Fix: add to __init__:

self.awaiting_real_usage_after_compression = False
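The failure mode and the proposed fix can be reproduced in isolation. The two classes below are hypothetical stand-ins written for illustration; they are not the project's real ContextCompressor, and only the attribute and method names come from the review.

```python
class UninitializedCompressor:
    """Stand-in that sets the flag only in compress(), as the review describes."""

    def compress(self) -> None:
        self.awaiting_real_usage_after_compression = True

    def should_defer_preflight_to_real_usage(self) -> bool:
        # Raises AttributeError if compress() has never been called.
        return self.awaiting_real_usage_after_compression


class InitializedCompressor(UninitializedCompressor):
    """Same stand-in with the suggested default added in __init__."""

    def __init__(self) -> None:
        self.awaiting_real_usage_after_compression = False


def guard_raises(compressor: UninitializedCompressor) -> bool:
    """Return True when calling the guard on this instance raises AttributeError."""
    try:
        compressor.should_defer_preflight_to_real_usage()
    except AttributeError:
        return True
    return False


fresh_without_default = guard_raises(UninitializedCompressor())
fresh_with_default = guard_raises(InitializedCompressor())
```

A regression test along these lines would call the guard on a freshly constructed object, without assigning the attribute first, which is the case the existing tests skip.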

The interleaved thinking+tool_use preservation logic (anthropic adapter, transport, types) looks correct and well-tested. The _anthropic_interleaved_order propagation through the full replay chain is solid.

alt-glitch added the labels type/bug (Something isn't working), comp/agent (Core agent loop, run_agent.py, prompt builder), provider/anthropic (Anthropic native Messages API), and P1 (High: major feature broken, no workaround) on Jun 6, 2026
ashishpatel26 force-pushed the fix/thinking-interleaved-order-35975 branch from 12c9cfb to c310089 on June 6, 2026 10:22
@ashishpatel26

Contributor Author

Thanks @liuhao1024 — good catches on both points.

  1. Spurious files: tools/approval.py, tui_gateway/server.py, context_compressor.py and tests were accidentally included. The root cause is the same as in #40370 (fix(security): strip shell escapes in command denylist; fail-closed on missing approval module (#36846 #36847)): the branch was cut from local HEAD rather than upstream/main. I reset the branch to upstream/main and cherry-picked only the thinking-interleaved commit. The PR now contains only the 4 intended files: agent/anthropic_adapter.py, agent/chat_completion_helpers.py, agent/transports/anthropic.py, agent/transports/types.py.

  2. Missing __init__ initialization: moot now that the spurious context_compressor.py changes are removed. upstream/main already sets self.awaiting_real_usage_after_compression = False in __init__, so no gap exists in the cleaned PR.

Force-pushed.

ashishpatel26 force-pushed the fix/thinking-interleaved-order-35975 branch from c310089 to 6b40736 on June 12, 2026 18:57

Development

Successfully merging this pull request may close these issues.

[Bug]: extended-thinking + interleaved multi-thinking tool turn → non-retryable HTTP 400 crash-loop (reordered thinking signature)

3 participants