Compaction produces invalid tool_use/tool_result ordering → silent fallback to wrong model #63608

@VeePeeTech

Description

Summary

After session compaction, the replayed conversation history sent to the Anthropic API contains tool_use blocks without matching tool_result blocks. Anthropic rejects this with 400 invalid_request_error. The gateway then silently falls back to the next model in the fallback chain (e.g. chatgpt/gpt-5.4) and never automatically recovers to the primary model.

This causes the session to silently run on the wrong model indefinitely until manually corrected via session_status(model=...) or /model.

Environment

  • OpenClaw version: 2026.4.9 (0512059)
  • OS: macOS 26.3.1 (arm64)
  • Primary model: anthropic/claude-opus-4-6
  • Fallbacks: chatgpt/gpt-5.4, anthropic/claude-opus-4-20250514, venice/claude-opus-4-6, venice/gpt-5.4, venice/claude-sonnet-4-6
  • Affected session: agent:main:telegram:group:-1003753641666 (session ID 59f592d3-6901-43e3-ab07-3bf93f75bae3)
  • Compactions in session: 31+

Exact Error

[agent] embedded run agent end: isError=true model=claude-opus-4-6 provider=anthropic
error=LLM request rejected: messages.17: `tool_use` ids were found without `tool_result`
blocks immediately after: call8ouuvbBmWSXgA8nv98RDlHeQ. Each `tool_use` block must have
a corresponding `tool_result` block in the next message.
rawError=400 {"type":"error","error":{"type":"invalid_request_error",...}}

Evidence from gateway.err.log (2026-04-08)

Multiple occurrences throughout the evening, each triggering the same pattern:

Time (PDT)  tool_use ID                   Message index  Fallback target
20:13:35    call8ouuvbBmWSXgA8nv98RDlHeQ  messages.17    chatgpt/gpt-5.4
20:30:15    call8ouuvbBmWSXgA8nv98RDlHeQ  messages.11    chatgpt/gpt-5.4
20:36:20    call8ouuvbBmWSXgA8nv98RDlHeQ  messages.11    chatgpt/gpt-5.4
20:41:45    callPAoAtqCzFifk2JSOMQPmDIcN  messages.3     chatgpt/gpt-5.4
23:11:08    callIXN6A0bDrDpHjAfWDECc0xY2  messages.7     chatgpt/gpt-5.4
23:31:11    callH6nRwHU5OWsAzGuhjddMuOoz  messages.13    chatgpt/gpt-5.4

Note the same tool_use ID (call8ouuvbBmWSXgA8nv98RDlHeQ) appears in the first three errors at different message indices, suggesting the orphaned tool_use persists across compactions and shows up at different positions depending on what else was compacted.
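
For anyone trying to reproduce this grouping from their own gateway.err.log, a small script along these lines could help. It is only a sketch: the regular expression is written against the rejection text quoted under "Exact Error", and the function name group_orphans is made up for illustration. It assumes the log has already been read into a string.

```python
import re
from collections import defaultdict

# Matches the Anthropic rejection text quoted under "Exact Error".
# \s+ tolerates the message being wrapped across lines.
ORPHAN_RE = re.compile(
    r"messages\.(?P<index>\d+): `tool_use` ids were found without `tool_result`\s+"
    r"blocks immediately after: (?P<ids>[^.]+)\."
)


def group_orphans(log_text: str) -> dict[str, list[int]]:
    """Map each orphaned tool_use ID to the message indices it was reported at."""
    seen: dict[str, list[int]] = defaultdict(list)
    for match in ORPHAN_RE.finditer(log_text):
        # The error may list several comma-separated IDs.
        for tool_id in match.group("ids").split(","):
            seen[tool_id.strip()].append(int(match.group("index")))
    return dict(seen)
```

An ID that maps to more than one index, as call8ouuvbBmWSXgA8nv98RDlHeQ does above, is a hint that the same orphan survived a later compaction.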

Sequence of Events

  1. Session runs normally on anthropic/claude-opus-4-6
  2. Context grows → auto-compaction triggers
  3. Compaction summary is generated (often by chatgpt/gpt-5.4 per gateway.log: auto-compaction succeeded for chatgpt/gpt-5.4)
  4. Compacted conversation replayed to Anthropic API
  5. Replay contains orphaned tool_use without matching tool_result
  6. Anthropic returns 400 invalid_request_error
  7. Gateway logs [model-fallback] decision=candidate_failed reason=overloaded (note: mislabeled as "overloaded" — it is actually an invalid request)
  8. Falls back to chatgpt/gpt-5.4 which succeeds
  9. Session stays on fallback model permanently — no recovery mechanism

Three bugs

Bug 1: Compaction produces invalid conversation history

The compacted/summarized conversation is missing tool_result blocks for some tool_use calls. Anthropic strictly requires every tool_use to have a corresponding tool_result in the next message.
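
A pre-flight check along the following lines could catch the problem before the request is sent. This is a minimal sketch, not OpenClaw code: it assumes the replayed history is already in the Anthropic Messages API shape (a list of role/content dicts whose content is a list of typed blocks), and find_orphaned_tool_use is a hypothetical name.

```python
from typing import Any

Message = dict[str, Any]


def _blocks(message: Message) -> list[dict[str, Any]]:
    """Return the typed content blocks of a message (empty for plain-string content)."""
    content = message.get("content")
    if not isinstance(content, list):
        return []
    return [b for b in content if isinstance(b, dict)]


def find_orphaned_tool_use(messages: list[Message]) -> list[tuple[int, str]]:
    """Return (message_index, tool_use_id) for each tool_use with no tool_result in the next message."""
    orphans: list[tuple[int, str]] = []
    for i, message in enumerate(messages):
        if message.get("role") != "assistant":
            continue
        nxt = messages[i + 1] if i + 1 < len(messages) else {}
        # Only a user message directly after the assistant turn can answer it.
        next_blocks = _blocks(nxt) if nxt.get("role") == "user" else []
        answered = {
            b.get("tool_use_id") for b in next_blocks if b.get("type") == "tool_result"
        }
        for block in _blocks(message):
            if block.get("type") == "tool_use" and block.get("id") not in answered:
                orphans.append((i, str(block.get("id"))))
    return orphans
```

A non-empty result means the request would be rejected in the way shown under "Exact Error", so the gateway could repair or re-compact instead of falling back.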

Bug 2: No automatic recovery to primary model

After a fallback succeeds, the session remains on the fallback model. There is no mechanism to retry the primary model on subsequent messages. The user/agent must manually reset via session_status(model=...).
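
One possible shape for the missing recovery is to rebuild the candidate list from the primary on every turn, rather than starting from whichever model answered last. The sketch below is illustrative only: SessionModelState, run_turn, and the send callback are hypothetical names, and failures are modelled as RuntimeError for simplicity.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SessionModelState:
    primary: str
    fallbacks: list[str]
    active: str = ""

    def __post_init__(self) -> None:
        # A fresh session starts on the primary model.
        self.active = self.active or self.primary


def run_turn(state: SessionModelState, send: Callable[[str], str]) -> str:
    """Try the primary first on every turn, then each fallback in order."""
    last_error: Exception | None = None
    for model in [state.primary, *state.fallbacks]:
        try:
            reply = send(model)
        except RuntimeError as exc:
            last_error = exc
            continue
        # Record which model actually answered this turn.
        state.active = model
        return reply
    raise RuntimeError("all models failed") from last_error
```

The trade-off is one wasted request per turn while the primary keeps failing, so a real implementation might add a cooldown or only retry once the history has been repaired.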

Bug 3 (minor): Incorrect fallback reason

The fallback log labels the reason as reason=overloaded when the actual error is a 400 invalid_request_error (malformed conversation). This makes debugging harder.
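
As a rough illustration of what accurate labelling could look like, the reason can be derived from the provider's error type and HTTP status instead of defaulting to overloaded. The function name and the rate_limited, server_error, and unknown labels are assumptions for this sketch. The error type strings are the ones Anthropic documents for its API, where overloaded_error corresponds to HTTP 529.

```python
from typing import Optional

# Provider error type -> fallback reason label (labels are illustrative).
_REASON_BY_ERROR_TYPE = {
    "invalid_request_error": "invalid_request",
    "rate_limit_error": "rate_limited",
    "overloaded_error": "overloaded",
}


def classify_fallback_reason(status: int, error_type: Optional[str] = None) -> str:
    """Pick a fallback reason from the error type, falling back to the HTTP status."""
    if error_type in _REASON_BY_ERROR_TYPE:
        return _REASON_BY_ERROR_TYPE[error_type]
    if status == 429:
        return "rate_limited"
    if status == 529:
        return "overloaded"
    if 400 <= status < 500:
        return "invalid_request"
    if status >= 500:
        return "server_error"
    return "unknown"
```

With this mapping, the 400 invalid_request_error from the log above would be reported as reason=invalid_request.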

Expected Behavior

  1. Compaction should always produce a history in which every tool_use has a matching tool_result in the next message, or strip orphaned tool_use blocks during compaction
  2. After a temporary fallback, the gateway should attempt the primary model again on the next user message
  3. Fallback reason should accurately reflect the error type (e.g. reason=invalid_request not reason=overloaded)
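
The "strip orphaned tool_use blocks" option from item 1 could look roughly like this. It is a sketch under the assumption that the compacted history uses the Anthropic Messages API block format. strip_orphaned_tool_use and the placeholder text are hypothetical, and this is not how OpenClaw's compaction is known to work.

```python
import copy
from typing import Any

Message = dict[str, Any]

# Hypothetical placeholder so a stripped assistant turn is never left empty.
PLACEHOLDER = {"type": "text", "text": "[tool call removed during compaction]"}


def strip_orphaned_tool_use(messages: list[Message]) -> list[Message]:
    """Return a copy of messages with unanswered tool_use blocks removed."""
    repaired = copy.deepcopy(messages)
    for i, message in enumerate(repaired):
        content = message.get("content")
        if message.get("role") != "assistant" or not isinstance(content, list):
            continue
        nxt = repaired[i + 1] if i + 1 < len(repaired) else {}
        nxt_content = nxt.get("content")
        # IDs answered by tool_result blocks in the immediately following message.
        answered = {
            b.get("tool_use_id")
            for b in (nxt_content if isinstance(nxt_content, list) else [])
            if isinstance(b, dict) and b.get("type") == "tool_result"
        }
        kept = [
            b
            for b in content
            if not (
                isinstance(b, dict)
                and b.get("type") == "tool_use"
                and b.get("id") not in answered
            )
        ]
        message["content"] = kept or [dict(PLACEHOLDER)]
    return repaired
```

Stripping loses the record that a tool was called. An alternative with the same effect on validity would be to insert a synthetic tool_result noting that the output was dropped during compaction.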

Workaround

Manually run session_status(model="anthropic/claude-opus-4-6") after each detected drift, or start a new session with /new. Neither is automatic.
