Truncated tool_call.arguments in conversation history wedges the retry loop
Summary
If a streamed assistant response is cut off mid input_json_delta, the resulting tool_call.arguments is saved to conversation history as a truncated (non-JSON) string. Every subsequent API request re-sends that history, and Anthropic-compatible proxies (LiteLLM in particular) reject it with an HTTP 400: Failed to parse tool call arguments. The retry loop cannot recover because the bad message is deterministically pinned in history: every retry, and the fallback, fails the same way.
Reproduction
- Run through a LiteLLM proxy to an Anthropic-family model.
- Have the upstream connection drop (transient network error, proxy timeout, etc.) while the model is mid input_json_delta for a tool call.
- Observe the truncated tool_call.arguments land in messages for the current session.
- Any subsequent turn then produces:
⚠️ API call failed (attempt 1/3): BadRequestError [HTTP 400]
Error: HTTP 400: litellm.BadRequestError: AnthropicException -
Failed to parse tool call arguments for tool 'patch' (Anthropic tool invoke).
Error: Unterminated string starting at: line 1 column 76 (char 75).
Arguments: {"path": "~/...file.py", "old_string": " # Anthropic think
All 3 retries fail identically. Fallback to the direct Anthropic endpoint would fail the same way for the same reason (it's the outbound history that's malformed, not the provider).
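The parse failure itself can be reproduced without a proxy. The sketch below is illustrative only: the fragments are made-up stand-ins for the partial JSON carried by successive input_json_delta events, and accumulate and is_valid_json_object are hypothetical helper names, not functions from the codebase.

```python
import json

# Hypothetical fragments standing in for the partial JSON of successive
# input_json_delta events. The stream is assumed to drop after the second one.
fragments = ['{"path": "example.py", ', '"old_string": "    # Anthro']


def accumulate(parts):
    """Concatenate streamed fragments the way a client would buffer them."""
    return "".join(parts)


def is_valid_json_object(raw):
    """Return True only if raw parses as a JSON object."""
    try:
        return isinstance(json.loads(raw), dict)
    except (json.JSONDecodeError, ValueError):
        return False


# What ends up in history when the connection drops mid-stream.
truncated = accumulate(fragments)

# What would have been stored had the final fragment arrived.
complete = accumulate(fragments + ['pic think", "new_string": ""}'])

print(is_valid_json_object(truncated))  # the truncated buffer does not parse
print(is_valid_json_object(complete))   # the full buffer does
```

Because the truncated string is stored verbatim, every later request carries the same unparseable payload, which is why retrying cannot help.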
Root cause
The Anthropic adapter (agent/anthropic_adapter.py:1133) guards this path:
    try:
        parsed_args = json.loads(args) if isinstance(args, str) else args
    except (json.JSONDecodeError, ValueError):
        parsed_args = {}
But the Anthropic transport delegates to the adapter only after its own convert_messages, and the chat-completions transport (agent/transports/chat_completions.py) — which is what drives LiteLLM/OpenRouter/etc. — has no guard at all. It forwards tool_calls[*].function.arguments through as-is, and LiteLLM's internal OpenAI → Anthropic converter raises when it hits malformed JSON.
Proposed fix
Sanitize at the outbound boundary in both transports' convert_messages. Drop (do not repair) malformed tool_calls before they leave the process, plus strip orphan tool results whose tool_call_id no longer has a matching assistant tool_call. Dropping (rather than repairing to {}) is intentional — a {}-arg patch call would execute a garbage edit; dropping it lets the model re-attempt on the next turn.
A PR implementing this is attached.
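For readers without the PR at hand, here is a minimal sketch of the drop-and-strip idea. It is not the PR's code. It assumes OpenAI-style message dicts (role, tool_calls[*].function.arguments, tool_call_id), and the names sanitize_outbound_messages and _args_are_valid are hypothetical.

```python
import json


def _args_are_valid(arguments):
    """True if arguments is a dict, or a string that parses to a JSON object."""
    if isinstance(arguments, dict):
        return True
    if not isinstance(arguments, str):
        return False
    try:
        return isinstance(json.loads(arguments), dict)
    except (json.JSONDecodeError, ValueError):
        return False


def sanitize_outbound_messages(messages):
    """Return a new list with malformed tool calls and orphan tool results dropped.

    The input list and its message dicts are left unmodified.
    """
    kept_ids = set()  # ids of tool calls that survive sanitization
    cleaned = []
    for msg in messages:
        role = msg.get("role")
        if role == "assistant" and msg.get("tool_calls"):
            good = [
                tc for tc in msg["tool_calls"]
                if _args_are_valid(tc.get("function", {}).get("arguments"))
            ]
            kept_ids.update(tc["id"] for tc in good if "id" in tc)
            msg = dict(msg)  # shallow copy so the caller's history is untouched
            if good:
                msg["tool_calls"] = good
            else:
                msg.pop("tool_calls")
                if not msg.get("content"):
                    continue  # nothing left worth sending
            cleaned.append(msg)
        elif role == "tool":
            # Keep a tool result only if its originating call survived.
            if msg.get("tool_call_id") in kept_ids:
                cleaned.append(msg)
        else:
            cleaned.append(msg)
    return cleaned
```

In this sketch an empty arguments string counts as malformed; whether that is right for no-argument tools is a policy choice the real fix would have to make. The function sanitizes only the outbound copy, so the stored history is left as-is.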
Why transport-level and not only adapter-level
The adapter-level guard only protects the direct-Anthropic path. Proxies that speak OpenAI chat-completions to us (LiteLLM, OpenRouter with Anthropic routing) never hit the adapter — they hit ChatCompletionsTransport and the malformed arguments pass straight through to a provider that will reject them. Putting the guard in the transport ensures coverage regardless of downstream provider shape.
Environment
- Hermes Agent main branch (commit fa8f0c6f, v0.8.0+)
- LiteLLM proxy → Anthropic (observed); direct Anthropic would hit the same path via transport-level convert
- Python 3.11.15