Skip to content

fix(compressor): keep truncated tool_call arguments as valid JSON#11617

Closed
handsdiff wants to merge 1 commit into
NousResearch:mainfrom
handsdiff:fix/compressor-tool-args-valid-json
Closed

fix(compressor): keep truncated tool_call arguments as valid JSON#11617
handsdiff wants to merge 1 commit into
NousResearch:mainfrom
handsdiff:fix/compressor-tool-args-valid-json

Conversation

@handsdiff

Copy link
Copy Markdown
Contributor

Summary

agent/context_compressor.py Pass 3 of _prune_old_tool_results truncates large tool_call arguments strings by byte-slicing to 200 chars and appending the literal sentinel ...[truncated]. The result is invalid JSON — mid-value, unterminated string.

Some providers (notably Anthropic via LiteLLM) re-parse function.arguments when rebuilding tool_use blocks. Once a session is compressed with any tool_call whose original args exceeded 500 chars (routine for tools like patch), the next API call fails with:

HTTP 400: litellm.BadRequestError: AnthropicException -
Failed to parse tool call arguments for tool 'patch' (Anthropic tool invoke).
Error: Unterminated string starting at: line 1 column 35 (char 34).

All 3 retries send the same corrupted messages array so they fail identically, and every subsequent user turn replays the poisoned history. The session is permanently stuck and the user sees only a canned error fallback with no way to recover (the corruption is in server-side state).

Reproduction

Any session that compresses while carrying a >500-char assistant tool_call will produce invalid-JSON arguments. On the next turn, the provider adapter (e.g. litellm's Anthropic translator) cannot re-parse the tool_call and rejects the request.

Fix

Replace the byte slice with a compact JSON-sentinel object that preserves provenance (original length + 200-char preview) while staying parseable:

{"_truncated": true, "_original_length": 1247, "_preview": "..."}

Test plan

  • New unit test test_pruned_tool_call_args_remain_valid_json asserting pruned arguments parse via json.loads and carry the sentinel shape
  • All existing TestTokenBudgetTailProtection tests still pass

Observed in production

Caught after a user's agent became unresponsive. A session had 11 prior patch tool_calls poisoned by this truncation after compression. Every subsequent inbound message triggered a new 3-retry failure cycle — user received 3 cryptic LiteLLM 400 error strings back-to-back.

@handsdiff handsdiff force-pushed the fix/compressor-tool-args-valid-json branch from cdeaa3c to 1c7cd8b Compare April 17, 2026 14:20
@handsdiff handsdiff force-pushed the fix/compressor-tool-args-valid-json branch from 1c7cd8b to 45219d9 Compare April 18, 2026 16:12
When _prune_old_tool_results truncates a large tool_call arguments
string, it was byte-slicing to 200 chars and appending the literal
sentinel "...[truncated]". The resulting string is not valid JSON
(unterminated mid-value), and some providers — notably Anthropic via
LiteLLM — re-parse function.arguments when rebuilding tool_use
blocks. The next API call fails with HTTP 400 ("Unterminated string
starting at: line 1 column 35"), and every subsequent turn replays
the same corrupted history, so the session is permanently stuck.

Observed in the wild: a session that compressed with 11 prior patch
tool_calls (each >500 chars) then returned HTTP 400 on every inbound
message. The 3 retries are identical payloads, so they all fail the
same way. The user sees only a canned error fallback.

Replace the byte slice with a compact JSON sentinel that preserves
provenance (original length + 200-char preview) while staying
parseable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@teknium1

Copy link
Copy Markdown
Contributor

Closing as duplicate of PR #12259 (merged). You independently diagnosed the same bug on Apr 17 within hours of #11788 and #11821 — thanks for the solid root-cause analysis and the smallest-diff approach. We went with #11788's structural-shrink approach because it preserves arg shape (paths, ints, lists stay intact) instead of replacing with a sentinel object, but your diagnosis was spot-on. Credited in the salvage PR body.

@handsdiff

Copy link
Copy Markdown
Contributor Author

Confirming this is superseded by upstream commit 3128d9fc ("fix(context_compressor): keep tool-call arguments JSON valid when shrinking").

Upstream's fix introduces a _truncate_tool_call_args_json helper that shrinks long string values inside the JSON blob while preserving structure and keys — strictly better than this PR's sentinel-object replacement. Rebasing fix/compressor-tool-args-valid-json on current upstream/main leaves zero commits ahead. Updating our fork accordingly.

handsdiff added a commit to handsdiff/hermes-agent that referenced this pull request Apr 24, 2026
…branch-off-upstream

- Adds NousResearch#11617 (compressor fix) and the two pending MCP PRs to the PR table.
- Documents the root cause that bit three PR branches: cutting feature
  branches from fork main (an octopus merge) instead of upstream/main.
- Extends the rebase workflow to the three new branches.
- Documents the integrations tool as a fork-only change (exe.dev-specific).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
handsdiff added a commit to handsdiff/hermes-agent that referenced this pull request Apr 24, 2026
…flow

PR NousResearch#11617 (compressor tool-call args valid JSON) was closed Apr 18;
upstream commit 3128d9f landed a strictly-better fix via
_truncate_tool_call_args_json, which shrinks string values inside the
JSON blob instead of replacing the whole payload with a sentinel.
Rebasing fix/compressor-tool-args-valid-json on upstream/main leaves
zero commits ahead.

Move to the closed/superseded section, remove from the rebase loop
and merge sequence so future rebuilds don't try to merge an empty
branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants