fix(compression): sanitize malformed tool messages + auto-recover on API 400#18650
Closed
franksong2702 wants to merge 2 commits into
Closed
fix(compression): sanitize malformed tool messages + auto-recover on API 400#18650franksong2702 wants to merge 2 commits into
franksong2702 wants to merge 2 commits into
Conversation
…all_id Sanitizer handled orphaned tool results (no matching tool_call) and missing tool results (tool_call without result), but tool messages with tool_call_id=None, '', or missing entirely slipped through both checks and reached the API, causing HTTP 400 'tool_call_id is not set' errors. This manifests on providers like MiMo after multi-pass compression on long sessions (155+ messages), where message copying or provider formatting can produce tool messages without a valid tool_call_id. Adds Step 0 to _sanitize_tool_pairs: filter out tool-role messages where tool_call_id is falsy before the existing orphan checks run. Includes 3 new unit tests covering None, empty string, and missing field cases.
Second half of the compression corruption fix. When the API rejects a request with 400 containing 'tool_call_id' or 'function call output' patterns, the error classifier now recognizes this as tool_message_malformed (retryable + should_sanitize_tools) instead of a generic non-retryable format_error. The retry loop runs _sanitize_tool_messages_for_retry() which: 1. Removes tool messages with missing/null/empty tool_call_id 2. Removes orphaned tool results (valid id but no matching tool_call) 3. Retries the API call with cleaned messages This recovers sessions that got permanently stuck after a bad compression pass produced malformed tool messages. Previously such sessions required manual /new — now they self-heal.
Cyrene963
pushed a commit
to Cyrene963/hermes-agent
that referenced
this pull request
May 3, 2026
Community PRs applied: - NousResearch#18596: Enable secret redaction by default (SECURITY) - NousResearch#18650: Sanitize malformed tool messages + auto-recover on API 400 - NousResearch#18607: Emergency compression before max_iterations exhaustion - NousResearch#18603: Compression fallback to main model on 413 rate limit - NousResearch#18638: Pass threshold_percent on model switch - NousResearch#18663: Strip extra_content from tool_calls for strict APIs - NousResearch#18618: Forward explicit_api_key to OpenRouter - NousResearch#18632: Show cache tokens in /insights breakdown - NousResearch#18614: Add idempotency guard for patch duplicate loops - NousResearch#18600: Raise ValueError when HERMES_HOME unset in profile mode - NousResearch#18616: Allow ZWJ emoji in context files - NousResearch#18582: Reload .env on /restart - NousResearch#18547: Stabilize system prompt prefix for KV cache reuse - NousResearch#18692: Strip FTS5 operators from session search truncation terms Fix: Add order_by_last_active=True to list_sessions_rich call (pre-existing commit 142b4bf code sync)
1 task
Contributor
|
This looks implemented on current Evidence:
The exact enum/ |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Multi-pass context compression on long sessions (155+ messages) can produce tool-role messages with
tool_call_idset toNone, empty string, or missing entirely. The existing_sanitize_tool_pairsmethod handles two failure modes but misses this third one — these malformed messages slip through and reach the API, causing HTTP 400:Once a session gets corrupted this way, it's permanently stuck — every subsequent API call fails with the same 400, and since the error classifier treats it as a generic
format_error(non-retryable), it falls back to the same broken messages. The user seesPrimary model failed — switching to fallbackrepeatedly until they manually run/new.Reproduction: Long WeChat session (155 msgs, ~33K tokens) → 5+ compression passes → MiMo API rejects every subsequent request with 400. Workaround is
/newto start a fresh session.Fixes #16472 — Tool call ID not invalidated after 400 error (same permanent-failure loop from malformed tool messages).
Fixes #4662 — Malformed persisted tool calls poisoning sessions (our Part 1 prevents corruption at compression time, Part 2 auto-recovers when corruption is detected via API 400).
Also mitigates:
Root Cause
Two-part chain:
Part 1 — Compression produces malformed messages:
_sanitize_tool_pairsonly collectstool_call_idvalues that are truthy. Tool messages with falsytool_call_idnever enter the orphan-detection set, so the filter never matches them.Part 2 — Error classifier doesn't recognize tool format errors:
The
_classify_400function only triggers compression forcontext_length_exceededpatterns. A 400 liketool_call_id is not setfalls through toformat_error→ non-retryable → fallback → same broken messages → still 400 → permanent loop.Fix
Part 1: Prevent corruption (context_compressor.py)
Added Step 0 to
_sanitize_tool_pairs: filter out tool-role messages wheretool_call_idis falsy before the existing orphan checks run.Part 2: Auto-recover from corruption (error_classifier.py + run_agent.py)
FailoverReason.tool_message_malformedenum valueshould_sanitize_toolsflag onClassifiedError_TOOL_CALL_MALFORMED_PATTERNSmatchingtool_call_id,function call output, etc._sanitize_tool_messages_for_retry()method on AIAgent that removes malformed/orphaned tool messages in-placeclassified.should_sanitize_toolsand auto-recovers instead of falling backTests
6 new tests across 2 test files:
test_sanitizer_removes_tool_messages_with_none_tool_call_idtest_sanitizer_removes_tool_messages_with_empty_tool_call_idtest_sanitizer_removes_all_orphaned_when_no_assistant_callstest_400_tool_call_id_malformed(error classifier)test_400_no_tool_call_found_for_output(error classifier)test_enum_members_exist(updated for new enum value)All 125+ tests pass.