fix(session): strip malformed tool_use blocks to prevent session corruption #5557
NSEvent wants to merge 2 commits into openclaw:main
Why this check is failing:
openclaw@2026.1.30 format /home/runner/_work/openclaw/openclaw
Checking formatting... docs/automation/cron-jobs.md (94ms)
Format issues found in above 1 files. Run without
a58932e to
0e2ac92
…uption

When tool calls are interrupted (by error, timeout, content filtering, or process termination), sessions can become permanently corrupted. Every subsequent API request fails with errors like:
- "unexpected tool_use_id found in tool_result blocks"
- "tool result's tool id not found (2013)"

Root cause: extractToolCallsFromAssistant() skips malformed tool_use blocks but leaves them in the message content. The blocks remain in the transcript, causing API rejections.

Fix: Strip malformed tool_use blocks (missing id, missing name, or with partialJson field) BEFORE the pairing repair runs. This prevents creating synthetic results for invalid blocks and allows sessions to auto-recover.

Fixes openclaw#5497, openclaw#5481, openclaw#5430, openclaw#5518

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
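The fix described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function names isValidToolUseBlock() and stripMalformedToolUseBlocks() and the counter droppedMalformedToolUseCount come from the PR description, but the ContentBlock and AssistantMessage shapes, the accepted block type strings, and the return shape are assumptions made for the example.

```typescript
// Hypothetical message shapes for illustration; the real openclaw types may differ.
type ContentBlock = { type: string; [key: string]: unknown };

interface AssistantMessage {
  role: "assistant";
  content: ContentBlock[];
}

// Assumption: tool calls may be tagged either "tool_use" or "toolCall".
const TOOL_BLOCK_TYPES = new Set(["tool_use", "toolCall"]);

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

// A tool block counts as valid only if it has an id, a name,
// and no leftover partialJson fragment from an interrupted stream.
function isValidToolUseBlock(block: ContentBlock): boolean {
  return (
    isNonEmptyString(block.id) &&
    isNonEmptyString(block.name) &&
    !("partialJson" in block)
  );
}

// Removes malformed tool blocks and reports how many were dropped.
// Non-tool blocks (text and so on) are always kept.
function stripMalformedToolUseBlocks(message: AssistantMessage): {
  message: AssistantMessage;
  droppedMalformedToolUseCount: number;
} {
  const kept = message.content.filter(
    (block) => !TOOL_BLOCK_TYPES.has(block.type) || isValidToolUseBlock(block),
  );
  return {
    message: { ...message, content: kept },
    droppedMalformedToolUseCount: message.content.length - kept.length,
  };
}

// Example: one complete call, one interrupted mid-stream, one missing its id.
const interrupted: AssistantMessage = {
  role: "assistant",
  content: [
    { type: "text", text: "Running the tool now." },
    { type: "toolCall", id: "call_1", name: "read", arguments: {} },
    { type: "toolCall", id: "call_2", name: "write", partialJson: '{"pa' },
    { type: "toolCall", name: "exec", arguments: {} },
  ],
};

const stripped = stripMalformedToolUseBlocks(interrupted);
```

Running the strip step before pairing repair means the repair logic only ever sees tool calls that can legitimately be matched to a result, so it has no reason to synthesize results for blocks the API would reject anyway.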
… assistant messages
fc3255e to
7d53df6
Not sure how that got changed; fixed now.
Closing to resubmit with improvements addressing edge-case coverage.
🔍 Overlapping PRs Detected This PR appears to overlap with 22 other open PRs all addressing tool call/tool_result pairing and sanitization issues:
Similarity scores computed using Voyage AI embeddings (cosine similarity) on standardized PR summaries.
Summary
- Malformed tool_use blocks (missing id, missing name, or with a partialJson field) are now detected and removed before pairing repair runs

Problem
When tool calls are interrupted (by error, timeout, content filtering, or process termination), sessions become permanently corrupted. Every subsequent API request fails with:
- unexpected tool_use_id found in tool_result blocks
- tool result's tool id not found (2013)

Root cause: extractToolCallsFromAssistant() skips malformed tool_use blocks (lines 22-23 in the old code) but leaves them in the message content. The blocks remain in the transcript, causing API rejections.

Solution
- isValidToolUseBlock() to detect malformed blocks
- stripMalformedToolUseBlocks() to remove them from assistant messages
- droppedMalformedToolUseCount in the repair report for observability

Test plan

Future suggestions
These are not blockers but could improve robustness:
- Consolidate validation logic - extractToolCallsFromAssistant() has similar but not identical validation. Could extract shared validation to reduce duplication.
- Provider-specific partial data fields - Currently only checks for partialJson. Other providers might use different field names for partial/streaming data.
- Consider logging in stripMalformedToolUseBlocks directly - Currently only logs in google.ts. Other call sites via sanitizeToolUseResultPairing() don't get the warning.

Fixes #5497, #5481, #5430, #5518

🤖 Generated with Claude Code
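The second and third future suggestions could look something like the sketch below: one shared defect check with a configurable list of partial-data field names, and a warning hook inside the strip step so every call site gets the log line. Everything here is hypothetical. The names Block, ToolBlockValidationOptions, findToolBlockDefect, and stripWithLogging are invented for illustration, and inputDelta in the usage example is a made-up stand-in for another provider's field; only partialJson comes from the PR.

```typescript
// Hypothetical sketch of the suggestions above; none of these names exist in the PR.
type Block = { type: string; [key: string]: unknown };

interface ToolBlockValidationOptions {
  // Field names that mark an incomplete streamed block.
  // Only "partialJson" is taken from the PR; others would be provider-specific.
  partialFields: readonly string[];
  // Optional sink so that every call site gets the warning, not just one runner.
  warn?: (message: string) => void;
}

const DEFAULT_OPTIONS: ToolBlockValidationOptions = {
  partialFields: ["partialJson"],
};

// Returns a short description of what is wrong with the block, or null if it is valid.
function findToolBlockDefect(
  block: Block,
  options: ToolBlockValidationOptions = DEFAULT_OPTIONS,
): string | null {
  if (typeof block.id !== "string" || block.id.length === 0) return "missing id";
  if (typeof block.name !== "string" || block.name.length === 0) return "missing name";
  const partial = options.partialFields.find((field) => field in block);
  return partial !== undefined ? `partial data field "${partial}"` : null;
}

// Drops malformed tool blocks and reports each drop through options.warn, if provided.
function stripWithLogging(
  blocks: Block[],
  options: ToolBlockValidationOptions = DEFAULT_OPTIONS,
): Block[] {
  return blocks.filter((block) => {
    if (block.type !== "toolCall" && block.type !== "tool_use") return true;
    const defect = findToolBlockDefect(block, options);
    if (defect !== null) options.warn?.(`dropping malformed tool block: ${defect}`);
    return defect === null;
  });
}

// Usage: treat a hypothetical "inputDelta" field as partial data too, and collect warnings.
const warnings: string[] = [];
const cleaned = stripWithLogging(
  [
    { type: "text", text: "hi" },
    { type: "toolCall", id: "a", name: "read", inputDelta: "{" },
    { type: "toolCall", id: "", name: "write" },
  ],
  {
    partialFields: ["partialJson", "inputDelta"],
    warn: (message) => warnings.push(message),
  },
);
```

Returning a defect description rather than a boolean lets the same function serve both the extraction path and the stripping path, which is one way the duplicated validation mentioned in the first suggestion might be consolidated.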
Greptile Overview
Greptile Summary
This PR improves session transcript repair by stripping malformed assistant toolCall blocks (e.g., missing/empty id or containing partialJson) before attempting to re-pair toolResult messages. The repair report now exposes droppedMalformedToolUseCount, and the Google embedded runner logs a warning when malformed tool blocks are removed. A new test suite reproduces the “interrupted tool call causes permanent session corruption” scenario and validates the new behavior.
Confidence Score: 4/5
partialJson/missing id).