fix: strip incomplete tool_use blocks from errored/aborted assistants #11430
charlesnchr wants to merge 2 commits into openclaw:main
| if (stopReason === "error" || stopReason === "aborted") { | ||
| out.push(msg); | ||
| const content = Array.isArray(assistant.content) ? assistant.content : []; | ||
| const stripped = content.filter((block) => { | ||
| if (!block || typeof block !== "object") { | ||
| return true; | ||
| } | ||
| const type = (block as { type?: unknown }).type; | ||
| return typeof type !== "string" || !TOOL_CALL_TYPES.has(type); | ||
| }); | ||
| if (stripped.length !== content.length) { | ||
| changed = true; | ||
| out.push({ ...assistant, content: stripped } as typeof msg); | ||
| } else { | ||
| out.push(msg); |
Stops dropping trailing tool_results
When `stopReason` is `"error"` or `"aborted"`, this branch hits `continue` after pushing the (possibly stripped) assistant message. Any subsequent `toolResult` entries, including ones that match the stripped tool calls, are then handled by the top-level `role !== "assistant"` branch and dropped as “orphans”. The previous `continue` path dropped them too, so this only matters if the intent was to keep *matching* `toolResult`s that follow an errored/aborted assistant; as written, they are always removed.
If the desired behavior is to preserve a tool result that actually completed before the abort/error, this branch needs to consume the following `toolResult`s itself (similar to the normal span scan) rather than leaving them to be dropped globally.
Path: src/agents/session-transcript-repair.ts
Line: 223:236
Correct observation. Dropping the orphan tool_results is intentional here — since the tool_use blocks are stripped from the assistant, there is nothing for those tool_results to reference. Keeping them would cause the same API validation error.
Added a comment in the updated push to make this explicit: "Any following tool_results become orphans and are dropped by the top-level orphan check above."
Also simplified the implementation to reuse the existing isToolCallBlock() helper and consolidated the tests.
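The stripping step described in this reply can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `AssistantMessage` and `ContentBlock` shapes, the members of `TOOL_CALL_TYPES`, the body of `isToolCallBlock()`, and the name `stripIncompleteToolCalls` are all assumptions made for the example.

```typescript
// Hypothetical block and message shapes; the real types in the repo may differ.
type ContentBlock = { type?: unknown; [key: string]: unknown };

interface AssistantMessage {
  role: "assistant";
  stopReason?: string;
  content: unknown;
}

// Assumed members: the PR text names tool_use and toolCall; the full set is not shown.
const TOOL_CALL_TYPES: ReadonlySet<string> = new Set(["tool_use", "toolCall"]);

// Stand-in for the repo's isToolCallBlock(); its real signature is not shown in the PR.
function isToolCallBlock(block: unknown): boolean {
  if (!block || typeof block !== "object") {
    return false;
  }
  const type = (block as ContentBlock).type;
  return typeof type === "string" && TOOL_CALL_TYPES.has(type);
}

// Hypothetical helper. Returns the same object when nothing was stripped,
// so a caller can detect changes by identity.
function stripIncompleteToolCalls(msg: AssistantMessage): AssistantMessage {
  if (msg.stopReason !== "error" && msg.stopReason !== "aborted") {
    return msg;
  }
  const content: unknown[] = Array.isArray(msg.content) ? msg.content : [];
  const stripped = content.filter((block) => !isToolCallBlock(block));
  return stripped.length === content.length ? msg : { ...msg, content: stripped };
}

// An aborted assistant turn whose tool call was cut off mid-stream.
const aborted: AssistantMessage = {
  role: "assistant",
  stopReason: "aborted",
  content: [
    { type: "text", text: "Let me check that file." },
    { type: "toolCall", id: "call_1", partialJson: '{"path": "/tm' },
  ],
};

const repaired = stripIncompleteToolCalls(aborted);
console.log(JSON.stringify(repaired.content));
// prints [{"type":"text","text":"Let me check that file."}]
```

Only the text block survives, so no later message can reference `call_1`. Messages with any other stop reason pass through unchanged.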
7719df9 to 4cda587
…messages
When an assistant message has stopReason 'error' or 'aborted', its tool_use blocks may be incomplete (partialJson). Previously, the repair logic skipped these messages entirely, leaving the tool_use blocks in place without matching tool_results. This caused the Anthropic API to reject the transcript with 'unexpected tool_use_id found in tool_result blocks', deadlocking the session permanently since the error persisted across restarts.
Now we strip the tool_use/toolCall content blocks from errored/aborted assistant messages instead of skipping them, preserving any text content while removing the source of the API validation error.
4cda587 to 385d015
Summary
repairToolUseResultPairing() skipped errored/aborted assistant messages without stripping their tool_use blocks. This left unmatched tool calls in the transcript that the API permanently rejected. The user-facing symptom was the bot replying with:
The underlying API error:
The error persisted across restarts because the session file was never repaired on disk, deadlocking the session.
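To make the failure mode concrete, here is a small sketch of a pairing check that flags the kind of mismatch described above. It is not the API's validation logic and not code from the repo: the `Message` and `Block` shapes, the `toolCallId` field, the members of `TOOL_CALL_TYPES`, and the function name `findUnpairedToolIds` are assumptions for illustration.

```typescript
// Hypothetical, simplified transcript shape for illustration only.
type Block = { type: string; id?: string; text?: string };
type Message =
  | { role: "assistant"; content: Block[] }
  | { role: "toolResult"; toolCallId: string }
  | { role: "user"; content: string };

// Assumed tool call block types; the repo's full set is not shown in the PR.
const TOOL_CALL_TYPES = new Set<string>(["tool_use", "toolCall"]);

interface PairingReport {
  callsWithoutResult: string[];
  resultsWithoutCall: string[];
}

// Collects tool call ids and tool result ids, then reports the ones
// that have no counterpart anywhere in the transcript.
function findUnpairedToolIds(messages: Message[]): PairingReport {
  const callIds = new Set<string>();
  const resultIds = new Set<string>();
  for (const msg of messages) {
    if (msg.role === "assistant") {
      for (const block of msg.content) {
        if (TOOL_CALL_TYPES.has(block.type) && block.id !== undefined) {
          callIds.add(block.id);
        }
      }
    } else if (msg.role === "toolResult") {
      resultIds.add(msg.toolCallId);
    }
  }
  return {
    callsWithoutResult: Array.from(callIds).filter((id) => !resultIds.has(id)),
    resultsWithoutCall: Array.from(resultIds).filter((id) => !callIds.has(id)),
  };
}

// A transcript where the assistant was aborted mid tool call:
// the call is recorded, but no result ever follows it.
const transcript: Message[] = [
  { role: "user", content: "Read the config file." },
  {
    role: "assistant",
    content: [
      { type: "text", text: "Reading it now." },
      { type: "toolCall", id: "call_1" },
    ],
  },
  { role: "user", content: "Are you still there?" },
];

const report = findUnpairedToolIds(transcript);
console.log(report.callsWithoutResult); // prints [ 'call_1' ]
```

Because the session file keeps the unmatched call, a check like this would keep reporting `call_1` on every reload until the transcript itself is repaired.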
Root Cause
The skip added in #4597 correctly avoided creating synthetic tool_results for incomplete tool calls, but left the incomplete tool_use blocks in the assistant message. Now we strip them instead, reusing the existing isToolCallBlock() helper. Non-tool content is preserved; following orphan tool_results are still dropped.
Tests
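A check for the last point, that tool results following a stripped assistant message are dropped as orphans, might look like the sketch below. It is a simplified stand-in rather than the PR's actual tests or repair code: the `Message` shape, the `toolCallId` and `output` fields, the members of `TOOL_CALL_TYPES`, and the function `dropOrphanToolResults` are hypothetical, and it assumes a tool result is only valid directly after the assistant message that issued the call.

```typescript
// Hypothetical, simplified transcript shape for illustration only.
type Block = { type: string; id?: string; text?: string };
type Message =
  | { role: "assistant"; content: Block[] }
  | { role: "toolResult"; toolCallId: string; output: string }
  | { role: "user"; content: string };

// Assumed tool call block types; the repo's full set is not shown in the PR.
const TOOL_CALL_TYPES = new Set<string>(["tool_use", "toolCall"]);

// Keeps a tool result only if the most recent assistant message issued
// a tool call with the same id; everything else passes through.
function dropOrphanToolResults(messages: Message[]): Message[] {
  const out: Message[] = [];
  let openCallIds = new Set<string>();
  for (const msg of messages) {
    if (msg.role === "assistant") {
      openCallIds = new Set<string>();
      for (const block of msg.content) {
        if (TOOL_CALL_TYPES.has(block.type) && block.id !== undefined) {
          openCallIds.add(block.id);
        }
      }
      out.push(msg);
    } else if (msg.role === "toolResult") {
      if (openCallIds.has(msg.toolCallId)) {
        out.push(msg);
      }
      // Otherwise the result is an orphan and is dropped.
    } else {
      // A user message closes the current tool call span (an assumption of this sketch).
      openCallIds = new Set<string>();
      out.push(msg);
    }
  }
  return out;
}

// The assistant's tool call has already been stripped, so the
// result for call_1 no longer has anything to reference.
const afterStrip: Message[] = [
  { role: "user", content: "Read the config file." },
  { role: "assistant", content: [{ type: "text", text: "Reading it now." }] },
  { role: "toolResult", toolCallId: "call_1", output: "partial output" },
  { role: "user", content: "Try again." },
];

const cleaned = dropOrphanToolResults(afterStrip);
console.log(cleaned.map((m) => m.role).join(","));
// prints user,assistant,user
```

The text-only assistant message and both user messages survive, while the result for the stripped call is removed.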
lobster-biscuit