fix(agents): skip error/aborted assistant messages in transcript repair #4844

Closed
lailoo wants to merge 1 commit into openclaw:main from lailoo:fix/session-transcript-repair-error-handling

Conversation

@lailoo commented Jan 30, 2026

Summary

When an assistant response is interrupted mid-tool-call (stopReason: error or aborted), repairToolUseResultPairing() was still extracting tool calls and inserting synthetic tool_result entries. This caused orphan tool_result blocks after transformMessages() dropped the errored assistant, leading to permanent session corruption with unexpected tool_use_id API errors.

Changes

  • Modified repairToolUseResultPairing() in src/agents/session-transcript-repair.ts to skip tool call extraction for assistant messages where stopReason === "error" || stopReason === "aborted"
  • Added 4 new unit tests covering error/aborted scenarios
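
The skip described in the list above can be sketched roughly as follows. This is a minimal illustration, not the actual code in session-transcript-repair.ts: the Message shape, the "toolUse" stop reason, and the helper name toolCallIdsNeedingResults are assumptions made for the example.

```typescript
// Hypothetical, simplified message shapes; the real transcript types may differ.
type ToolCall = { type: "toolCall"; id: string; name: string };
type TextBlock = { type: "text"; text: string };
type Message =
  | { role: "user"; content: string }
  | { role: "assistant"; stopReason?: string; content: Array<ToolCall | TextBlock> }
  | { role: "toolResult"; toolCallId: string; content: string };

// Hypothetical helper: returns the tool call ids that should be paired with a
// toolResult. Errored or aborted assistant turns contribute none, so no
// synthetic toolResult is ever created for them.
function toolCallIdsNeedingResults(msg: Message): string[] {
  if (msg.role !== "assistant") return [];
  if (msg.stopReason === "error" || msg.stopReason === "aborted") return [];
  return msg.content
    .filter((block): block is ToolCall => block.type === "toolCall")
    .map((block) => block.id);
}

const errored: Message = {
  role: "assistant",
  stopReason: "error",
  content: [{ type: "toolCall", id: "call_1", name: "read" }],
};
const aborted: Message = {
  role: "assistant",
  stopReason: "aborted",
  content: [{ type: "toolCall", id: "call_3", name: "read" }],
};
const completed: Message = {
  role: "assistant",
  stopReason: "toolUse", // assumed value for a normal tool-using turn
  content: [{ type: "toolCall", id: "call_2", name: "read" }],
};

const erroredIds = toolCallIdsNeedingResults(errored);     // []
const abortedIds = toolCallIdsNeedingResults(aborted);     // []
const completedIds = toolCallIdsNeedingResults(completed); // ["call_2"]
```

Note that each stopReason value is compared separately; an expression such as stopReason === "error" || "aborted" would always be truthy in TypeScript.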

Root Cause

Two safety mechanisms interacted badly:

  1. session-tool-result-guard inserts a synthetic toolResult for each orphaned tool call
  2. transformMessages() drops error/aborted assistant messages but not their corresponding toolResult entries

This created orphaned tool_result entries that referenced non-existent tool_use blocks.
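
The interaction can be reproduced with a small simulation. This is a sketch under stated assumptions: insertSyntheticResults and dropErroredAssistants are stand-ins written for this example, not the real session-tool-result-guard or transformMessages() implementations, and the Msg shape is simplified.

```typescript
// Simplified, hypothetical transcript entries.
type Msg =
  | { role: "assistant"; stopReason?: string; toolCallIds: string[] }
  | { role: "toolResult"; toolCallId: string };

// Stand-in for the guard: add a synthetic toolResult after any assistant
// whose tool calls have no result anywhere in the transcript.
function insertSyntheticResults(msgs: Msg[]): Msg[] {
  const answered = new Set<string>();
  for (const m of msgs) {
    if (m.role === "toolResult") answered.add(m.toolCallId);
  }
  const out: Msg[] = [];
  for (const m of msgs) {
    out.push(m);
    if (m.role !== "assistant") continue;
    for (const id of m.toolCallIds) {
      if (!answered.has(id)) out.push({ role: "toolResult", toolCallId: id });
    }
  }
  return out;
}

// Stand-in for transformMessages(): drop errored/aborted assistants,
// leaving every toolResult in place.
function dropErroredAssistants(msgs: Msg[]): Msg[] {
  return msgs.filter(
    (m) => !(m.role === "assistant" && (m.stopReason === "error" || m.stopReason === "aborted")),
  );
}

// Collect toolResult ids that no remaining assistant message refers to.
function findOrphans(msgs: Msg[]): string[] {
  const known = new Set<string>();
  for (const m of msgs) {
    if (m.role === "assistant") m.toolCallIds.forEach((id) => known.add(id));
  }
  const orphans: string[] = [];
  for (const m of msgs) {
    if (m.role === "toolResult" && !known.has(m.toolCallId)) orphans.push(m.toolCallId);
  }
  return orphans;
}

const transcript: Msg[] = [{ role: "assistant", stopReason: "error", toolCallIds: ["call_1"] }];
const orphans = findOrphans(dropErroredAssistants(insertSyntheticResults(transcript)));
// orphans is ["call_1"]: the synthetic result outlives the assistant it was paired with.
```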

Testing

  • All 8 tests in session-transcript-repair.test.ts pass
  • Related tests in session-tool-result-guard.test.ts and pi-embedded-runner.guard.test.ts pass

Fixes

Greptile Overview

Greptile Summary

This PR updates repairToolUseResultPairing() to avoid extracting tool calls (and thus avoid inserting synthetic toolResult entries) from assistant messages with stopReason: "error" or "aborted". This prevents creating orphan toolResult blocks later when other sanitization (transformMessages()) drops those assistant turns but leaves tool results behind.

Unit tests were added to ensure error/aborted assistant messages do not trigger synthetic toolResult insertion, while normal tool-using assistants are still repaired as before.

Confidence Score: 4/5

  • This PR is likely safe to merge and directly addresses a real session-corruption failure mode.
  • The change is small, localized, and covered by new unit tests for the targeted stopReason cases. Main remaining uncertainty is behavioral: skipping errored/aborted assistants could drop legitimate tool results that followed those turns, which may or may not be desired depending on how transcripts are recorded in practice.
  • src/agents/session-transcript-repair.ts

When an assistant response is interrupted mid-tool-call (stopReason: error
or aborted), repairToolUseResultPairing() was still extracting tool calls
and inserting synthetic tool_result entries. This caused orphan tool_result
blocks after transformMessages() dropped the errored assistant, leading to
permanent session corruption with 'unexpected tool_use_id' API errors.

This fix skips tool call extraction for error/aborted assistant messages,
preventing the creation of orphan tool_result entries.

Fixes openclaw#4600, openclaw#4597, openclaw#4475, openclaw#4814, openclaw#4815

@greptile-apps bot left a comment

2 files reviewed, 2 comments

Comment on lines +120 to +123
// Skip error/aborted assistant messages - do not extract tool calls from them
// as they may contain incomplete/invalid tool_use blocks that would cause
// orphan tool_result entries after transformMessages() drops the errored assistant
const stopReason = (assistant as { stopReason?: string }).stopReason;
[P1] Skipping errored/aborted assistants can leave their existing tool results as orphans.

This change avoids creating synthetic toolResults for stopReason === "error" | "aborted", but if the transcript already contains real tool results for those tool calls (e.g., assistant streamed toolCall then later got marked error, yet the tool executed and logged a toolResult), the next loop iterations will hit the role !== "assistant" branch and drop those toolResults as “orphan”. That can discard valid tool outputs and potentially change downstream behavior (e.g., losing tool error context).

If that scenario is expected, it likely needs a test and/or a more nuanced rule (e.g., treat errored assistants with complete tool calls differently, or only skip when the assistant content is known-incomplete).
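
The scenario in this comment can be illustrated with a simplified repair loop. This is a sketch only, assuming a reduced Msg shape and a stand-in function repairSketch; it is not the real repairToolUseResultPairing() and does not reproduce its return value.

```typescript
// Simplified, hypothetical transcript entries.
type Msg =
  | { role: "user"; text: string }
  | { role: "assistant"; stopReason?: string; toolCallIds: string[] }
  | { role: "toolResult"; toolCallId: string; synthetic?: boolean };

// Stand-in repair loop: assistants claim the matching toolResults that follow
// them, unanswered calls get a synthetic result, and any toolResult that no
// assistant claimed is dropped as an orphan.
function repairSketch(msgs: Msg[]): { kept: Msg[]; droppedOrphans: string[] } {
  const kept: Msg[] = [];
  const droppedOrphans: string[] = [];
  let i = 0;
  while (i < msgs.length) {
    const m = msgs[i];
    if (m.role !== "assistant") {
      // A toolResult that reaches this branch was not claimed by an assistant.
      if (m.role === "toolResult") droppedOrphans.push(m.toolCallId);
      else kept.push(m);
      i++;
      continue;
    }
    kept.push(m);
    i++;
    // The rule under review: errored/aborted assistants claim nothing.
    if (m.stopReason === "error" || m.stopReason === "aborted") continue;
    const pending = new Set(m.toolCallIds);
    while (i < msgs.length) {
      const next = msgs[i];
      if (next.role !== "toolResult" || !pending.has(next.toolCallId)) break;
      kept.push(next);
      pending.delete(next.toolCallId);
      i++;
    }
    for (const id of m.toolCallIds) {
      if (pending.has(id)) kept.push({ role: "toolResult", toolCallId: id, synthetic: true });
    }
  }
  return { kept, droppedOrphans };
}

const result = repairSketch([
  { role: "assistant", stopReason: "error", toolCallIds: ["call_1"] },
  { role: "toolResult", toolCallId: "call_1" }, // real output logged by the tool
  { role: "user", text: "retry" },
]);
// result.droppedOrphans is ["call_1"]: the real tool output is discarded.
```

Under these assumptions the real result for call_1 is dropped rather than kept, which is the behavior a dedicated test would need to pin down one way or the other.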

Path: src/agents/session-transcript-repair.ts
Line: 120:123


Comment on lines +113 to +118
it("skips tool call extraction for error assistant messages", () => {
  const input = [
    {
      role: "assistant",
      stopReason: "error",
      errorMessage: "Request terminated",
[P2] New tests only assert “no synthetic toolResult inserted”, but don’t cover the case where a real toolResult already exists.

Given the production bug involves interactions with transformMessages() dropping errored assistants, it would be useful to add a test where an assistant(stopReason:error) with a tool call is followed by an actual matching toolResult entry, and verify the intended behavior (drop it? keep it? move it?). Without that, the behavior change mentioned in session-transcript-repair.ts can regress silently.

Path: src/agents/session-transcript-repair.test.ts
Line: 113:118


@Glucksberg

👋 This appears to duplicate PR #4476, which implements the same fix. Consider consolidating or closing one to avoid duplicate effort.

@steipete

AI-assisted stale triage closure (2026-02-24).

Closing this PR because the fix is already in main.

Why:

This is AI-closed housekeeping, not a rejection of the issue.

If you still see orphan tool-result behavior after error/abort paths, open a fresh focused PR from latest main with a repro.

@steipete steipete closed this Feb 24, 2026
Labels

agents Agent runtime and tooling

3 participants