fix(agents): detect partial tool execution before fallback retry#43507
fix(agents): detect partial tool execution before fallback retry#43507mitchmcalister wants to merge 2 commits intoopenclaw:mainfrom
Conversation
…pt replay When a model fallback retry fires after partial execution (tool calls completed before timeout/rate-limit), warn the fallback model about already-executed actions to prevent replaying non-idempotent operations like sending messages or running shell commands. - Add partialExecution field to FailoverError with sanitized tool names - Thread partial execution context through model-fallback → agent.ts - Prepend [System: ...] context block to retry prompt listing completed tools and warning against re-sending messages - Log partial execution metadata in fallback decision observations Closes openclaw#43482 Related: openclaw#43331 (security review that flagged this gap) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
🔒 Aisle Security AnalysisWe found 1 potential security issue(s) in this PR:
1. 🔵 Tool execution metadata disclosure to fallback LLM providers via injected system prompt
DescriptionThe fallback retry prompt now includes tool execution metadata (tool names + whether a messaging tool already sent messages) and injects it into the next model attempt’s prompt. Data flow introduced by this diff:
Why this is a problem:
Vulnerable code (prompt construction): const toolList = params.previousPartialExecution.toolNames.join(", ");
...
`It completed these tool calls: ${toolList}.${messagingWarning} ` +RecommendationMinimize and harden what crosses the provider boundary.
Example safer implementation: import { FailoverError } from "../agents/failover-error.js";
function resolveFallbackRetryPrompt(params: {
body: string;
isFallbackRetry: boolean;
previousPartialExecution?: { toolNames: string[]; didSendViaMessagingTool: boolean };
}): string {
if (!params.isFallbackRetry) return params.body;
if (!params.previousPartialExecution) {
return "Continue where you left off. The previous model attempt failed or timed out.";
}
const safeToolNames = FailoverError.sanitizeToolNames(params.previousPartialExecution.toolNames);
const toolCount = safeToolNames.length;
const messagingWarning = params.previousPartialExecution.didSendViaMessagingTool
? " Messages were already sent to the user — do NOT re-send them."
: "";
return [
`[System: The previous model attempt partially executed before failing. ` +
`It executed ${toolCount} tool call(s).${messagingWarning} ` +
`Do not repeat actions that have already been performed. ` +
`Review the conversation history and continue from where the previous attempt left off.]`,
params.body,
].join("\n\n");
}Optionally gate inclusion of tool names behind an explicit debug flag/config that is off by default. Analyzed PR: #43507 at commit Last updated on: 2026-03-11T23:32:54Z |
Greptile SummaryThis PR adds detection of partial tool execution before model fallback retries, threading sanitized tool names from Two logic issues were found:
Confidence Score: 3/5
Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/agents/model-fallback.ts
Line: 733
Comment:
**Partial execution context lost on multi-hop fallback failure**
`lastPartialExecution` is unconditionally overwritten here. If a secondary fallback attempt fails *without* executing any tools (e.g. it hits a rate limit immediately), `described.partialExecution` is `undefined`, which clobbers the context from the original attempt. Any further fallback candidates will then receive no `previousPartialExecution` — meaning they won't know that the primary model already sent a message or ran a shell command.
Example scenario: Attempt 1 runs `send_message` then times out → `lastPartialExecution = { toolNames: ["send_message"], ... }`. Attempt 2 hits a rate limit before executing anything → `lastPartialExecution = undefined`. Attempt 3 never learns about the message sent in Attempt 1 and may re-send it.
```suggestion
lastPartialExecution = described.partialExecution ?? lastPartialExecution;
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run.ts
Line: 1421-1430
Comment:
**Empty `toolNames` after sanitization produces a malformed system prompt**
`attempt.toolMetas.length > 0` only guarantees that there were tool calls before sanitization. If every tool name consists solely of characters stripped by `sanitizeToolNames` (e.g. only emoji, CJK, or punctuation), the resulting `toolNames` array will be `[]`. This gets threaded all the way to `resolveFallbackRetryPrompt` in `agent.ts`, where `[].join(", ")` produces an empty string, resulting in the nonsensical fallback prompt:
> *"It completed these tool calls: . Do not repeat actions that have already been performed."*
This could confuse the fallback model. The `partialExecution` object should only be created when the sanitized names are non-empty:
```
const sanitizedToolNames = FailoverError.sanitizeToolNames(
attempt.toolMetas.map((t) => t.toolName),
);
partialExecution:
sanitizedToolNames.length > 0
? {
hadToolExecution: true,
toolNames: sanitizedToolNames,
didSendViaMessagingTool: attempt.didSendViaMessagingTool,
}
: undefined,
```
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: 399026c |
|
|
||
| lastError = isKnownFailover ? normalized : err; | ||
| const described = describeFailoverError(normalized); | ||
| lastPartialExecution = described.partialExecution; |
There was a problem hiding this comment.
Partial execution context lost on multi-hop fallback failure
lastPartialExecution is unconditionally overwritten here. If a secondary fallback attempt fails without executing any tools (e.g. it hits a rate limit immediately), described.partialExecution is undefined, which clobbers the context from the original attempt. Any further fallback candidates will then receive no previousPartialExecution — meaning they won't know that the primary model already sent a message or ran a shell command.
Example scenario: Attempt 1 runs send_message then times out → lastPartialExecution = { toolNames: ["send_message"], ... }. Attempt 2 hits a rate limit before executing anything → lastPartialExecution = undefined. Attempt 3 never learns about the message sent in Attempt 1 and may re-send it.
| lastPartialExecution = described.partialExecution; | |
| lastPartialExecution = described.partialExecution ?? lastPartialExecution; |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/model-fallback.ts
Line: 733
Comment:
**Partial execution context lost on multi-hop fallback failure**
`lastPartialExecution` is unconditionally overwritten here. If a secondary fallback attempt fails *without* executing any tools (e.g. it hits a rate limit immediately), `described.partialExecution` is `undefined`, which clobbers the context from the original attempt. Any further fallback candidates will then receive no `previousPartialExecution` — meaning they won't know that the primary model already sent a message or ran a shell command.
Example scenario: Attempt 1 runs `send_message` then times out → `lastPartialExecution = { toolNames: ["send_message"], ... }`. Attempt 2 hits a rate limit before executing anything → `lastPartialExecution = undefined`. Attempt 3 never learns about the message sent in Attempt 1 and may re-send it.
```suggestion
lastPartialExecution = described.partialExecution ?? lastPartialExecution;
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in f36dc8e — lastPartialExecution = described.partialExecution ?? lastPartialExecution preserves earlier context when later attempts fail without tool execution. Added a multi-hop test covering this scenario.
| partialExecution: | ||
| attempt.toolMetas.length > 0 | ||
| ? { | ||
| hadToolExecution: true, | ||
| toolNames: FailoverError.sanitizeToolNames( | ||
| attempt.toolMetas.map((t) => t.toolName), | ||
| ), | ||
| didSendViaMessagingTool: attempt.didSendViaMessagingTool, | ||
| } | ||
| : undefined, |
There was a problem hiding this comment.
Empty toolNames after sanitization produces a malformed system prompt
attempt.toolMetas.length > 0 only guarantees that there were tool calls before sanitization. If every tool name consists solely of characters stripped by sanitizeToolNames (e.g. only emoji, CJK, or punctuation), the resulting toolNames array will be []. This gets threaded all the way to resolveFallbackRetryPrompt in agent.ts, where [].join(", ") produces an empty string, resulting in the nonsensical fallback prompt:
"It completed these tool calls: . Do not repeat actions that have already been performed."
This could confuse the fallback model. The partialExecution object should only be created when the sanitized names are non-empty:
const sanitizedToolNames = FailoverError.sanitizeToolNames(
attempt.toolMetas.map((t) => t.toolName),
);
partialExecution:
sanitizedToolNames.length > 0
? {
hadToolExecution: true,
toolNames: sanitizedToolNames,
didSendViaMessagingTool: attempt.didSendViaMessagingTool,
}
: undefined,
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run.ts
Line: 1421-1430
Comment:
**Empty `toolNames` after sanitization produces a malformed system prompt**
`attempt.toolMetas.length > 0` only guarantees that there were tool calls before sanitization. If every tool name consists solely of characters stripped by `sanitizeToolNames` (e.g. only emoji, CJK, or punctuation), the resulting `toolNames` array will be `[]`. This gets threaded all the way to `resolveFallbackRetryPrompt` in `agent.ts`, where `[].join(", ")` produces an empty string, resulting in the nonsensical fallback prompt:
> *"It completed these tool calls: . Do not repeat actions that have already been performed."*
This could confuse the fallback model. The `partialExecution` object should only be created when the sanitized names are non-empty:
```
const sanitizedToolNames = FailoverError.sanitizeToolNames(
attempt.toolMetas.map((t) => t.toolName),
);
partialExecution:
sanitizedToolNames.length > 0
? {
hadToolExecution: true,
toolNames: sanitizedToolNames,
didSendViaMessagingTool: attempt.didSendViaMessagingTool,
}
: undefined,
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in f36dc8e — now checks sanitizedToolNames.length instead of attempt.toolMetas.length before constructing partialExecution, so all-invalid-char tool names produce undefined instead of an empty-list prompt.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 399026c2ab
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| `It completed these tool calls: ${toolList}.${messagingWarning} ` + | ||
| `Do not repeat actions that have already been performed. ` + | ||
| `Review the conversation history and continue from where the previous attempt left off.]`, | ||
| params.body, |
There was a problem hiding this comment.
Avoid re-injecting original body on partial fallback retries
When previousPartialExecution is present, this path appends params.body into the retry prompt. That prompt is submitted again as a fresh user turn on fallback attempts (activeSession.prompt(...) in src/agents/pi-embedded-runner/run/attempt.ts), so the original request is re-issued exactly in the scenario where tools already ran. This defeats the duplicate-turn protection noted in the fallback flow and increases the chance that the next model restarts work and repeats non-idempotent actions.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Intentional design — not changing this. The partial-execution path must include the original body because the fallback model needs both the task context AND the warning about what already ran. The no-partial-execution path ("Continue where you left off") can drop the body because the session transcript already contains it and no side-effect context is needed. This follows the same pattern as prependInternalEventContext at agent.ts:149-161.
Dropping the body in the partial-execution case would leave the fallback model with only the system warning and no task to continue, which is worse.
|
|
||
| lastError = isKnownFailover ? normalized : err; | ||
| const described = describeFailoverError(normalized); | ||
| lastPartialExecution = described.partialExecution; |
There was a problem hiding this comment.
Preserve partial-execution state when later errors omit it
This assignment clears lastPartialExecution whenever a failed attempt has no partialExecution metadata. In a 3+ candidate fallback chain, if an early candidate executed tools but a middle candidate fails before reporting partial execution, the next retry loses the safety context and no longer gets previousPartialExecution, allowing already-completed side effects to be replayed.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in f36dc8e — same fix as the Greptile comment above (?? lastPartialExecution).
…y sanitized names - Use `?? lastPartialExecution` so later attempts without tool execution don't clobber the context from an earlier partial execution - Check sanitized tool name length (not pre-sanitization) to avoid malformed prompts when all names are stripped Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f36dc8e7a2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| lastError = isKnownFailover ? normalized : err; | ||
| const described = describeFailoverError(normalized); | ||
| lastPartialExecution = described.partialExecution ?? lastPartialExecution; |
There was a problem hiding this comment.
Accumulate partial-execution context across failed attempts
This assignment replaces the previous partial-execution metadata whenever a later candidate also reports partialExecution, so in a 3+ model chain you can lose earlier side-effect context (for example, attempt 1 sent a message, attempt 2 only ran search, attempt 3 no longer gets the “message already sent” warning). That makes fallback retries more likely to replay non-idempotent actions from earlier attempts; runWithModelFallback should merge tool names and OR didSendViaMessagingTool instead of overwriting.
Useful? React with 👍 / 👎.
This comment was marked as spam.
This comment was marked as spam.
|
Converting to draft — the partial execution warning injection will be redesigned as part of a unified fallback retry approach that uses proper system context rather than user-turn injection. See #43331 (comment) for context on the rethink. |
|
Superseded by #44188 (unified fallback retry safety PR). |
Summary
partialExecutionfield toFailoverErrorwith sanitized tool names (capped at 20 entries, alphanumeric+_-only, 100 char limit per name)model-fallback.ts→agent.tsand prepend a[System: ...]context block to the retry prompt listing completed toolsCloses #43482
Related: #43331 (security review that flagged this gap)
Test plan
pnpm build— type-check passespnpm check— lint/format passesFailoverError.sanitizeToolNames(capping, stripping, truncation, empty filtering)partialExecutionpreservation throughFailoverErrorconstruction anddescribeFailoverErrorrunWithModelFallbackforwardingpreviousPartialExecutionto next run call, omitting when absent, and storing inattempts[]🤖 Generated with Claude Code