Commit a92e2b1

fix(agents): detect incomplete tool-use turns with pre-tool text (#76477) (#76544)
* fix(agents): detect incomplete tool-use turns with pre-tool text (#76477)

  When the last assistant message ended with stopReason=toolUse, pre-tool text alone (payloadCount > 0) was suppressing the incomplete-turn guard. The model expected to continue after tool results but the post-tool response was never produced, silently dropping the final answer.

  Fix isIncompleteTerminalAssistantTurn to always flag toolUse stop reason as incomplete regardless of pre-tool text, and update the early-return condition in resolveIncompleteTurnPayloadText to not skip the check when the last assistant ended with a tool call.

* fix(agents): mark tool-use terminal with pre-tool text as abandoned in lifecycle (#76477)

  The lifecycle handler's derivedWorkingTerminalState was emitting 'working' for interrupted tool-use turns with pre-tool text because it required !hasAssistantVisibleText for the 'abandoned' state. Update the derivation to also mark as 'abandoned' when incompleteTerminalAssistant is true, so lifecycle consumers see a consistent state with the runner's terminal result.
1 parent 79f77d8 commit a92e2b1

5 files changed

Lines changed: 181 additions & 3 deletions

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -53,6 +53,7 @@ Docs: https://docs.openclaw.ai
 - Channels/secrets: resolve SecretRef-backed channel credentials through external plugin secret contracts after the plugin split, covering runtime startup, target discovery, webhook auth, disabled-account enumeration, and late-bound web_search config. Fixes #76371. (#76449) Thanks @joshavant and @neeravmakwana.
 - Docker/Gateway: pass Docker setup `.env` values into gateway and CLI containers and preserve exec SecretRef `passEnv` keys in managed service plans, so 1Password Connect-backed Discord tokens keep resolving after doctor or plugin repair. Thanks @vincentkoc.
 - Control UI/WebChat: explain compaction boundaries in chat history and link directly to session checkpoint controls so pre-compaction turns no longer look silently lost after refresh. Fixes #76415. Thanks @BunsDev.
+- Agents/incomplete-turn: detect and surface a warning when the agent's final text after a tool-call chain is silently dropped because the post-tool assistant response was never produced, instead of completing the turn with only the pre-tool analysis text. Fixes #76477. Thanks @amknight.
 - Channels/WhatsApp: attach native outbound mention metadata for group text and media captions by resolving `@+<digits>` and `@<digits>` tokens against WhatsApp participant data, including LID groups. Fixes #39879; carries forward #56863. Thanks @kengi1437, @joe2643, and @fridayck.
 - Channels/WhatsApp: require outbound mention tokens to end at a word boundary so phone-number prefixes inside longer strings no longer trigger hidden native mentions.
 - Plugins/uninstall: remove empty managed git install parent directories after deleting cloned plugin repos and cover npm/git uninstall residue in Docker plugin lifecycle tests. Thanks @vincentkoc.

src/agents/pi-embedded-runner/run.incomplete-turn.test.ts

Lines changed: 131 additions & 0 deletions
@@ -26,6 +26,7 @@ import {
   resolveEmptyResponseRetryInstruction,
   resolvePlanningOnlyRetryLimit,
   resolvePlanningOnlyRetryInstruction,
+  isIncompleteTerminalAssistantTurn,
   resolveIncompleteTurnPayloadText,
   resolveReasoningOnlyRetryInstruction,
   STRICT_AGENTIC_BLOCKED_TEXT,
@@ -995,6 +996,136 @@ describe("runEmbeddedPiAgent incomplete-turn safety", () => {
     ).toBe("abandoned");
   });
 
+  it("flags tool-use stop reason as incomplete even when pre-tool text exists (#76477)", () => {
+    expect(
+      isIncompleteTerminalAssistantTurn({
+        hasAssistantVisibleText: true,
+        lastAssistant: { stopReason: "toolUse" },
+      }),
+    ).toBe(true);
+    expect(
+      isIncompleteTerminalAssistantTurn({
+        hasAssistantVisibleText: false,
+        lastAssistant: { stopReason: "toolUse" },
+      }),
+    ).toBe(true);
+    expect(
+      isIncompleteTerminalAssistantTurn({
+        hasAssistantVisibleText: true,
+        lastAssistant: { stopReason: "end_turn" },
+      }),
+    ).toBe(false);
+  });
+
+  it("detects tool-use terminal turn with pre-tool text as incomplete (#76477)", () => {
+    // When the last assistant message ended with stopReason=toolUse, pre-tool
+    // text alone must not suppress the incomplete-turn guard. The model
+    // expected to continue after tool results but the post-tool response was
+    // never produced.
+    const incompleteTurnText = resolveIncompleteTurnPayloadText({
+      payloadCount: 1,
+      aborted: false,
+      timedOut: false,
+      attempt: makeAttemptResult({
+        assistantTexts: ["Initial analysis of the codebase..."],
+        toolMetas: [{ toolName: "read", meta: "path=src/index.ts" }],
+        lastAssistant: {
+          role: "assistant",
+          stopReason: "toolUse",
+          provider: "anthropic",
+          model: "sonnet-4.6",
+          content: [
+            { type: "text", text: "Initial analysis of the codebase..." },
+            { type: "tool_use", id: "tool_1", name: "read", input: { path: "src/index.ts" } },
+          ],
+        } as unknown as EmbeddedRunAttemptResult["lastAssistant"],
+      }),
+    });
+
+    expect(incompleteTurnText).not.toBeNull();
+    expect(incompleteTurnText).toContain("couldn't generate a response");
+  });
+
+  it("surfaces tool-use terminal with pre-tool text and side effects as replay-unsafe (#76477)", () => {
+    const incompleteTurnText = resolveIncompleteTurnPayloadText({
+      payloadCount: 1,
+      aborted: false,
+      timedOut: false,
+      attempt: makeAttemptResult({
+        assistantTexts: ["Let me update the file..."],
+        toolMetas: [{ toolName: "write" }],
+        lastAssistant: {
+          role: "assistant",
+          stopReason: "toolUse",
+          provider: "openai",
+          model: "gpt-5.4",
+          content: [
+            { type: "text", text: "Let me update the file..." },
+            { type: "tool_use", id: "tool_1", name: "write", input: {} },
+          ],
+        } as unknown as EmbeddedRunAttemptResult["lastAssistant"],
+      }),
+    });
+
+    expect(incompleteTurnText).toContain("verify before retrying");
+  });
+
+  it("does not flag a completed tool-use turn with end_turn as incomplete (#76477)", () => {
+    // When the model successfully produces post-tool text, lastAssistant has
+    // stopReason=end_turn. The incomplete-turn guard should not fire.
+    const incompleteTurnText = resolveIncompleteTurnPayloadText({
+      payloadCount: 2,
+      aborted: false,
+      timedOut: false,
+      attempt: makeAttemptResult({
+        assistantTexts: ["Initial analysis...", "Here is the final answer."],
+        toolMetas: [{ toolName: "read" }],
+        lastAssistant: {
+          role: "assistant",
+          stopReason: "end_turn",
+          provider: "anthropic",
+          model: "sonnet-4.6",
+          content: [{ type: "text", text: "Here is the final answer." }],
+        } as unknown as EmbeddedRunAttemptResult["lastAssistant"],
+      }),
+    });
+
+    expect(incompleteTurnText).toBeNull();
+  });
+
+  it("surfaces an error for tool-use terminal turn with pre-tool text via runEmbeddedPiAgent (#76477)", async () => {
+    mockedClassifyFailoverReason.mockReturnValue(null);
+    mockedRunEmbeddedAttempt.mockResolvedValueOnce(
+      makeAttemptResult({
+        assistantTexts: ["Initial analysis of the issue..."],
+        toolMetas: [{ toolName: "read", meta: "path=src/index.ts" }],
+        lastAssistant: {
+          stopReason: "toolUse",
+          provider: "anthropic",
+          model: "sonnet-4.6",
+          content: [
+            { type: "text", text: "Initial analysis of the issue..." },
+            { type: "tool_use", id: "tool_1", name: "read", input: { path: "src/index.ts" } },
+          ],
+        } as unknown as EmbeddedRunAttemptResult["lastAssistant"],
+      }),
+    );
+
+    const result = await runEmbeddedPiAgent({
+      ...overflowBaseRunParams,
+      provider: "anthropic",
+      model: "sonnet-4.6",
+      runId: "run-tool-use-dropped-final-text",
+    });
+
+    expect(mockedRunEmbeddedAttempt).toHaveBeenCalledTimes(1);
+    expect(result.payloads?.[0]?.isError).toBe(true);
+    expect(result.payloads?.[0]?.text).toContain("couldn't generate a response");
+    expect(mockedLog.warn).toHaveBeenCalledWith(
+      expect.stringContaining("incomplete turn detected"),
+    );
+  });
+
   it("treats missing replay metadata as replay-invalid", () => {
     const attempt = makeAttemptResult();
     delete (attempt as Partial<EmbeddedRunAttemptResult>).replayMetadata;

src/agents/pi-embedded-runner/run/incomplete-turn.ts

Lines changed: 14 additions & 2 deletions
@@ -90,7 +90,12 @@ export function isIncompleteTerminalAssistantTurn(params: {
   hasAssistantVisibleText: boolean;
   lastAssistant?: { stopReason?: string } | null;
 }): boolean {
-  return !params.hasAssistantVisibleText && params.lastAssistant?.stopReason === "toolUse";
+  // A tool-use stop reason means the model issued a tool call and expected
+  // to continue after tool results. If the session ended before the
+  // post-tool assistant message arrived, the turn is incomplete regardless
+  // of whether pre-tool text exists — that text is preliminary analysis,
+  // not the final answer. (#76477)
+  return params.lastAssistant?.stopReason === "toolUse";
 }
 
 const PLANNING_ONLY_PROMISE_RE =
@@ -220,8 +225,15 @@ export function resolveIncompleteTurnPayloadText(params: {
   timedOut: boolean;
   attempt: IncompleteTurnAttempt;
 }): string | null {
+  // Tool-use terminal guard: when the last assistant message ended with a
+  // tool-call stop reason, the model expected to continue after tool results.
+  // Pre-tool text alone (payloadCount > 0) must not suppress the incomplete-
+  // turn check in that case — the final post-tool response was never
+  // produced. (#76477)
+  const toolUseTerminal = params.attempt.lastAssistant?.stopReason === "toolUse";
+
   if (
-    params.payloadCount !== 0 ||
+    (params.payloadCount !== 0 && !toolUseTerminal) ||
     params.aborted ||
     params.timedOut ||
     params.attempt.clientToolCalls ||
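
Reviewer note: the two changes in this file can be compared side by side in a standalone sketch. This is not the project's code: the function names `isIncompleteBefore`, `isIncompleteAfter`, and `firstClauseSkipsAfter` and the trimmed `TurnParams` type are hypothetical, and only the first clause of the early-return condition is modeled, since the rest of that `if` lies outside the hunk.

```typescript
// Hypothetical, trimmed-down parameter shape; the real types carry more fields.
type TurnParams = {
  hasAssistantVisibleText: boolean;
  lastAssistant?: { stopReason?: string } | null;
};

// Predicate as it read before the commit: visible pre-tool text suppressed the flag.
function isIncompleteBefore(p: TurnParams): boolean {
  return !p.hasAssistantVisibleText && p.lastAssistant?.stopReason === "toolUse";
}

// Predicate after the commit: a toolUse stop reason alone marks the turn incomplete.
function isIncompleteAfter(p: TurnParams): boolean {
  return p.lastAssistant?.stopReason === "toolUse";
}

// First clause of the early-return condition after the commit (assumption: the
// remaining clauses, such as aborted and timedOut, are left out of this sketch).
// Returns true when existing payloads should short-circuit the incomplete-turn check.
function firstClauseSkipsAfter(
  payloadCount: number,
  lastAssistant?: { stopReason?: string } | null,
): boolean {
  const toolUseTerminal = lastAssistant?.stopReason === "toolUse";
  return payloadCount !== 0 && !toolUseTerminal;
}

// The scenario from #76477: pre-tool text exists, but the turn ended on a tool call.
const interrupted: TurnParams = {
  hasAssistantVisibleText: true,
  lastAssistant: { stopReason: "toolUse" },
};

console.log(isIncompleteBefore(interrupted)); // false (the bug)
console.log(isIncompleteAfter(interrupted)); // true
console.log(firstClauseSkipsAfter(1, interrupted.lastAssistant)); // false, so the check still runs
```

Under these assumptions, a turn that ends with `end_turn` keeps its earlier behavior: `isIncompleteAfter` returns false and a non-zero payload count still skips the check.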

src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts

Lines changed: 28 additions & 0 deletions
@@ -289,6 +289,34 @@ describe("handleAgentEnd", () => {
     });
   });
 
+  it("marks tool-use terminal with pre-tool text as abandoned (#76477)", async () => {
+    const onAgentEvent = vi.fn();
+    const ctx = createContext(
+      {
+        role: "assistant",
+        stopReason: "toolUse",
+        content: [
+          { type: "text", text: "Initial analysis..." },
+          { type: "tool_use", id: "tool_1", name: "read", input: { path: "src/index.ts" } },
+        ],
+      },
+      { onAgentEvent },
+    );
+    ctx.state.livenessState = "working";
+    ctx.state.assistantTexts = ["Initial analysis..."];
+
+    await handleAgentEnd(ctx);
+
+    expect(onAgentEvent).toHaveBeenCalledWith({
+      stream: "lifecycle",
+      data: {
+        phase: "end",
+        livenessState: "abandoned",
+        replayInvalid: true,
+      },
+    });
+  });
+
   it("keeps accumulated deterministic side effects from being marked abandoned", async () => {
     const onAgentEvent = vi.fn();
     const ctx = createContext(undefined, { onAgentEvent });

src/agents/pi-embedded-subscribe.handlers.lifecycle.ts

Lines changed: 7 additions & 1 deletion
@@ -54,9 +54,15 @@ export function handleAgentEnd(ctx: EmbeddedPiSubscribeContext): void | Promise<
   });
   const replayInvalid =
     ctx.state.replayState.replayInvalid || incompleteTerminalAssistant ? true : undefined;
+  // Tool-use terminal guard: when the last assistant message ended with a
+  // tool-call stop reason, the turn is incomplete even when pre-tool text
+  // exists — mark as abandoned so lifecycle consumers do not see a working
+  // end state for an interrupted tool chain. (#76477)
   const derivedWorkingTerminalState = isError
     ? "blocked"
-    : replayInvalid && !hasAssistantVisibleText && !hadDeterministicSideEffect
+    : replayInvalid &&
+        !hadDeterministicSideEffect &&
+        (!hasAssistantVisibleText || incompleteTerminalAssistant)
       ? "abandoned"
       : ctx.state.livenessState;
   const livenessState =
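
Reviewer note: the updated ternary may be easier to audit as a pure function. The sketch below is illustrative only: `deriveTerminalState` and the `TerminalInputs` shape are hypothetical names, `replayInvalid` is modeled as a plain boolean rather than `true | undefined`, and liveness states are typed as `string` because the full set of values is not visible in this hunk.

```typescript
// Hypothetical flat input shape; in the handler these values come from ctx and locals.
interface TerminalInputs {
  isError: boolean;
  replayInvalid: boolean;
  hadDeterministicSideEffect: boolean;
  hasAssistantVisibleText: boolean;
  incompleteTerminalAssistant: boolean;
  livenessState: string;
}

// Mirrors the post-commit ternary: errors win, then the abandoned check,
// otherwise the current liveness state passes through unchanged.
function deriveTerminalState(i: TerminalInputs): string {
  if (i.isError) {
    return "blocked";
  }
  const abandoned =
    i.replayInvalid &&
    !i.hadDeterministicSideEffect &&
    (!i.hasAssistantVisibleText || i.incompleteTerminalAssistant);
  return abandoned ? "abandoned" : i.livenessState;
}

// The case covered by the new lifecycle test: pre-tool text plus an interrupted tool call.
const interruptedToolUse: TerminalInputs = {
  isError: false,
  replayInvalid: true,
  hadDeterministicSideEffect: false,
  hasAssistantVisibleText: true,
  incompleteTerminalAssistant: true,
  livenessState: "working",
};

console.log(deriveTerminalState(interruptedToolUse)); // "abandoned"
```

Note that a deterministic side effect still blocks the "abandoned" state in this sketch, which matches the untouched neighboring test about accumulated side effects.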
