
Commit ab7c922

fix(codex): report completion timeout diagnostics
Surface Codex-specific completion-timeout outcomes and structural diagnostics while preserving the existing replay-safe retry behavior.

Verified with focused Vitest coverage, live forced-timeout Showboat proof, and green PR CI.
1 parent 2fc4511 commit ab7c922

10 files changed

Lines changed: 207 additions & 46 deletions

File tree

docs/plugins/codex-harness-reference.md

Lines changed: 9 additions & 5 deletions
@@ -347,11 +347,15 @@ the session lane. Replay-safe stdio app-server failures, including
 turn-completion idle timeouts without assistant, tool, active-item, or
 side-effect evidence, are retried once on a fresh app-server attempt. Unsafe
 timeouts still retire the stuck app-server client and release the OpenClaw
-session lane. They also clear the stale native thread binding and surface a
-recoverable timeout message for user or maintainer judgment instead of being
-replayed automatically. Timeout diagnostics include the last app-server
-notification method and, for raw assistant response items, the item type, role,
-id, and a bounded assistant text preview.
+session lane. They also clear the stale native thread binding instead of being
+replayed automatically. Completion-watch timeouts surface Codex-specific timeout
+text: replay-safe cases say the response may be incomplete, while unsafe cases
+tell the user to verify current state before retrying. Public timeout diagnostics
+include structural fields such as the last app-server notification method,
+raw assistant response item id/type/role, active request/item counts, and armed
+watch state. When the last notification is a raw assistant response item, they
+also include a bounded assistant text preview. They do not include raw prompt or
+tool content.
 
 ## Model discovery
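
As a rough illustration of the documented field list, the TypeScript sketch below shows one possible shape for the public timeout diagnostics. The field names follow `buildCodexAppServerTimeoutDiagnostics` from this commit; the interface name `TimeoutDiagnosticsSketch`, the helper `listArmedWatches`, and every sample value are assumptions made for illustration, not the SDK's published type.

```typescript
// Hypothetical interface: field names mirror this commit's diagnostics builder,
// but the real SDK type may differ.
interface TimeoutDiagnosticsSketch {
  idleMs?: number;
  timeoutMs?: number;
  lastActivityReason?: string;
  lastNotificationMethod?: string;
  lastNotificationItemId?: string;
  lastNotificationItemType?: string;
  lastNotificationItemRole?: string;
  // Bounded preview; per the docs, present only for raw assistant response items.
  lastAssistantTextPreview?: string;
  activeAppServerTurnRequests?: number;
  activeTurnItemCount?: number;
  terminalTurnNotificationQueued?: boolean;
  completionIdleWatchArmed?: boolean;
  assistantCompletionIdleWatchArmed?: boolean;
  terminalIdleWatchArmed?: boolean;
}

// Sample values are invented; they echo the fixture used in the watch tests.
const exampleDiagnostics: TimeoutDiagnosticsSketch = {
  idleMs: 10,
  timeoutMs: 10,
  lastActivityReason: "turn:start",
  activeAppServerTurnRequests: 0,
  activeTurnItemCount: 0,
  terminalTurnNotificationQueued: false,
  completionIdleWatchArmed: true,
  assistantCompletionIdleWatchArmed: false,
  terminalIdleWatchArmed: false,
};

// Hypothetical helper: names the watches that were armed when the timeout fired.
function listArmedWatches(d: TimeoutDiagnosticsSketch): string[] {
  const armed: string[] = [];
  if (d.completionIdleWatchArmed) armed.push("completion");
  if (d.assistantCompletionIdleWatchArmed) armed.push("assistantCompletion");
  if (d.terminalIdleWatchArmed) armed.push("terminal");
  return armed;
}
```

Every field is optional in this sketch because the builder omits any value that is missing or has the wrong type.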

docs/plugins/codex-harness.md

Lines changed: 9 additions & 5 deletions
@@ -588,11 +588,15 @@ releases the session lane. Replay-safe stdio app-server failures, including
 turn-completion idle timeouts without assistant, tool, active-item, or
 side-effect evidence, are retried once on a fresh app-server attempt. Unsafe
 timeouts still retire the stuck app-server client and release the OpenClaw
-session lane. They also clear the stale native thread binding and surface a
-recoverable timeout message for user or maintainer judgment instead of being
-replayed automatically. Timeout diagnostics include the last app-server
-notification method and, for raw assistant response items, the item type, role,
-id, and a bounded assistant text preview.
+session lane. They also clear the stale native thread binding instead of being
+replayed automatically. Completion-watch timeouts surface Codex-specific timeout
+text: replay-safe cases say the response may be incomplete, while unsafe cases
+tell the user to verify current state before retrying. Public timeout diagnostics
+include structural fields such as the last app-server notification method,
+raw assistant response item id/type/role, active request/item counts, and armed
+watch state. When the last notification is a raw assistant response item, they
+also include a bounded assistant text preview. They do not include raw prompt or
+tool content.
 
 Environment overrides remain available for local testing:

extensions/codex/src/app-server/attempt-results.test.ts

Lines changed: 20 additions & 2 deletions
@@ -61,14 +61,25 @@ describe("Codex app-server attempt results", () => {
       buildCodexAppServerPromptTimeoutOutcome({
         result: createResult(),
         turnCompletionIdleTimedOut: true,
+        turnWatchTimeoutKind: "progress",
       }),
     ).toBeUndefined();
     expect(
       buildCodexAppServerPromptTimeoutOutcome({
         result: createResult({
-          itemLifecycle: { startedCount: 1, completedCount: 1, activeCount: 0 },
+          toolMetas: [{ toolName: "exec" }],
+        }),
+        turnCompletionIdleTimedOut: true,
+        turnWatchTimeoutKind: "terminal",
+      }),
+    ).toBeUndefined();
+    expect(
+      buildCodexAppServerPromptTimeoutOutcome({
+        result: createResult({
+          itemLifecycle: { startedCount: 0, completedCount: 0, activeCount: 0 },
         }),
         turnCompletionIdleTimedOut: true,
+        turnWatchTimeoutKind: "completion",
       }),
     ).toEqual({
       message:
@@ -83,6 +94,7 @@ describe("Codex app-server attempt results", () => {
           },
         }),
         turnCompletionIdleTimedOut: true,
+        turnWatchTimeoutKind: "completion",
       }),
     ).toEqual({
       message:
@@ -96,21 +108,27 @@ describe("Codex app-server attempt results", () => {
           assistantTexts: ["I am changing the data model now..."],
         }),
         turnCompletionIdleTimedOut: true,
+        turnWatchTimeoutKind: "completion",
       }),
     ).toEqual({
       message:
         "Codex stopped before confirming the turn was complete. The response may be incomplete; retry if needed.",
+      replayInvalid: true,
+      livenessState: "abandoned",
     });
     expect(
       buildCodexAppServerPromptTimeoutOutcome({
         result: createResult({
           toolMetas: [{ toolName: "exec" }],
         }),
         turnCompletionIdleTimedOut: true,
+        turnWatchTimeoutKind: "completion",
       }),
     ).toEqual({
       message:
-        "Codex stopped before confirming the turn was complete. The response may be incomplete; retry if needed.",
+        "Codex stopped before confirming the turn was complete. Some work may already have been performed; verify the current state before retrying.",
+      replayInvalid: true,
+      livenessState: "abandoned",
     });
   });

extensions/codex/src/app-server/attempt-results.ts

Lines changed: 12 additions & 18 deletions
@@ -8,6 +8,7 @@ import type {
   EmbeddedRunAttemptResult,
 } from "openclaw/plugin-sdk/agent-harness-runtime";
 import type { CodexSystemPromptReport } from "./attempt-context.js";
+import type { CodexAttemptTurnWatchTimeoutKind } from "./attempt-turn-watches.js";
 
 const CODEX_APP_SERVER_MISSING_TERMINAL_EVENT_USER_MESSAGE =
   "Codex stopped before confirming the turn was complete. The response may be incomplete; retry if needed.";
@@ -19,38 +20,31 @@ export function collectTerminalAssistantText(result: EmbeddedRunAttemptResult):
   return result.assistantTexts.join("\n\n").trim();
 }
 
-/** Returns whether attempt metadata saw potential side effects. */
-export function hasCodexAppServerPotentialSideEffectEvidence(
-  result: EmbeddedRunAttemptResult,
-): boolean {
-  return result.replayMetadata.hadPotentialSideEffects;
-}
-
 /**
  * Builds the user-facing timeout outcome when Codex stops without a terminal
  * turn event.
  */
 export function buildCodexAppServerPromptTimeoutOutcome(params: {
   result: EmbeddedRunAttemptResult;
   turnCompletionIdleTimedOut: boolean;
+  turnWatchTimeoutKind?: CodexAttemptTurnWatchTimeoutKind;
 }): EmbeddedRunAttemptResult["promptTimeoutOutcome"] {
-  const completionIdleTimeoutHadPotentialSideEffects = hasCodexAppServerPotentialSideEffectEvidence(
-    params.result,
-  );
-  const replayBlockedReason = resolveCodexAppServerReplayBlockedReason(params.result);
-  if (
-    !params.turnCompletionIdleTimedOut ||
-    (params.result.itemLifecycle.completedCount === 0 &&
-      !completionIdleTimeoutHadPotentialSideEffects &&
-      replayBlockedReason === undefined)
-  ) {
+  if (!params.turnCompletionIdleTimedOut) {
     return undefined;
   }
+  if (params.turnWatchTimeoutKind !== undefined && params.turnWatchTimeoutKind !== "completion") {
+    return undefined;
+  }
+  const replayBlockedReason = resolveCodexAppServerReplayBlockedReason(params.result);
+  const completionIdleTimeoutHadPotentialSideEffects =
+    replayBlockedReason === "tool_activity" ||
+    replayBlockedReason === "potential_side_effect" ||
+    replayBlockedReason === "active_item";
   return {
     message: completionIdleTimeoutHadPotentialSideEffects
       ? CODEX_APP_SERVER_MISSING_TERMINAL_EVENT_SIDE_EFFECT_USER_MESSAGE
       : CODEX_APP_SERVER_MISSING_TERMINAL_EVENT_USER_MESSAGE,
-    ...(completionIdleTimeoutHadPotentialSideEffects
+    ...(replayBlockedReason
       ? {
           replayInvalid: true,
           livenessState: "abandoned" as const,
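
The new decision flow in `buildCodexAppServerPromptTimeoutOutcome` can be summarized with a standalone sketch. This is a simplified approximation, not the commit's code: `sketchTimeoutOutcome`, `TimeoutKindSketch`, and `TimeoutOutcomeSketch` are hypothetical names, the `replayBlockedReason` parameter stands in for the result of `resolveCodexAppServerReplayBlockedReason`, and the hunk is truncated after `livenessState`, so the real return value may carry further fields.

```typescript
// Timeout kinds that appear in this commit's tests; the real union is defined
// in attempt-turn-watches.ts and may contain other members.
type TimeoutKindSketch = "completion" | "progress" | "terminal";

const INCOMPLETE_MESSAGE =
  "Codex stopped before confirming the turn was complete. The response may be incomplete; retry if needed.";
const SIDE_EFFECT_MESSAGE =
  "Codex stopped before confirming the turn was complete. Some work may already have been performed; verify the current state before retrying.";

// Hypothetical outcome shape, limited to the fields visible in the diff.
interface TimeoutOutcomeSketch {
  message: string;
  replayInvalid?: true;
  livenessState?: "abandoned";
}

function sketchTimeoutOutcome(params: {
  timedOut: boolean;
  kind?: TimeoutKindSketch;
  // Assumption: stands in for resolveCodexAppServerReplayBlockedReason(result).
  replayBlockedReason?: string;
}): TimeoutOutcomeSketch | undefined {
  // No timeout means no user-facing timeout outcome.
  if (!params.timedOut) return undefined;
  // Only completion-watch timeouts (or an unspecified kind) produce an outcome.
  if (params.kind !== undefined && params.kind !== "completion") return undefined;

  // These three reasons select the "verify current state" wording.
  const hadPotentialSideEffects =
    params.replayBlockedReason === "tool_activity" ||
    params.replayBlockedReason === "potential_side_effect" ||
    params.replayBlockedReason === "active_item";

  return {
    message: hadPotentialSideEffects ? SIDE_EFFECT_MESSAGE : INCOMPLETE_MESSAGE,
    // Any blocked reason, side effect or not, marks the attempt as non-replayable.
    ...(params.replayBlockedReason
      ? { replayInvalid: true as const, livenessState: "abandoned" as const }
      : {}),
  };
}
```

In this reading, the message wording depends only on the three side-effect reasons, while `replayInvalid` and `livenessState` are set for any blocked reason. That matches the updated test in which assistant text alone keeps the "may be incomplete" message but still marks the attempt as abandoned.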

extensions/codex/src/app-server/attempt-turn-watches.test.ts

Lines changed: 8 additions & 0 deletions
@@ -89,6 +89,14 @@ describe("Codex app-server attempt turn watches", () => {
         idleMs: 10,
         timeoutMs: 10,
         lastActivityReason: "turn:start",
+        details: {
+          activeAppServerTurnRequests: 0,
+          activeTurnItemCount: 0,
+          terminalTurnNotificationQueued: false,
+          completionIdleWatchArmed: true,
+          assistantCompletionIdleWatchArmed: false,
+          terminalIdleWatchArmed: false,
+        },
       },
     ]);
     expect(harness.abortController.signal.reason).toBe("turn_completion_idle_timeout");

extensions/codex/src/app-server/attempt-turn-watches.ts

Lines changed: 10 additions & 1 deletion
@@ -312,12 +312,21 @@ export function createCodexAttemptTurnWatchController(params: {
       scheduleCompletionIdleWatch();
       return;
     }
+    const details = {
+      ...completionLastActivityDetails,
+      activeAppServerTurnRequests: params.getActiveAppServerTurnRequests(),
+      activeTurnItemCount: params.getActiveTurnItemCount(),
+      terminalTurnNotificationQueued: params.isTerminalTurnNotificationQueued(),
+      completionIdleWatchArmed,
+      assistantCompletionIdleWatchArmed,
+      terminalIdleWatchArmed,
+    };
     const timeout = {
       kind: "completion" as const,
       idleMs,
       timeoutMs,
       lastActivityReason: completionLastActivityReason,
-      details: completionLastActivityDetails,
+      details,
     };
     params.onTimeout(timeout);
     params.onMarkTimedOut();

extensions/codex/src/app-server/run-attempt.ts

Lines changed: 80 additions & 0 deletions
@@ -1131,6 +1131,10 @@ export async function runCodexAppServerAttempt(
   let timedOut = false;
   let turnCompletionIdleTimedOut = false;
   let turnWatchTimeoutKind: CodexAttemptTurnWatchTimeoutKind | undefined;
+  let turnWatchTimeoutIdleMs: number | undefined;
+  let turnWatchTimeoutMs: number | undefined;
+  let turnWatchTimeoutLastActivityReason: string | undefined;
+  let turnWatchTimeoutDetails: Record<string, unknown> | undefined;
   let turnCompletionIdleTimeoutMessage: string | undefined;
   let clientClosedPromptError: string | undefined;
   let clientClosedAbort = false;
@@ -1221,6 +1225,10 @@ export async function runCodexAppServerAttempt(
       timedOut = true;
       turnCompletionIdleTimedOut = true;
       turnWatchTimeoutKind = timeout.kind;
+      turnWatchTimeoutIdleMs = timeout.idleMs;
+      turnWatchTimeoutMs = timeout.timeoutMs;
+      turnWatchTimeoutLastActivityReason = timeout.lastActivityReason;
+      turnWatchTimeoutDetails = timeout.details;
       turnCompletionIdleTimeoutMessage =
         "codex app-server turn idle timed out waiting for turn/completed";
     },
@@ -2302,7 +2310,18 @@ export async function runCodexAppServerAttempt(
     const promptTimeoutOutcome = buildCodexAppServerPromptTimeoutOutcome({
       result,
       turnCompletionIdleTimedOut,
+      turnWatchTimeoutKind,
     });
+    const codexAppServerFailureDiagnostics =
+      codexAppServerFailureKind === "turn_completion_idle_timeout" &&
+      turnWatchTimeoutKind === "completion"
+        ? buildCodexAppServerTimeoutDiagnostics({
+            idleMs: turnWatchTimeoutIdleMs,
+            timeoutMs: turnWatchTimeoutMs,
+            lastActivityReason: turnWatchTimeoutLastActivityReason,
+            details: turnWatchTimeoutDetails,
+          })
+        : undefined;
     const modelCallFailureKind =
       classifyCodexModelCallFailureKind({
         error: finalPromptError,
@@ -2470,6 +2489,9 @@ export async function runCodexAppServerAttempt(
               ...(codexAppServerReplayBlockedReason
                 ? { replayBlockedReason: codexAppServerReplayBlockedReason }
                 : {}),
+              ...(codexAppServerFailureDiagnostics
+                ? { diagnostics: codexAppServerFailureDiagnostics }
+                : {}),
             },
           }
         : {}),
@@ -2654,6 +2676,64 @@ function waitForCodexNotificationDispatchTurn(): Promise<void> {
   });
 }
 
+function buildCodexAppServerTimeoutDiagnostics(params: {
+  idleMs?: number;
+  timeoutMs?: number;
+  lastActivityReason?: string;
+  details?: Record<string, unknown>;
+}): NonNullable<EmbeddedRunAttemptResult["codexAppServerFailure"]>["diagnostics"] {
+  const readString = (key: string) => {
+    const value = params.details?.[key];
+    return typeof value === "string" && value.trim() ? value : undefined;
+  };
+  const readNumber = (key: string) => {
+    const value = params.details?.[key];
+    return typeof value === "number" && Number.isFinite(value) ? value : undefined;
+  };
+  const readBoolean = (key: string) => {
+    const value = params.details?.[key];
+    return typeof value === "boolean" ? value : undefined;
+  };
+  return {
+    ...(params.idleMs !== undefined ? { idleMs: params.idleMs } : {}),
+    ...(params.timeoutMs !== undefined ? { timeoutMs: params.timeoutMs } : {}),
+    ...(params.lastActivityReason ? { lastActivityReason: params.lastActivityReason } : {}),
+    ...(readString("lastNotificationMethod")
+      ? { lastNotificationMethod: readString("lastNotificationMethod") }
+      : {}),
+    ...(readString("lastNotificationItemId")
+      ? { lastNotificationItemId: readString("lastNotificationItemId") }
+      : {}),
+    ...(readString("lastNotificationItemType")
+      ? { lastNotificationItemType: readString("lastNotificationItemType") }
+      : {}),
+    ...(readString("lastNotificationItemRole")
+      ? { lastNotificationItemRole: readString("lastNotificationItemRole") }
+      : {}),
+    ...(readString("lastAssistantTextPreview")
+      ? { lastAssistantTextPreview: readString("lastAssistantTextPreview") }
+      : {}),
+    ...(readNumber("activeAppServerTurnRequests") !== undefined
+      ? { activeAppServerTurnRequests: readNumber("activeAppServerTurnRequests") }
+      : {}),
+    ...(readNumber("activeTurnItemCount") !== undefined
+      ? { activeTurnItemCount: readNumber("activeTurnItemCount") }
+      : {}),
+    ...(readBoolean("terminalTurnNotificationQueued") !== undefined
+      ? { terminalTurnNotificationQueued: readBoolean("terminalTurnNotificationQueued") }
+      : {}),
+    ...(readBoolean("completionIdleWatchArmed") !== undefined
+      ? { completionIdleWatchArmed: readBoolean("completionIdleWatchArmed") }
+      : {}),
+    ...(readBoolean("assistantCompletionIdleWatchArmed") !== undefined
+      ? { assistantCompletionIdleWatchArmed: readBoolean("assistantCompletionIdleWatchArmed") }
+      : {}),
+    ...(readBoolean("terminalIdleWatchArmed") !== undefined
+      ? { terminalIdleWatchArmed: readBoolean("terminalIdleWatchArmed") }
+      : {}),
+  };
+}
+
 function handleApprovalRequest(params: {
   method: string;
   params: JsonValue | undefined;
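
`buildCodexAppServerTimeoutDiagnostics` copies only well-typed values out of an untyped `details` record, so unexpected keys never reach the public diagnostics. The sketch below illustrates that filtering idea in a table-driven form. It is an illustrative variant rather than the commit's implementation: `pickTypedFields`, `FieldKind`, and `FIELD_KINDS` are hypothetical names, and the `rawPrompt` key in the usage example is invented.

```typescript
type FieldKind = "string" | "number" | "boolean";

// Allow-list of detail keys and their expected runtime types, taken from the
// field names used by the diagnostics builder in this commit.
const FIELD_KINDS: Record<string, FieldKind> = {
  lastNotificationMethod: "string",
  lastNotificationItemId: "string",
  lastNotificationItemType: "string",
  lastNotificationItemRole: "string",
  lastAssistantTextPreview: "string",
  activeAppServerTurnRequests: "number",
  activeTurnItemCount: "number",
  terminalTurnNotificationQueued: "boolean",
  completionIdleWatchArmed: "boolean",
  assistantCompletionIdleWatchArmed: "boolean",
  terminalIdleWatchArmed: "boolean",
};

// Hypothetical helper: keeps allow-listed keys whose values pass the same
// checks as readString, readNumber, and readBoolean in the diff.
function pickTypedFields(
  details: Record<string, unknown> | undefined,
): Record<string, string | number | boolean> {
  const picked: Record<string, string | number | boolean> = {};
  for (const [key, kind] of Object.entries(FIELD_KINDS)) {
    const value = details?.[key];
    if (kind === "string" && typeof value === "string" && value.trim()) {
      picked[key] = value; // non-blank strings only
    } else if (kind === "number" && typeof value === "number" && Number.isFinite(value)) {
      picked[key] = value; // NaN and Infinity are dropped
    } else if (kind === "boolean" && typeof value === "boolean") {
      picked[key] = value; // false is kept; only non-booleans are dropped
    }
  }
  return picked;
}

// Usage: blank strings, NaN, and keys outside the allow-list are omitted.
const sanitized = pickTypedFields({
  lastNotificationMethod: "   ",
  activeTurnItemCount: Number.NaN,
  completionIdleWatchArmed: true,
  rawPrompt: "not an allow-listed key",
});
```

The explicit spreads in the commit keep the return type precise for the SDK's `diagnostics` field, which a generic record like this one would lose. The table form mainly helps when the field list is expected to grow.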
