fix(agents): centralize terminal run outcome precedence by steipete · Pull Request #88136 · openclaw/openclaw

steipete · 2026-05-29T21:04:03Z

Summary

This is a maintainer replacement for #87902's timeout-invariant fix, keeping the useful behavior but centralizing the decision model instead of threading timeout attribution through each observer.

Add a canonical agent run terminal outcome helper for completed, hard timeout, ordinary timed-out, explicit cancellation, aborted, blocked, and failed outcomes.
Reuse that helper in Gateway lifecycle snapshots, Gateway dedupe snapshots, and agent.wait normalization so public projections keep the existing status/error/timing shape.
Preserve sticky precedence only for provider/preflight/post-turn hard timeouts and explicit rpc/stop cancellation payloads, while leaving bare/no-phase/queue/gateway-draining wait timeouts replaceable.
Cover late completion/error/non-terminal overwrite races and pending-grace timeout races.
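The precedence rule in the list above can be sketched as a small merge function. This is a minimal illustration, not the PR's helper: the type, field, and function names are hypothetical, and keeping the first sticky outcome when two sticky outcomes collide is an assumption the PR body does not confirm. The "earlier completion correction" case mentioned under Verification is not modeled here.

```typescript
// Hypothetical shapes; the helper added by this PR may use different names and fields.
type TerminalOutcome =
  | { kind: "completed"; endedAt: number }
  | { kind: "hard_timeout"; phase: "provider" | "preflight" | "post-turn"; endedAt: number }
  | { kind: "timed_out"; endedAt: number } // ordinary wait timeout (queue, gateway draining, ...)
  | { kind: "cancelled"; source: "rpc" | "stop"; endedAt: number }
  | { kind: "aborted"; endedAt: number }
  | { kind: "blocked"; endedAt: number }
  | { kind: "failed"; error: string; endedAt: number };

// Only hard timeouts and explicit cancellations are treated as sticky.
function isSticky(outcome: TerminalOutcome): boolean {
  return outcome.kind === "hard_timeout" || outcome.kind === "cancelled";
}

// Returns the outcome that should be recorded after `incoming` arrives.
function mergeTerminalOutcome(
  existing: TerminalOutcome | undefined,
  incoming: TerminalOutcome,
): TerminalOutcome {
  if (existing === undefined) {
    return incoming;
  }
  // Assumption: a sticky outcome already on record wins over anything that arrives later.
  if (isSticky(existing)) {
    return existing;
  }
  // Ordinary timeouts, aborts, failures, and completions stay replaceable.
  return incoming;
}
```

In this sketch, a late `completed` cannot replace a recorded `hard_timeout`, while the same `completed` does replace an ordinary `timed_out`.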

Credit: Josh Avant (@joshavant) found the cross-surface timeout race class in #87902; this PR keeps that invariant with a smaller canonical state model.

Supersedes #87902.
Fixes #87444.

Verification

Behavior addressed: hard timeout and explicit cancellation terminal facts are no longer overwritten by late abort/rejection/completion/non-terminal dedupe writes; ordinary queue/gateway wait timeouts remain replaceable.

Real environment tested: local OpenClaw checkout plus GitHub Actions PR CI.

Exact steps or command run after this patch:

pnpm test src/agents/agent-run-terminal-outcome.test.ts src/gateway/server-methods/agent-wait-dedupe.test.ts src/gateway/server-methods/server-methods.test.ts src/agents/run-wait.test.ts src/agents/embedded-agent-runner/run/attempt.cwd-split.test.ts src/infra/net/http-connect-tunnel.test.ts -- --reporter=verbose
node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core.tsbuildinfo
git diff --check
autoreview --mode branch --base origin/main
gh pr checks 88136 --watch --interval 20

Evidence after fix:

Focused Vitest: 4 shards passed, including agent terminal outcome, gateway wait/dedupe, run-wait, embedded cwd split, and infra CONNECT timeout coverage.
Core typecheck: passed.
Diff whitespace check: passed.
Autoreview: final branch review clean, no accepted/actionable findings.
GitHub Actions PR CI for SHA 4943989e14ad551a603e182a371a74f32a254870: passed.

Observed result after fix: terminal outcome precedence is centralized and covered by focused regression tests for hard timeout stickiness, explicit cancellation stickiness, earlier completion correction, non-terminal dedupe overwrite protection, pending retry-grace late errors, late softer timeout replacement attempts, and bounded CONNECT timeout timers.

What was not tested: no live provider/subagent E2E was run.

clawsweeper · 2026-05-29T21:05:44Z

Codex review: needs real behavior proof before merge. Reviewed May 29, 2026, 6:45 PM ET / 22:45 UTC.

Summary
The PR adds a shared agent run terminal outcome helper and rewires agent.wait plus Gateway lifecycle and dedupe projections around timeout and cancellation precedence.

PR surface: Source +290, Tests +395, Docs +1. Total +686 across 10 files.

Reproducibility: yes, for the reviewed defects: source inspection of PR head shows later timeout scheduling and sibling dedupe-key writes can still bypass the sticky terminal outcome merge. I did not run a live Gateway reproduction in this read-only review.

Review metrics: 1 noteworthy metric.

Terminal projection surfaces: 3 changed: agent.wait, Gateway lifecycle cache, Gateway dedupe. These session-state projections must agree before merge because callers observe terminal run state through all three paths.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🧂 unranked krab
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

[P2] Fix the pending-timeout and sibling-key dedupe precedence gaps with focused regression tests.
[P2] Add redacted live Gateway/subagent timeout or cancellation proof after the patch.
Refresh or rebase the dirty branch so mergeability can be evaluated.

Proof guidance:

[P1] Needs real behavior proof before merge: The PR body provides tests, typecheck, autoreview, and Testbox gate output, but it still needs redacted live Gateway/subagent timeout or cancellation logs, terminal output, or linked artifacts after the patch; redact IPs, keys, phone numbers, non-public endpoints, and other private details, then update the PR body to trigger re-review or ask a maintainer to comment @clawsweeper re-review.

Risk before merge

[P2] Merging the branch as-is can still lose a sticky hard-timeout or explicit cancellation terminal fact in pending timeout and sibling dedupe-key races.
[P2] The PR has no after-fix live Gateway/subagent timeout or cancellation proof; the evidence is test/typecheck/Testbox-gate output only.
[P1] The provided live PR context reports mergeable=false/dirty at head 30c18e8, so conflict or rebase repair is needed before landing review can finish.

Maintainer options:

Repair every terminal overwrite path (recommended)
Apply the terminal outcome merge before timeout scheduling and across agent/chat sibling dedupe keys, with focused regression coverage for both races before merge.
Pause behind the broader proven branch
If maintainers prefer the already proof-positive broader approach, pause or close this replacement and continue with Preserve agent hard timeout attribution #87902 after reconciling its size and risk.

Next step before merge

[P1] Human follow-up is needed because the branch has protected maintainer handling, dirty merge state, source-backed P1 repair findings, and a contributor-side real behavior proof gap that automation cannot supply.

Security
Cleared: No concrete security or supply-chain concern was found; the diff does not change dependencies, workflows, secrets handling, package resolution, or downloaded code execution.

Review findings

[P1] Preserve hard timeouts before scheduling later timeouts — src/gateway/server-methods/agent-job.ts:370-374
[P1] Merge sticky dedupe outcomes across sibling keys — src/gateway/server-methods/agent-wait-dedupe.ts:286-305

Review details

Best possible solution:

Repair terminal outcome merging before every pending timeout overwrite and across sibling dedupe keys, refresh the dirty branch, then add redacted live Gateway/subagent timeout or cancellation proof before maintainer merge.

Do we have a high-confidence way to reproduce the issue?

Yes for the reviewed defects: source inspection of PR head shows later timeout scheduling and sibling dedupe-key writes can still bypass the sticky terminal outcome merge. I did not run a live Gateway reproduction in this read-only review.

Is this the best way to solve the issue?

No: the shared helper is the right direction, but the current integration does not apply it before all overwrite paths. The safer solution is to use the helper at the scheduling and cross-key dedupe boundaries, then prove the real Gateway/subagent path.

Full review comments:

[P1] Preserve hard timeouts before scheduling later timeouts — src/gateway/server-methods/agent-job.ts:370-374
This guard only skips later error snapshots. A second lifecycle timeout without hard-timeout metadata can still go through scheduleTimeoutFinish/schedulePendingAgentRunTimeout, clear the pending provider timeout, and let a later result overwrite the hard-timeout cause. Apply the terminal merge before timeout scheduling in both shared and per-waiter pending paths.
Confidence: 0.89
[P1] Merge sticky dedupe outcomes across sibling keys — src/gateway/server-methods/agent-wait-dedupe.ts:286-305
setGatewayDedupeEntry only compares the entry under params.key, but reads later resolve both agent:<runId> and chat:<runId> and pick the fresher terminal snapshot. A hard timeout or cancel under one key can still be overridden by a later completion/error written to the sibling key, so check both sibling entries before accepting terminal overwrites.
Confidence: 0.88

Overall correctness: patch is incorrect
Overall confidence: 0.88

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 37ccec0dc7f1.

Label changes

Label justifications:

P1: The PR targets an urgent agent/Gateway timeout race that can misclassify terminal runs and block orchestration workflows.
merge-risk: 🚨 session-state: The diff changes how Gateway wait/dedupe state records terminal run outcomes, and the remaining gaps can stale or mis-associate agent run state.
rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🧂 unranked krab.
status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body provides tests, typecheck, autoreview, and Testbox gate output, but it still needs redacted live Gateway/subagent timeout or cancellation logs, terminal output, or linked artifacts after the patch; redact IPs, keys, phone numbers, non-public endpoints, and other private details, then update the PR body to trigger re-review or ask a maintainer to comment @clawsweeper re-review.

Evidence reviewed

PR surface:

Source +290, Tests +395, Docs +1. Total +686 across 10 files.

Area	Files	Added	Removed	Net
Source	5	340	50	+290
Tests	4	401	6	+395
Docs	1	1	0	+1
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	10	742	56	+686

What I checked:

Repository policy applied: Read the full root AGENTS.md plus scoped src/agents and src/gateway guidance; the review applied the policy requiring whole-surface review and real behavior proof for agent/Gateway state changes. (AGENTS.md:28, 37ccec0dc7f1)
Pending timeout overwrite gap remains: On PR head, the waiter-side preservation check only handles later error snapshots, so a later timeout snapshot can still clear and replace a pending hard timeout before the hard-timeout grace publishes. (src/gateway/server-methods/agent-job.ts:370, 30c18e8e323c)
Sibling dedupe overwrite gap remains: On PR head, setGatewayDedupeEntry compares only the existing entry under params.key, while readTerminalSnapshotFromGatewayDedupe later chooses between agent: and chat:; a fresher sibling write can still override a sticky hard timeout or cancel. (src/gateway/server-methods/agent-wait-dedupe.ts:286, 30c18e8e323c)
Current-main provenance: The current main lifecycle/dedupe wait surfaces involved in the PR were introduced in commit 040f14b, which also added the pending timeout grace and dedupe snapshot behavior this PR modifies. (src/gateway/server-methods/agent-job.ts:87, 040f14b641e6)
Related work context: The PR body supersedes Preserve agent hard timeout attribution #87902 and fixes Gateway should enforce runTimeoutSeconds and emit terminal child.timeout event #87444, but this branch is still open, unmerged, and currently reports mergeable=false/dirty in the provided live PR context. (30c18e8e323c)
Proof gap: The PR body lists focused Vitest, core typecheck, diff whitespace, autoreview, and a Testbox changed-gate attempt, but no redacted live Gateway/subagent timeout or cancellation run after the patch. (30c18e8e323c)

Likely related people:

steipete: Git blame shows the current main Gateway wait/dedupe timeout surfaces involved in this review were introduced by commit 040f14b, and this PR is a follow-up on that same behavior. (role: recent area contributor; confidence: high; commits: 040f14b641e6, 30c18e8e323c; files: src/gateway/server-methods/agent-job.ts, src/gateway/server-methods/agent-wait-dedupe.ts, src/gateway/server-methods/agent.ts)
joshavant: The PR body credits this person with finding the cross-surface timeout race class and supersedes their related open hard-timeout attribution PR. (role: related implementation author; confidence: medium; commits: b16bf36df45c; files: src/gateway/server-methods/agent-job.ts, src/gateway/server-methods/agent-wait-dedupe.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 509dc8b19d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-29T21:07:07Z

+      if (
+        kind === "error" &&
+        pendingTimeoutSnapshot &&
+        shouldPreserveTerminalSnapshot(pendingTimeoutSnapshot, snapshot)
+      ) {
+        return;


Preserve pending hard timeouts from late successes

When a provider/preflight hard-timeout lifecycle end is still in the 15s retry-grace window, this new preservation guard only applies to a following error event. If the same run then emits a non-aborted end without an intervening start (the late-completion race this change is trying to centralize), the handler below records and returns the ok snapshot immediately, clearing the pending timeout instead of using mergeAgentRunTerminalOutcome to keep the later hard timeout. This leaves the lifecycle agent.wait path still able to report success after a real hard timeout; apply the same pending-timeout precedence check before accepting successful terminal snapshots.


chatgpt-codex-connector

💡 Codex Review

Reviewed commit: 1bf42326c6

chatgpt-codex-connector · 2026-05-29T21:58:40Z

@@ -229,6 +273,10 @@ function ensureAgentRunListener() {
      schedulePendingAgentRunTimeout(snapshot);


Preserve pending hard timeouts from later timeout events

When a provider/preflight hard timeout is still in the 15s grace window, a second lifecycle end with aborted=true but without hard-timeout metadata takes this snapshot.status === "timeout" branch before the new preservation check below. schedulePendingAgentRunTimeout then clears the pending hard timeout and replaces it with an ordinary timeout, so a subsequent late success/error can overwrite the original provider timeout and agent.wait loses the sticky timeout attribution. Move the shouldPreserveTerminalSnapshot check ahead of timeout scheduling or apply it inside schedulePendingAgentRunTimeout; the per-waiter scheduling path has the same ordering risk.
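The ordering problem described above can be illustrated with a small tracker in which the preservation check runs before any scheduling or publishing, for every incoming terminal kind. This is a sketch under stated assumptions: `PendingTimeoutTracker`, `RunSnapshot`, and the `hardTimeout` flag are hypothetical, the grace window is modeled by an explicit `expireGrace` call instead of a timer, and an intervening run start is not modeled.

```typescript
// Hypothetical snapshot shape; the real type in agent-job.ts is not shown in this thread.
type RunSnapshot = {
  runId: string;
  status: "ok" | "error" | "timeout";
  hardTimeout?: boolean; // assumed marker for provider/preflight hard-timeout metadata
};

type HandleResult = "preserved" | "pending" | "published";

class PendingTimeoutTracker {
  private pending = new Map<string, RunSnapshot>();
  private published = new Map<string, RunSnapshot>();

  handleTerminal(incoming: RunSnapshot): HandleResult {
    const held = this.pending.get(incoming.runId);
    // Guard first: a pending hard timeout is not displaced by a softer
    // timeout, a late error, or a late success.
    if (held?.hardTimeout === true && incoming.hardTimeout !== true) {
      return "preserved";
    }
    if (incoming.status === "timeout") {
      // Timeouts wait out the retry-grace window before publishing.
      this.pending.set(incoming.runId, incoming);
      return "pending";
    }
    // A non-timeout result replaces any ordinary pending timeout.
    this.pending.delete(incoming.runId);
    this.published.set(incoming.runId, incoming);
    return "published";
  }

  // Stand-in for the grace timer firing: publish whatever is still pending.
  expireGrace(runId: string): RunSnapshot | undefined {
    const held = this.pending.get(runId);
    if (held !== undefined) {
      this.pending.delete(runId);
      this.published.set(runId, held);
    }
    return this.published.get(runId);
  }
}
```

Placing the guard at the top of the handler means the timeout branch and the success/error branch share one precedence decision, which is the design choice the review comment asks for.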


chatgpt-codex-connector

💡 Codex Review

Reviewed commit: 50732691ef

chatgpt-codex-connector · 2026-05-29T22:12:33Z

+  const existingOutcome = existingSnapshot
+    ? terminalOutcomeFromWaitSnapshot(existingSnapshot)
+    : undefined;
+  const incomingOutcome = incomingSnapshot
+    ? terminalOutcomeFromWaitSnapshot(incomingSnapshot)
+    : undefined;


Merge sticky dedupe outcomes across sibling keys

When a run has both agent:<runId> and chat:<runId> entries, this new sticky check only compares the existing entry under params.key, even though readTerminalSnapshotFromGatewayDedupe handles that collision by choosing the freshest sibling key. If a provider hard timeout is stored under agent: and a late completion/error is written under chat: (or vice versa), there is no existing sticky outcome for the incoming key, so the write is accepted and the fresher sibling snapshot wins in agent.wait; the hard timeout/cancel invariant this change adds is bypassed for key-collision runs. Compare or merge against both sibling dedupe keys for the run before accepting terminal overwrites.
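A cross-key check of the kind requested above could look like the following sketch. Only the `agent:<runId>` and `chat:<runId>` key formats and the "freshest sibling wins on read" behavior come from the review text; the entry shape, the `sticky` flag, and the function names `setDedupeEntry` and `readFreshest` are hypothetical stand-ins, not the actual `setGatewayDedupeEntry` or `readTerminalSnapshotFromGatewayDedupe` signatures.

```typescript
// Hypothetical entry shape for illustration only.
type DedupeEntry = {
  status: "ok" | "error" | "timeout";
  sticky: boolean; // assumed: true for hard timeouts and explicit cancellations
  ts: number;
};

type DedupeStore = Map<string, DedupeEntry>;

function siblingKeys(runId: string): string[] {
  return [`agent:${runId}`, `chat:${runId}`];
}

// Returns true when the write is accepted.
function setDedupeEntry(
  store: DedupeStore,
  prefix: "agent" | "chat",
  runId: string,
  incoming: DedupeEntry,
): boolean {
  // Check both sibling keys, not only the key being written.
  const stickyExists = siblingKeys(runId).some((key) => store.get(key)?.sticky === true);
  if (stickyExists && !incoming.sticky) {
    return false;
  }
  store.set(`${prefix}:${runId}`, incoming);
  return true;
}

// Mirrors the described read behavior: the freshest sibling entry wins.
function readFreshest(store: DedupeStore, runId: string): DedupeEntry | undefined {
  let best: DedupeEntry | undefined;
  for (const key of siblingKeys(runId)) {
    const entry = store.get(key);
    if (entry !== undefined && (best === undefined || entry.ts > best.ts)) {
      best = entry;
    }
  }
  return best;
}
```

Because the write path rejects a non-sticky entry whenever either sibling holds a sticky one, the freshest-wins read can no longer surface a late completion over a recorded hard timeout.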


This reverts commit 9b4a0c3.

@giodl73-repo

* fix(exec): bind node auto-review commands Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com> * fix(exec): honor node runtime policy for auto-review Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com> * fix(exec): harden auto-review prompt boundaries Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com> * fix(exec): align release validation surfaces Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com> * fix(exec): align release validation checks Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com> * test(e2e): repair release docker smoke fixtures Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com> * fix(exec): resolve auto approvals as runtime Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com> * ci: relax native OpenAI live proof timing Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com> * fix(exec): include mode in doctor policy warnings Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com> * test(release): repair live matrix expectations Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com> * fix(tts): centralize directive 
number parsing * fix(provider): bound Vydra and Comfy media downloads * fix(discord): validate error code integers * fix(discord): reject unsafe rate limit headers * ci(release): make plugin publish retries idempotent * perf(agent): lazy load embedded agent cli path * fix(whatsapp): validate inbound timestamps * refactor: share agent harness loader helpers * fix(agents): cap unsafe retry-after delays * perf(agent): defer session resolver for scoped gateway turns * fix(msteams): ignore unsafe retry-after delays * refactor: share store writer queue * fix(slack): reject unsafe inbound timestamps * fix(discord): reject unsafe retry-after delays * fix(qa-matrix): cap fault proxy bodies * fix(discord): bound delivery retry delays * refactor: share cron state parsing * Delete changelog directory * fix(zalouser): reject unsafe inbound timestamps * fix(cli): avoid underscored gateway test export * fix(scripts): cap clawtributor avatar probes * fix(telegram): centralize safe thread id parsing * fix(googlechat): drop invalid inbound timestamps * fix(doctor): label auth health by agent (openclaw#85924) Merged via squash. 
Prepared head SHA: 8c179fc Co-authored-by: giodl73-repo <235387111+giodl73-repo@users.noreply.github.com> Co-authored-by: giodl73-repo <235387111+giodl73-repo@users.noreply.github.com> Reviewed-by: @giodl73-repo * fix(qqbot): validate token expiry lifetimes * fix(openai): validate codex oauth token lifetimes * refactor: share node pairing surface helpers * fix(anthropic): validate oauth token lifetimes * fix(scripts): cap memory FD repro RPC bodies * fix(github-copilot): validate device code lifetimes * fix(msteams): validate oauth token lifetimes * refactor: share cli help argv scan * fix(github-copilot): validate oauth expiry values * fix(scripts): cap realtime smoke responses * fix(chutes): validate oauth token lifetimes * fix(auto-reply): reuse cli sessions for room events * fix(auto-reply): keep room event cli sessions transient * fix(agent-core): reject invalid session timestamps * fix(scripts): cap Claude usage response reads * refactor: centralize skills subsystem * refactor: move skill lifecycle code into skills subsystem * fix: bound skill index cache invalidation * fix: preserve skill snapshot freshness * fix: preserve preloaded skill snapshot entries * refactor: move session skill loader into skills subsystem * fix: preserve empty skill filter short circuit * fix: align empty default skill filter behavior * fix: align skills branch with upstream tar verbose test * fix: drop stale system prompt override imports * refactor: centralize skills runtime paths * refactor: remove stale agents skills barrel * refactor: use direct skills imports * refactor: organize skills subsystem layout * fix: lint centralized skills subsystem * refactor: split skills index follow-up * refactor: centralize skills subsystem * fix: unblock skills centralization checks * fix: route moved skills tests through unit-fast * refactor: centralize skills runtime tests * refactor: share web secret target selection * refactor: centralize safe expiry parsing * fix(exec): normalize unsafe 
timeout values * fix: persist Copilot SDK session bindings Persist GitHub Copilot SDK session ids in the plugin-state SQLite store so separate OpenClaw process turns can resume the same Copilot-side session when the compatibility fingerprint still matches. The fingerprint covers provider/model/cwd, resolved agent id, resolved Copilot home, and auth identity. Plugin-state lookup/register/delete failures are non-fatal, stale rows are invalidated, and reset delete failures use an in-process tombstone so reset does not accidentally reuse a durable binding. Also routes the QQBot token POST through the plugin SDK SSRF guard with capture disabled for the secret-bearing request, preserving the current token lifetime validation from main. Verification: focused Copilot and QQBot Vitest suites, raw channel fetch guard, autoreview clean, Blacksmith Testbox pnpm check:changed tbx_01kst9fwjmsfzwaxqatszcbf40, live local Copilot two-turn smoke with the same SDK session id persisted in SQLite. Refs openclaw#88064 * fix(exec): cap node run timeouts * perf(agent): skip plugin validation for gateway dispatch * fix(scripts): cap firecrawl compare HTML reads * fix(xai): normalize unsafe oauth lifetimes * refactor: share e2e text file helpers * fix(google): normalize unsafe oauth expiry * fix(openai): normalize codex device lifetimes * refactor: reuse e2e text tail helper * test(xai): type device-code note mock * fix(minimax): reject unsafe oauth expiry * fix(ci): cap dependency guard error bodies * fix(google-meet): normalize oauth expiry * fix(command): stabilize claude-cli transcript resume (openclaw#81048) Fix claude-cli transcript resume so session-id rotation and transcript flush timing do not drop valid resume state. - Capture the latest claude-cli session_id from JSONL output. - Resolve Claude project transcript paths through the shared canonical project-dir resolver. - Probe transcript content from the actual CLI process cwd. - Thanks @benjamin1492! 
* refactor: share codex e2e install helpers * fix(feishu): bound streaming token expiry * fix(openshell): cap command timeout config * refactor: centralize timer-safe timeout bounds * refactor: share e2e websocket open helper * fix(minimax): guard oauth token fetches (openclaw#88088) * fix(feishu): normalize app registration poll timers * fix(google): reject unsafe vertex adc lifetimes * fix(scripts): cap npm packument reads * fix(auth): reject unsafe wham reset windows * refactor: share qa report arg parsing * fix(retry): cap unsafe retry delays * fix(sandbox): bound novnc observer token ttl * feat(workboard): add agent coordination tools Summary: - Add Workboard agent coordination tools for list/read/claim/heartbeat/release/comment/proof/unblock flows. - Store artifacts, claims, diagnostics, and notifications in the Workboard SQLite-backed plugin state; surface the new metadata through Gateway, Control UI, docs, and plugin manifest contracts. - Add scoped claim authorization, token redaction, stale diagnostic cleanup, atomic proof artifact writes, and generated i18n metadata. 
Verification: - pnpm test ui/src/i18n/test/translate.test.ts extensions/browser/src/cli/browser-cli-actions-input/register.element.test.ts extensions/workboard/src/store.test.ts extensions/workboard/src/gateway.test.ts extensions/workboard/src/tools.test.ts ui/src/ui/controllers/workboard.test.ts ui/src/ui/views/workboard.test.ts - pnpm ui:i18n:check - env -u OPENCLAW_TESTBOX pnpm check:changed - autoreview --mode local: clean - PR CI passed; Windows checkout failure rerun passed on attempt 2 * perf(gateway): reuse session maintenance config during turns * fix(node-host): cap timeout wrapper delays * fix(talk): cap fast context timeout delay * fix(e2e): harden kitchen sink probe body caps * refactor: share bounded response reader * fix(providers): cap model request timeout delays * fix(oauth): cap request abort timeout delays * test: speed up slow assertions * test: stabilize slow assertion timings * test: shard channel import guardrails * perf(sessions): patch single-entry store writes * refactor: share script bounded response helper * fix(codex): cap responses request timeout delays * fix(scripts): cap gh-read json bodies * fix(lmstudio): cap model fetch timeout delays * feat(ios): default to hosted push relay (openclaw#88096) Merged via squash. 
Prepared head SHA: 75f939a Co-authored-by: ngutman <1540134+ngutman@users.noreply.github.com> Co-authored-by: ngutman <1540134+ngutman@users.noreply.github.com> Reviewed-by: @ngutman * fix(minimax): cap tts timeout delays * build(plugins): externalize copilot runtime * refactor: share codex app server start context * test(file-transfer): remove stale tar fixture awaits * fix(runtime): centralize safe timer timeout resolution * refactor: share ui chat send wrapper * docs(plugins): clarify external plugin installs * fix: close native hook relay replacement race * fix(qa-lab): cap credential broker request timeouts * refactor: share e2e incremental line reader * test(ci): fix main test expectations (openclaw#88122) * fix(copilot): cap oauth request timeouts * fix(oauth): cap tls preflight timeout * build(plugins): externalize tokenjuice * docs(plugins): add external package readmes * perf: reuse gateway session and plugin metadata paths * fix(exec): bind node auto-review to prepared plans * fix(auth): cap GitHub Copilot OAuth timeouts * docs(skills): expand Discrawl archive workflow * fix(discord): cap request timeout signals * fix(agents): preserve rotated compaction session identity Fix `sessions.json` persistence after compaction transcript rotation. When the agent runtime rotates from the pre-compaction session transcript to the post-compaction transcript, post-run consumers now receive the effective OpenClaw session id and session file. Backend CLI session ids remain backend metadata and no longer overwrite the top-level OpenClaw session identity. Refs openclaw#88040. Thanks @1052326311. 
Verification: - `node scripts/run-vitest.mjs src/agents/agent-command.compaction-rotation.test.ts src/agents/agent-command.live-model-switch.test.ts src/agents/command/session-store.test.ts` - Autoreview clean - GitHub CI green on PR head `c3d3c77ddf675bbba0b9ba6681b030a2f69a898c` * fix: keep compaction timeout snapshots continuable * feat(ios): add talk tab realtime playback (openclaw#88105) Merged via squash. Prepared head SHA: f41112a Co-authored-by: ngutman <1540134+ngutman@users.noreply.github.com> Co-authored-by: ngutman <1540134+ngutman@users.noreply.github.com> Reviewed-by: @ngutman * fix(signal): cap container timeout timers * fix(agents): forward ACP spawn attachments Forward initial image/file attachments when spawning ACP subagents through the existing sessions_spawn attachment opt-in. Remove the PR-only acpEnabled config split so ACP uses the same attachment gate as other runtimes. Also fix the PR branch CI fallout: type the browser element CLI request mock and use Vitest env stubs in the Azure speech test to satisfy the changed-path security scan. Verification: - GitHub CI passed on f6ca26b. - Autoreview clean. - Crabbox AWS live OpenAI proof passed: cbx_a576d49493fe / run_081dcc6c6a1b. Thanks @zhangguiping-xydt. * refactor: share e2e bounded response reader * docs(browser): add Notte cloud browser to direct WebSocket CDP providers Notte exposes a CDP-compatible WebSocket gateway at wss://us-prod.notte.cc/sessions/connect?token=<NOTTE_API_KEY> that auto-creates a session on connect — the same shape OpenClaw's existing "Direct WebSocket CDP providers" section was generically framed for (per openclaw#31085). Real behaviour proof (against wss://us-prod.notte.cc/sessions/connect): $ openclaw browser --browser-profile notte open https://example.com opened: https://example.com/ tab: t4 id: 7FE04AC44931A6E1C799DE4ABF0DC807 A screenshot captured against the same session is a 1254x1111 PNG of the rendered example.com page. 
Playwright connectOverCDP flow against the same URL (today): connectOverCDP 695ms context.newCDPSession(page) 169ms session.send('Target.getTargetInfo') → targetId 87ms page.goto('https://example.com') 631ms total 1.8s AI-assisted (Claude Opus 4.7). codex review --base origin/main returned clean. See PR description for the full pre-flight checklist. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> * fix(zalo): cap api request timeouts * fix: stabilize codex supervisor session listing * fix(qa-matrix): cap substrate request timeouts * fix(xiaomi): cap tts request timeouts * refactor: share e2e mock http helpers * docs(skills): require grouped release changelogs * fix(zai): cap endpoint probe timeouts * fix(mattermost): cap dm retry timeouts * perf: reuse provider handles and strict tool schemas * feat: add core session goals (openclaw#87469) * feat: add core session goals * feat: polish session goals in tui * fix: resolve goal tool session stores * fix: keep get goal read-only * fix: migrate legacy goal session slots * fix: persist goal token accounting * fix: validate goal session rows * refactor: remove unshipped goal legacy handling * fix: handle goal commands in local tui * fix: satisfy goal tool display checks * fix: reset goal budget on overdue resume * feat: surface session goals across control surfaces * test: update gateway protocol test import * test: align goal fixture types with protocol * fix: scope selected global transcript usage fallback * fix: scope selected global web subscriptions * fix: preserve selected global agent during chat dispatch * fix: scope chat inject to selected global agents * test: fix timeout mock return types * fix(crestodian): cap probe timeouts * fix: keep live OpenClaw session locks during cleanup (openclaw#88129) Keep session lock cleanup from removing live OpenClaw-owned locks solely because they are old. 
Cleanup now reports age-only stale locks without deleting them, while still removing dead, orphaned, recycled, malformed-old, and non-OpenClaw-owned locks. Update doctor docs and regression coverage for the cleanup/repair contract. Refs openclaw#87779 * fix(agents): cap model scan timeouts * refactor: share script budget number parsing * fix(provider): cap operation timeouts * fix(usage): cap provider usage fetch timeouts * fix: bound default heartbeat run timeout (openclaw#88133) Fixes openclaw#87438. Bound unset heartbeat run timeouts so background heartbeat turns no longer inherit the built-in 48-hour interactive agent default. Timeout precedence is explicit heartbeat timeout, explicit global agent timeout, then heartbeat cadence capped at 600 seconds. Verification: - git diff --check - Testbox tbx_01kstna69zvznn4fq7zrqr04a1: corepack pnpm test src/infra/heartbeat-runner.model-override.test.ts -- --reporter=verbose passed 13 tests - Direct node --import tsx runtime probe verified 300s, 600s, 60s, and 45s timeout precedence cases - Autoreview clean Known CI state: - PR CI run 26661465248 has failures matching latest main CI run 26661386468 at a7820b2; failures are outside this six-file heartbeat/docs diff. * fix(signal): cap client request timeouts * fix(feishu): cap async helper timeouts * refactor: share script bounded response reader * fix: move compaction planning off the event loop Move compaction planning work to a bounded worker-thread path so large transcript planning no longer monopolizes the agent event loop. Extract pure planning helpers, sanitize worker inputs before structured clone, package the worker entrypoint, and keep synchronous fallback only for worker-unavailable cases. Fixes openclaw#86358. 
* fix(browser): cap control fetch timeouts * fix(ci): repair main checks * fix(browser): cap node runtime timeouts * fix(codex-supervisor): centralize session limit parsing * fix(discord): cap monitor helper timeouts * perf: reuse gateway runtime metadata * fix(acp): cap turn timeout timers * refactor: share media temp save wrapper * fix(tts): cap speech provider timeouts * fix(media): cap generation provider timeouts * fix ci mainline checks (openclaw#88137) * fix(infra): cap request body timeouts * ci: stabilize main checks * feat: add skills index * perf: avoid unnecessary skills index maps * refactor: share skill command exposure policy * perf: centralize skill status lookup * refactor: reuse shared skills prompt formatter * perf: reuse resolved skills allowlist * perf: speed up skills filtering * perf: prepare bundled skill allowlist once * perf: use set for bundled skill allowlist * test: preserve real skills status exports * test: share skills entry fixtures * test: remove duplicate skill fixture wrappers * test: complete skills status mock surface * fix(gateway-client): cap stop wait timeout * perf: prefer package-local bundled plugin artifacts * fix(openai): cap codex oauth preflight timeout * fix(supervisor): narrow stored session limit parsing * refactor: share diagnostics timeline span helpers * fix(ci): repair main checks * fix(ci): break skills loading cycle * test: fix main CI regressions * fix(apns): cap relay timeout * fix(infra): cap jsonl socket timeouts * fix(infra): cap shell env timeouts * test: stabilize remaining CI flakes * fix(apns): cap direct timeout paths * Add plugin manifest contract for SecretRef provider integrations (openclaw#82326) * secret-provider-integrations Signed-off-by: sallyom <somalley@redhat.com> * feat(secrets): configure plugin provider presets * secrets: use plugin-managed provider refs Signed-off-by: sallyom <somalley@redhat.com> * fix secretref auth profile service env * test secret provider integration e2e * fix 
secretref plugin config service env * fix secret provider preset schema alignment * stabilize secret provider service proof * validate secret provider plugin integrations * harden secret provider resolver paths * scope secret provider config validation * stabilize openai secret provider proof * fix secret provider metadata proof * stabilize config baseline proof * fix secret provider e2e lint --------- Signed-off-by: sallyom <somalley@redhat.com> Co-authored-by: joshavant <830519+joshavant@users.noreply.github.com> * fix(proxy): cap connect tunnel timeouts * fix: route media completions through requester agent (openclaw#88141) * fix(scripts): cap issue labeler response bodies * refactor: share media understanding post params * fix(infra): cap transport readiness timeouts * ci: reduce main workflow critical path * test(gateway): stabilize live helper shard * refactor: share native approval route gates Share native approval route gate helpers across mainstream channel approval runtimes and keep PR openclaw#87770 green on current main. * fix(channels): centralize stall watchdog timer bounds * perf: resolve native esm plugin sdk imports * test: stabilize infra state shard * fix(nostr): cap profile import relay timers * test(infra): stabilize main CI tests * test(infra): preserve script wrapper fixture * fix(web): cap guarded fetch timeout seconds * fix(zalouser): cap probe timeout timer * refactor: add shared sqlite state database Adds the shared SQLite state database base, moves plugin keyed state into it with doctor migration coverage, and keeps generated Kysely guardrails aligned. Proof: focused SQLite/plugin-state tests, db:kysely:check, lint:kysely, architecture/dependency guards, autoreview, and PR CI all clean. * fix(codex): recover app-server completion stalls Fix Codex app-server completion-stall recovery so replay-safe stdio completion-idle failures retry once, while progress/terminal turn-watch timeouts only surface timeout payloads. 
Also preserve post-tool completion guards for scoped native response deltas and stabilize the oversized CONNECT timeout regression test picked up from latest main. Co-authored-by: Kelaw - Keshav's Agent <keshavbotagent@gmail.com> * fix(ci): repair main normalization checks * fix(zalouser): cap qr login timeouts * fix(dev): cap Discord smoke response bodies * fix(agents): centralize terminal run outcome precedence (openclaw#88136) * fix(agents): centralize terminal run outcome precedence * docs(agents): explain terminal outcome precedence * docs(agents): note terminal outcome helper * fix(agents): preserve pending hard timeout over late completion * test(agents): align global session scoping expectation * Revert "test(agents): align global session scoping expectation" This reverts commit 9b4a0c3. * test(infra): stabilize CONNECT timeout cap test * fix(agents): prioritize hard timeout terminal evidence * fix(gateway): preserve pending hard timeout snapshots * ci: skip bundled dts in artifact build * fix(memory): cap qmd process timeouts * fix(ci): repair main lint gates * test(infra): avoid max fake-timer jumps (openclaw#88155) * fix(whatsapp): cap credential flush timeout * ci: satisfy build profile lint * refactor: share live transport scenario helpers * fix(telegram): cap polling lease wait timer * fix(release): avoid gh api for candidate reads * fix(release): harden candidate run status polling * fix(feishu): reopen retryable bot menu replay * fix(release): avoid gh api in beta smoke * fix(release): build beta smoke REST curl command * test(realtime): stabilize websocket timeout test * test: stabilize realtime websocket timeout * fix(telegram): centralize positive timer bounds * fix(providers): cap local service timers * refactor: share provider oauth runtime helpers * fix(openrouter): cap music stream timeout * fix(release): harden release ci summary lookup * fix(fal): cap video queue deadline * test(ci): stabilize tool search gateway timeout helper * 
fix(reply): hide ACP tool traces from Telegram Telegram's surface renders tool-call traces poorly compared to Discord's. Add a per-channel visibility isolation list (currently just `telegram`) so the dispatch-acp delivery coordinator drops tool/status payloads to those channels and rewrites error payloads to a sanitized message that points to local logs instead of leaking the trace. - New ACP_VISIBILITY_ISOLATED_CHANNELS set + helper prepareAcpPayloadForChannelVisibility - Coordinator picks the effective target channel (originating or direct) and skips delivery when the payload is a tool / status / error trace - 89 lines of test coverage in dispatch-acp.test.ts for the new path --------- Signed-off-by: sallyom <somalley@redhat.com> Co-authored-by: joshavant <830519+joshavant@users.noreply.github.com> Co-authored-by: Vincent Koc <25068+vincentkoc@users.noreply.github.com> Co-authored-by: jesse-merhi <79823012+jesse-merhi@users.noreply.github.com> Co-authored-by: Peter Steinberger <steipete@gmail.com> Co-authored-by: Vincent Koc <vincentkoc@ieee.org> Co-authored-by: Shadow <shadow@openclaw.ai> Co-authored-by: Gio Della-Libera <giodl73@gmail.com> Co-authored-by: giodl73-repo <235387111+giodl73-repo@users.noreply.github.com> Co-authored-by: Ayaan Zaidi <hi@obviy.us> Co-authored-by: Shakker <shakkerdroid@gmail.com> Co-authored-by: Peter Steinberger <peter@steipete.me> Co-authored-by: benjamin1492 <35176637+benjamin1492@users.noreply.github.com> Co-authored-by: Nimrod Gutman <nimrod.gutman@gmail.com> Co-authored-by: ngutman <1540134+ngutman@users.noreply.github.com> Co-authored-by: Dallin Romney <dallinromney@gmail.com> Co-authored-by: xin zhuang <65798732+1052326311@users.noreply.github.com> Co-authored-by: zhang-guiping <zhang.guiping@xydigit.com> Co-authored-by: Lucas Giordano <giordano3102lucas@gmail.com> Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: Sally O'Malley <somalley@redhat.com> Co-authored-by: Kevin Lin <kevin@dendron.so> 
Co-authored-by: keshavbotagent <keshavbotagent@gmail.com>
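The heartbeat entry in the log above (openclaw#88133) states a precedence order: explicit heartbeat timeout, then explicit global agent timeout, then the heartbeat cadence capped at 600 seconds. A minimal sketch of that order follows. The function and field names are hypothetical and not taken from the repository; only the order and the 600-second cap come from the commit message.

```typescript
// Illustrative sketch only: names are assumptions, not the project's API.
const HEARTBEAT_TIMEOUT_CAP_SECONDS = 600; // cap stated in the commit message

interface HeartbeatTimeoutInputs {
  heartbeatTimeoutSeconds?: number; // explicit heartbeat timeout, if configured
  agentTimeoutSeconds?: number; // explicit global agent timeout, if configured
  cadenceSeconds: number; // heartbeat interval
}

function resolveHeartbeatTimeoutSeconds(inputs: HeartbeatTimeoutInputs): number {
  // 1. An explicit heartbeat timeout always wins.
  if (inputs.heartbeatTimeoutSeconds !== undefined) {
    return inputs.heartbeatTimeoutSeconds;
  }
  // 2. Otherwise fall back to an explicit global agent timeout.
  if (inputs.agentTimeoutSeconds !== undefined) {
    return inputs.agentTimeoutSeconds;
  }
  // 3. Otherwise use the cadence, capped so an unset timeout stays bounded.
  return Math.min(inputs.cadenceSeconds, HEARTBEAT_TIMEOUT_CAP_SECONDS);
}
```

Under these assumptions, a 30-minute cadence with no explicit timeouts resolves to 600 seconds rather than inheriting a much longer interactive default.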

* fix(agents): centralize terminal run outcome precedence
* docs(agents): explain terminal outcome precedence
* docs(agents): note terminal outcome helper
* fix(agents): preserve pending hard timeout over late completion
* test(agents): align global session scoping expectation
* Revert "test(agents): align global session scoping expectation" (This reverts commit 9b4a0c3.)
* test(infra): stabilize CONNECT timeout cap test
* fix(agents): prioritize hard timeout terminal evidence
* fix(gateway): preserve pending hard timeout snapshots
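The precedence rule these commits implement, as the PR summary describes it, is that hard timeouts from the provider, preflight, or post-turn phases and explicit rpc/stop cancellations are sticky, while ordinary wait timeouts stay replaceable. The sketch below shows that rule under stated assumptions. Every type, field, and function name is hypothetical and chosen for illustration; it is not the helper added by the PR.

```typescript
// Illustrative sketch only: all names here are assumptions, not the PR's code.
type TerminalKind =
  | "completed"
  | "hard_timeout"
  | "timed_out"
  | "cancelled"
  | "aborted"
  | "blocked"
  | "failed";

interface TerminalOutcome {
  kind: TerminalKind;
  // Phase that produced a timeout, when known.
  timeoutPhase?: "provider" | "preflight" | "post-turn" | "queue" | "gateway-draining";
  // Source of an explicit cancellation, when known.
  stopSource?: "rpc" | "stop";
}

// Sticky outcomes must not be overwritten by later terminal writes.
function isSticky(outcome: TerminalOutcome): boolean {
  if (outcome.kind === "hard_timeout") {
    return (
      outcome.timeoutPhase === "provider" ||
      outcome.timeoutPhase === "preflight" ||
      outcome.timeoutPhase === "post-turn"
    );
  }
  if (outcome.kind === "cancelled") {
    return outcome.stopSource === "rpc" || outcome.stopSource === "stop";
  }
  return false;
}

// Decide which terminal outcome to keep when a later write arrives.
function mergeTerminalOutcome(
  current: TerminalOutcome | undefined,
  incoming: TerminalOutcome,
): TerminalOutcome {
  if (current !== undefined && isSticky(current)) {
    return current; // late completion or error loses to the sticky fact
  }
  return incoming; // ordinary timeouts and other outcomes stay replaceable
}
```

Keeping the decision in one function is what lets several observers (lifecycle snapshots, dedupe snapshots, wait normalization) agree on the result instead of each re-deriving timeout attribution.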

openclaw-barnacle Bot added gateway Gateway runtime agents Agent runtime and tooling size: L maintainer Maintainer-authored PR labels May 29, 2026

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

steipete force-pushed the steipete/canonical-agent-terminal-timeouts branch from 8607381 to 1bf4232 Compare May 29, 2026 21:54

openclaw-barnacle Bot added the extensions: codex-supervisor Extension: codex-supervisor label May 29, 2026

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

steipete force-pushed the steipete/canonical-agent-terminal-timeouts branch from 1bf4232 to 0fc084d Compare May 29, 2026 22:04

openclaw-barnacle Bot added scripts Repository scripts and removed extensions: codex-supervisor Extension: codex-supervisor labels May 29, 2026

steipete force-pushed the steipete/canonical-agent-terminal-timeouts branch from 0fc084d to 5073269 Compare May 29, 2026 22:07

openclaw-barnacle Bot removed the scripts Repository scripts label May 29, 2026

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

steipete force-pushed the steipete/canonical-agent-terminal-timeouts branch from 5073269 to 264fdf9 Compare May 29, 2026 22:14

clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. labels May 29, 2026

steipete force-pushed the steipete/canonical-agent-terminal-timeouts branch from 4381333 to 30c18e8 Compare May 29, 2026 22:39

steipete added 3 commits May 29, 2026 23:49

fix(agents): centralize terminal run outcome precedence

e7c270e

docs(agents): explain terminal outcome precedence

370d81d

docs(agents): note terminal outcome helper

599e736

steipete added 6 commits May 29, 2026 23:49

fix(agents): preserve pending hard timeout over late completion

efdf330

test(agents): align global session scoping expectation

cecb4d4

Revert "test(agents): align global session scoping expectation"

d84900c

This reverts commit 9b4a0c3.

test(infra): stabilize CONNECT timeout cap test

f3af76f

fix(agents): prioritize hard timeout terminal evidence

b94cf25

fix(gateway): preserve pending hard timeout snapshots

4943989

steipete force-pushed the steipete/canonical-agent-terminal-timeouts branch from 30c18e8 to 4943989 Compare May 29, 2026 22:49

steipete merged commit b1e5c9d into main May 29, 2026
109 checks passed

steipete deleted the steipete/canonical-agent-terminal-timeouts branch May 29, 2026 22:56

github-actions Bot mentioned this pull request May 29, 2026

📡 Upstream Digest — 2026-05-29 23:06 UTC curtismercier/openclaw-mods#974

Open

fix(agents): centralize terminal run outcome precedence #88136
steipete merged 9 commits into main from steipete/canonical-agent-terminal-timeouts

		@@ -229,6 +273,10 @@ function ensureAgentRunListener() {
		schedulePendingAgentRunTimeout(snapshot);
