Fix stuck-session recovery aborts for active reply runs by SnowSky1 · Pull Request #88925 · openclaw/openclaw

SnowSky1 · 2026-06-01T05:09:24Z

Summary

preserve stuck_recovery through reply-run session-id fallback and embedded-run cancel plumbing so recovery no longer degrades into a fake user abort
require stale tracked progress before active embedded/reply work is abort-drained, even when active-abort recovery is enabled
add regressions for fresh terminal progress, fresh active reply work, and streamed model progress so long-but-active runs stay alive

Testing

node scripts/run-vitest.mjs run --config test/vitest/vitest.auto-reply-reply.config.ts src/auto-reply/reply/reply-run-registry.test.ts
node scripts/run-vitest.mjs run --config test/vitest/vitest.logging.config.ts src/logging/diagnostic.test.ts src/logging/diagnostic-stuck-session-recovery.integration.test.ts
node scripts/run-vitest.mjs run --config test/vitest/vitest.agents-embedded-agent.config.ts src/agents/embedded-agent-runner/run/attempt.model-diagnostic-events.test.ts

Real behavior proof

Behavior or issue addressed: Verify that this patch preserves stuck_recovery through the reply-run session-id fallback and does not abort a live embedded run that still has fresh tracked progress.
Real environment tested: Local OpenClaw checkout on Windows 11 using the real modules from this PR branch via a Node/tsx runtime harness after commit 159943a6c0.
Exact steps or command run after this patch: Ran pnpm exec tsx - against the real repo modules in this branch with a runtime harness that (1) creates a reply operation and aborts it through abortReplyRunBySessionId(..., { reason: "stuck_recovery" }), then (2) registers an active embedded run, records fresh progress with markDiagnosticRunProgressForTest(...), and calls recoverStuckDiagnosticSession(...) with allowActiveAbort: true.
Evidence after fix: Terminal output from the local OpenClaw runtime harness after the fix:
[reply-registry]
{"aborted":true,"result":{"kind":"aborted","code":"aborted_by_system","reason":"stuck_recovery"},"abortSignalReason":"Reply operation aborted by system reason=stuck_recovery"}
[stuck-recovery]
{"snapshotBefore":{"activeWorkKind":"embedded_run","hasActiveEmbeddedRun":true,"lastProgressAgeMs":1,"lastProgressReason":"model chunk observed from runtime harness"},"outcome":{"status":"skipped","action":"observe_only","reason":"active_embedded_run","sessionId":"session-live-progress","sessionKey":"agent:main:main","activeSessionId":"session-live-progress","activeWorkKind":"embedded_run"},"abortCount":0,"cancelReasons":[],"snapshotAfter":{"activeWorkKind":"embedded_run","hasActiveEmbeddedRun":true,"lastProgressAgeMs":84,"lastProgressReason":"model chunk observed from runtime harness"}}
[diagnostic] stuck session recovery skipped: sessionId=session-live-progress sessionKey=agent:main:main age=180s queueDepth=1 activeSessionId=session-live-progress
[diagnostic] stuck session recovery outcome: status=skipped action=observe_only sessionId=session-live-progress sessionKey=agent:main:main activeSessionId=session-live-progress activeWorkKind=embedded_run reason=active_embedded_run
Observed result after fix: The reply-run fallback keeps reason: "stuck_recovery" instead of degrading into a user-abort shape. The live-progress recovery path returns status: "skipped", leaves the embedded run active, and does not call abort or cancel on the active handle.
What was not tested: I did not rerun a multi-minute end-to-end Feishu + claude-cli channel repro in this environment, and I did not capture UI/video proof from a live chat surface.

Notes

diagnostics.stuckSessionWarnMs / diagnostics.stuckSessionAbortMs and the default warnMs x 3 / minimum 5 minute behavior are already documented in the current docs/config help, so this PR focuses on the runtime regressions from #88870.

Closes #88870

clawsweeper · 2026-06-01T05:12:00Z

Codex review: needs real behavior proof before merge. Reviewed June 11, 2026, 12:57 AM ET / 04:57 UTC.

Summary
Review failed before ClawSweeper could summarize the requested change.

PR surface: Source +129, Tests +153. Total +282 across 10 files.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Review metrics: none identified.

Merge readiness
Overall: 🌊 off-meta tidepool
Proof: 🌊 off-meta tidepool
Patch quality: 🌊 off-meta tidepool
Result: rating does not apply to this item.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

[P1] No close action taken because the review did not complete.

Maintainer options:

Decide the mitigation before merge
Retry the Codex review after fixing the execution failure.
Pause or close
Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

[P1] Review did not complete, so no work-lane recommendation was made.

Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

AGENTS.md: unclear because the file could not be read completely.

Codex review notes: model internal, reasoning high; reviewed against 418d7e1e83c5.

Label changes

Label changes:

add rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.
remove P2: Current review triage priority is none.
remove rating: 🐚 platinum hermit: Current PR rating is rating: 🌊 off-meta tidepool, so this older rating label is no longer current.
remove merge-risk: 🚨 session-state: Current PR review selected no merge-risk labels.
remove merge-risk: 🚨 message-delivery: Current PR review selected no merge-risk labels.
remove status: 👀 ready for maintainer look: Current PR status no longer selects a status label.

Label justifications:

rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.

Evidence reviewed

PR surface:

Source +129, Tests +153. Total +282 across 10 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	6	167	38	+129
Tests	4	154	1	+153
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	10	321	39	+282

What I checked:

failure reason: codex execution failed.
codex failure detail: Codex review failed for this PR with exit 1.
codex stdout: Per-item Codex failure; continuing with the rest of the shard.

Likely related people:

unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Fix stuck recovery aborts for active reply runs

5db41a8

openclaw-barnacle Bot added agents Agent runtime and tooling size: M triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels Jun 1, 2026

clawsweeper Bot added the rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. label Jun 1, 2026

oiGaDio mentioned this pull request Jun 3, 2026

[Bug]: Stuck Session Recovery 机制双重失效 + Session 预处理每次耗时过长 #76038

Open

fix(reply): avoid abort helper param shadowing

159943a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix stuck-session recovery aborts for active reply runs#88925

Fix stuck-session recovery aborts for active reply runs#88925
SnowSky1 wants to merge 2 commits into
openclaw:mainfrom
SnowSky1:codex/88870-reply-run-stuck-recovery

SnowSky1 commented Jun 1, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented Jun 1, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

SnowSky1 commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Testing

Real behavior proof

Notes

Uh oh!

clawsweeper Bot commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

SnowSky1 commented Jun 1, 2026 •

edited

Loading

clawsweeper Bot commented Jun 1, 2026 •

edited

Loading