fix: release session lock before runtime teardown by fuller-stack-dev · Pull Request #87747 · openclaw/openclaw

fuller-stack-dev · 2026-05-28T18:16:17Z

Summary

Prevents embedded attempt cleanup from pinning a session transcript lock while unrelated session/plugin runtime teardown runs.

The incident this addresses was the Spartacus generated-image handoff failure on 2026-05-28:

16:58:29Z the active run hit sessions_yield abort settle timed out.
16:58:34Z generated media completion tried to wake the active requester and got source_reply_delivery_mode_mismatch.
16:59:11Z the lane reported an active session ahead, and a same-process .jsonl.lock with a long maxHoldMs was created.
17:00:12Z and 17:01:12Z requester handoff/direct delivery attempts timed out on SessionWriteLockTimeoutError against that live-process lock.

The generated media fallback PR handles delivery recovery when handoff is blocked. This PR focuses on preventing the session from staying locked in the first place.

What changed

Release the cleanup session lock immediately after guard removal / abort-settle wait / pending tool-result flush.
Run session.dispose(), MCP runtime disposal, and LSP runtime disposal after the lock is released, so a teardown hang cannot pin the transcript lock.
Preserve existing failure semantics by still disposing resources if sessionLock.release() throws, then rethrowing that release error.
Treat sessions_yield cleanup as abort-like for the cleanup flush path, so it skips the normal idle wait after the yield abort path has already bounded abort settlement.
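
The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the actual cleanupEmbeddedAttemptResources code: the CleanupDeps and DisposableResource shapes and both function names are hypothetical, and swallowing dispose errors is an assumption made to keep teardown best-effort.

```typescript
// Hypothetical shapes for illustration only; the real OpenClaw types are not shown in this thread.
interface DisposableResource {
  dispose(): Promise<void> | void;
}

interface CleanupDeps {
  flushPendingToolResults(): Promise<void>;
  sessionLock: { release(): Promise<void> };
  session: DisposableResource;
  runtimes: DisposableResource[]; // e.g. MCP and LSP runtimes
}

async function releaseThenDispose(deps: CleanupDeps): Promise<void> {
  let releaseFailed = false;
  let releaseError: unknown;

  // 1. Release the transcript lock first so other writers are unblocked.
  try {
    await deps.sessionLock.release();
  } catch (err) {
    releaseFailed = true;
    releaseError = err;
  }

  // 2. Tear down the session and runtimes outside the lock. If one of these
  //    hangs, this function never returns, but the lock is already free.
  for (const resource of [deps.session, ...deps.runtimes]) {
    try {
      await resource.dispose();
    } catch {
      // Assumption: teardown is best-effort, so dispose errors are swallowed here.
    }
  }

  // 3. Surface the release failure only after teardown has been attempted.
  if (releaseFailed) throw releaseError;
}

async function cleanupAttemptSketch(deps: CleanupDeps): Promise<void> {
  try {
    await deps.flushPendingToolResults();
  } finally {
    await releaseThenDispose(deps);
  }
}
```

The point of splitting release from teardown is that a hang in step 2 can only delay the caller of cleanup, not any other writer waiting on the session file.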

Real behavior proof

Behavior addressed: Embedded attempt cleanup no longer pins the session transcript write lock while unrelated session/MCP/LSP runtime teardown is still pending.

Real environment tested: Local OpenClaw source checkout at PR head 4bd7fe98ea3ba1166586c58da34a21e8c075037f on macOS with Node v24.16.0, using the production cleanupEmbeddedAttemptResources, createEmbeddedAttemptSessionLockController, and filesystem-backed acquireSessionWriteLock against a real temporary .jsonl session file.

Exact steps or command run after this patch: Ran node --import tsx --input-type=module with an inline proof script that created a real temporary session JSONL, acquired the embedded cleanup lock, started cleanup with bundleMcpRuntime.dispose() intentionally never resolving, waited until runtime teardown began, then acquired the same session file with acquireSessionWriteLock({ timeoutMs: 500 }) while teardown was still pending.

Evidence after fix: Terminal output from the after-fix proof run:

$ node --import tsx --input-type=module < inline-session-lock-proof
proof_sha=4bd7fe98ea3ba1166586c58da34a21e8c075037f
session_file=<tmp>/session.jsonl
events=flush -> cleanup_lock_release -> session_dispose -> runtime_dispose_started -> contender_lock_acquired -> contender_lock_released
contender_lock_acquired_ms=0
result=PASS cleanup released the real session write lock before runtime teardown finished; a second writer acquired the same session file lock while teardown was still pending.

Observed result after fix: The contender lock was acquired immediately after runtime_dispose_started (contender_lock_acquired_ms=0), proving cleanup had already released the real session write lock even though runtime teardown never completed.
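
For readers who want to see the shape of this proof without the repository, here is a rough in-memory analogue. It does not use acquireSessionWriteLock or any OpenClaw code: InMemorySessionLock, LockTimeoutError, and runContenderProof are hypothetical stand-ins, and the polling lock is only an assumption about how a timeout-bounded acquire might behave.

```typescript
// In-memory stand-in for a per-session write lock. The real lock in the PR is
// filesystem-backed; this class and its names are hypothetical.
class LockTimeoutError extends Error {
  constructor(timeoutMs: number) {
    super(`lock not acquired within ${timeoutMs}ms`);
    this.name = "LockTimeoutError";
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

class InMemorySessionLock {
  private held = false;

  // Polls until the lock is free or the timeout elapses; resolves to a release function.
  async acquire(timeoutMs: number): Promise<() => void> {
    const deadline = Date.now() + timeoutMs;
    while (this.held) {
      if (Date.now() >= deadline) throw new LockTimeoutError(timeoutMs);
      await sleep(5);
    }
    this.held = true;
    let released = false;
    return () => {
      if (released) return; // releasing twice is a no-op
      released = true;
      this.held = false;
    };
  }
}

async function runContenderProof(releaseBeforeTeardown: boolean): Promise<string> {
  const lock = new InMemorySessionLock();
  const releaseCleanupLock = await lock.acquire(50);
  const hangingTeardown = new Promise<void>(() => {}); // never resolves

  // Cleanup under test: runs synchronously up to its first await.
  void (async () => {
    if (releaseBeforeTeardown) releaseCleanupLock();
    await hangingTeardown;
    releaseCleanupLock(); // old ordering: never reached when teardown hangs
  })();

  try {
    const releaseContender = await lock.acquire(50);
    releaseContender();
    return "contender_lock_acquired";
  } catch (err) {
    if (err instanceof Error && err.name === "LockTimeoutError") return "contender_timed_out";
    throw err;
  }
}
```

With the new ordering, runContenderProof(true) resolves to "contender_lock_acquired" even though teardown never finishes; with the old ordering, runContenderProof(false) resolves to "contender_timed_out", which mirrors the SessionWriteLockTimeoutError seen in the incident.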

What was not tested: I did not replay the full original Spartacus generated-image handoff with a live model/provider. This proof isolates the lock-ordering failure with production cleanup and lock primitives; the focused regression tests below cover the helper and adjacent abort-settle/session-lock behavior.

Regression proof

The new regression test proves the lock is released before a runtime teardown that never resolves:

cleanupEmbeddedAttemptResources > releases the lock before runtime teardown can hang

Supplemental focused test run after the real proof:

node scripts/run-vitest.mjs src/agents/embedded-agent-runner/run/attempt.subscription-cleanup.test.ts src/agents/embedded-agent-runner/run/attempt.session-lock.test.ts src/agents/embedded-agent-runner/run/attempt.abort-settle-timeout.test.ts --reporter=verbose

Test Files  5 passed (5)
Tests       97 passed (97)

Contributor-provided typecheck and whitespace proof:

PATH=/Users/jason/.nvm/versions/node/v24.14.0/bin:$PATH pnpm tsgo:core
# passed

git diff --check
# passed

Notes

During git commit, the repo hook printed git-hooks/pre-commit: line 41: node: command not found because the hook did not inherit the Node 24 PATH. The commit still completed, and the explicit verification above used Node 24.

clawsweeper · 2026-05-28T18:18:16Z

Codex review: passed. Reviewed May 28, 2026, 5:39 PM ET / 21:39 UTC.

Summary
The PR reorders embedded attempt cleanup to release the session write lock before session/MCP/LSP teardown, treats sessions_yield cleanup as abort-like for flush timing, and adds focused regression tests.

PR surface: Source +14, Tests +71. Total +85 across 3 files.

Reproducibility: yes. Source inspection shows current main releases the cleanup lock only after runtime teardown, so a never-resolving runtime dispose can pin the lock; the PR body’s terminal proof exercises the same ordering with production cleanup and filesystem lock primitives.

Review metrics: none identified.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
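
As a rough illustration of the "weaker of the two" rule, the sketch below takes the lower of two ranks. The rank order is an assumption read off the legend later in this comment (unranked krab lowest, challenger crab highest, off-meta tidepool excluded), and overallRank is a hypothetical name, not ClawSweeper's implementation.

```typescript
// Assumed ordering, lowest to highest, taken from the rank legend in this review comment.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];

// Overall readiness is capped by whichever input ranks lower.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  const weakest = Math.min(RANKS.indexOf(proof), RANKS.indexOf(patchQuality));
  return RANKS[weakest];
}
```

Under this assumption, a "diamond lobster" patch with "gold shrimp" proof would be rated "gold shrimp" overall, while this PR's two "diamond lobster" inputs yield "diamond lobster".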

Next step before merge

[P2] No repair lane is needed; this automerge-opted PR appears correct and should be handled by normal mergeability and required-check gates.

Security
Cleared: The diff only reorders in-process cleanup and adds tests; it does not change dependencies, CI, package metadata, secrets, or external code execution paths.

Review details

Best possible solution:

Land this narrow lock-ordering fix once required checks and mergeability pass, while keeping the generated-media fallback as the separate recovery path.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows current main releases the cleanup lock only after runtime teardown, so a never-resolving runtime dispose can pin the lock; the PR body’s terminal proof exercises the same ordering with production cleanup and filesystem lock primitives.

Is this the best way to solve the issue?

Yes. Releasing the cleanup write lock before unrelated session/MCP/LSP teardown is the narrow maintainable fix, and the companion generated-media fallback keeps delivery recovery separate.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against b05aefa3cf1d.

Label changes

Label changes:

add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
add status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane. Sufficient (terminal): The PR body supplies after-fix terminal output from production cleanup and filesystem-backed session-lock primitives showing a second writer acquiring the lock while runtime teardown remains pending.
remove rating: 🐚 platinum hermit: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.
remove status: 👀 ready for maintainer look: Current PR status label is status: 🚀 automerge armed.

Label justifications:

P1: A pinned session transcript lock can block active agent cleanup and generated-media handoff delivery for real users.
rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane. Sufficient (terminal): The PR body supplies after-fix terminal output from production cleanup and filesystem-backed session-lock primitives showing a second writer acquiring the lock while runtime teardown remains pending.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body supplies after-fix terminal output from production cleanup and filesystem-backed session-lock primitives showing a second writer acquiring the lock while runtime teardown remains pending.

Evidence reviewed

PR surface:

Source +14, Tests +71. Total +85 across 3 files.

Area	Files	Added	Removed	Net
Source	2	32	18	+14
Tests	1	76	5	+71
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	3	108	23	+85

What I checked:

Repository policy read: Root and scoped AGENTS.md guidance for agents and embedded-runner tests was read and applied; the touched surface is core agents/session cleanup, so focused helper tests are appropriate for this lock-ordering behavior. (AGENTS.md:1)
Current main still holds the lock through teardown: On current main, cleanup disposes the session and MCP/LSP runtimes before the finally block releases sessionLock, so a hanging runtime dispose can keep the transcript write lock pinned. (src/agents/embedded-agent-runner/run/attempt.subscription-cleanup.ts:100, b05aefa3cf1d)
PR releases before teardown: At PR head, sessionLock.release runs in the cleanup finally block before session.dispose and bundle runtime disposal, and release errors are rethrown after best-effort teardown. (src/agents/embedded-agent-runner/run/attempt.subscription-cleanup.ts:101, 178192fa0ef9)
sessions_yield cleanup is covered: The PR records when prompt abort came from sessions_yield and passes that into cleanup as abort-like so the flush path skips the normal idle wait after yield abort settlement has already been bounded. (src/agents/embedded-agent-runner/run/attempt.ts:4767, 178192fa0ef9)
Regression tests cover the ordering: The new tests assert lock release before a never-resolving runtime dispose and assert resources are still disposed when lock release fails. (src/agents/embedded-agent-runner/run/attempt.subscription-cleanup.test.ts:99, 178192fa0ef9)
Real behavior proof supplied: The PR body includes after-fix terminal output from a local OpenClaw checkout using production cleanup helpers and filesystem-backed acquireSessionWriteLock, showing a contender acquired the same session file lock while runtime teardown remained pending. (178192fa0ef9)
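
The sessions_yield item in the list above can be pictured with a small sketch. This is only an assumption about the shape of the logic: AbortSource, FlushDeps, isAbortLikeCleanup, and flushForCleanup are hypothetical names, and the real flush path in attempt.ts is not reproduced here.

```typescript
// Hypothetical model of where a prompt abort came from.
type AbortSource = "none" | "user_abort" | "sessions_yield";

interface FlushDeps {
  waitForIdle(): Promise<void>;
  flushPendingToolResults(): Promise<void>;
}

// Assumption: any abort, including one triggered by sessions_yield, counts as abort-like.
function isAbortLikeCleanup(source: AbortSource): boolean {
  return source !== "none";
}

// Returns the steps taken, so the ordering is easy to inspect.
async function flushForCleanup(source: AbortSource, deps: FlushDeps): Promise<string[]> {
  const steps: string[] = [];
  if (!isAbortLikeCleanup(source)) {
    // Normal completion: wait for the session to go idle before flushing.
    await deps.waitForIdle();
    steps.push("idle_wait");
  }
  // Abort-like cleanup skips the idle wait because abort settlement was already bounded.
  await deps.flushPendingToolResults();
  steps.push("flush");
  return steps;
}
```

In this sketch a "sessions_yield" source produces only the "flush" step, whereas "none" produces "idle_wait" followed by "flush".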

Likely related people:

steipete: Blame and -S history point the current embedded cleanup, acquireForCleanup call, and session-lock helper surface to commit e85231d by Peter Steinberger. (role: introduced current cleanup/session-lock surface; confidence: high; commits: e85231d63d49; files: src/agents/embedded-agent-runner/run/attempt.subscription-cleanup.ts, src/agents/embedded-agent-runner/run/attempt.ts, src/agents/embedded-agent-runner/run/attempt.session-lock.ts)
ooiuuii: Recent current-main work in f49a3e4 modified the embedded attempt runner around prompt/tool-result cleanup behavior. (role: recent adjacent area contributor; confidence: medium; commits: f49a3e4c266c; files: src/agents/embedded-agent-runner/run/attempt.ts)
fuller-stack-dev: The same contributor authored the already-merged companion generated-media handoff-lock fallback that this PR references as the delivery recovery half of the incident. (role: recent adjacent fix author; confidence: medium; commits: 8a007c987d44, 4bd7fe98ea3b; files: src/agents/subagent-announce-delivery.ts, src/agents/embedded-agent-runner/run/attempt.subscription-cleanup.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

fuller-stack-dev · 2026-05-28T19:35:05Z

@clawsweeper re-review

clawsweeper · 2026-05-28T19:35:08Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26597626964
Updated: 2026-05-28T19:42:26.633Z

Takhoffman · 2026-05-28T21:33:32Z

@clawsweeper automerge

clawsweeper · 2026-05-28T21:33:35Z

🦞✅
ClawSweeper merged this PR after the passing review.

Source: clawsweeper[bot]
Feedback: structured ClawSweeper verdict: pass (sha=178192fa0ef99c891f620da75d96debef5a6d873)
Merge status: merged by ClawSweeper automerge
Merged at: 2026-05-28T21:40:06Z
Merge commit: 0dbdaf98ea74

What merged:

The PR reorders embedded attempt cleanup to release the session write lock before session/MCP/LSP teardown, treats sessions_yield cleanup as abort-like for flush timing, and adds focused regression tests.
PR surface: Source +14, Tests +71. Total +85 across 3 files.
Reproducibility: yes. Source inspection shows current main releases the cleanup lock only after runtime tear ... R body’s terminal proof exercises the same ordering with production cleanup and filesystem lock primitives.

Automerge notes:

PR branch already contained follow-up commit before automerge: Merge branch 'main' into fix/session-lock-release-before-teardown

The automerge loop is complete.

Automerge progress:

2026-05-28 21:34:09 UTC review queued 178192fa0ef9 (queued)

2026-05-28 21:39:55 UTC review passed 178192fa0ef9 (structured ClawSweeper verdict: pass (sha=178192fa0ef99c891f620da75d96debef5a6d...)

2026-05-28 21:40:09 UTC merged 178192fa0ef9 (merged by ClawSweeper automerge)

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26603603690
Updated: 2026-05-28T21:40:11.315Z

Summary:
- The PR reorders embedded attempt cleanup to release the session write lock before session/MCP/LSP teardown, treats sessions_yield cleanup as abort-like for flush timing, and adds focused regression tests.
- PR surface: Source +14, Tests +71. Total +85 across 3 files.
- Reproducibility: yes. Source inspection shows current main releases the cleanup lock only after runtime tear ... R body’s terminal proof exercises the same ordering with production cleanup and filesystem lock primitives.

Automerge notes:
- PR branch already contained follow-up commit before automerge: Merge branch 'main' into fix/session-lock-release-before-teardown

Validation:
- ClawSweeper review passed for head 178192f.
- Required merge gates passed before the squash merge.

Prepared head SHA: 178192f
Review: openclaw#87747 (comment)
Co-authored-by: fuller-stack-dev <263060202+fuller-stack-dev@users.noreply.github.com>
Co-authored-by: Jason (Json) <263060202+fuller-stack-dev@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>

…026.5.28) (#759) This PR contains the following updates: | Package | Update | Change | |---|---|---| | [ghcr.io/openclaw/openclaw](https://openclaw.ai) ([source](https://github.com/openclaw/openclaw)) | patch | `2026.5.27` → `2026.5.28` | --- ### Release Notes <details> <summary>openclaw/openclaw (ghcr.io/openclaw/openclaw)</summary> ### [`v2026.5.28`](https://github.com/openclaw/openclaw/blob/HEAD/CHANGELOG.md#2026528) [Compare Source](openclaw/openclaw@v2026.5.27...v2026.5.28) ##### Highlights - Agent and Codex runtime recovery is steadier: subagents keep cwd/workspace separation, hook context stays prompt-local, session locks release on timeout abort while live OpenClaw locks survive cleanup, stale restart continuations are avoided, and Codex app-server/helper failures no longer tear down shared runtime state. ([#87218](openclaw/openclaw#87218), [#86875](openclaw/openclaw#86875), [#87409](openclaw/openclaw#87409), [#87399](openclaw/openclaw#87399), [#87375](openclaw/openclaw#87375), [#88129](openclaw/openclaw#88129)) - Channel delivery and session identity got safer across outbound plugin hooks, Matrix room ids, iMessage reactions/approvals, Slack final replies, Discord recovered tool warnings, runtime-config message actions, WhatsApp profile auth roots, Telegram polling, and Microsoft Teams service URL trust checks. ([#73706](openclaw/openclaw#73706), [#75670](openclaw/openclaw#75670), [#87366](openclaw/openclaw#87366), [#87451](openclaw/openclaw#87451), [#87334](openclaw/openclaw#87334), [#84535](openclaw/openclaw#84535), [#82492](openclaw/openclaw#82492), [#83304](openclaw/openclaw#83304), [#87160](openclaw/openclaw#87160)) - Mobile and chat surfaces got a broader refresh: the iOS Pro UI, hosted push relay default, realtime Talk tab playback, Gateway chat transport, onboarding, Talk permissions, WebChat reconnect delivery, and session picker behavior now preserve more state across reconnects and empty searches. 
([#87367](openclaw/openclaw#87367), [#87531](openclaw/openclaw#87531), [#87682](openclaw/openclaw#87682), [#88096](openclaw/openclaw#88096), [#88105](openclaw/openclaw#88105)) Thanks [@ngutman](https://github.com/ngutman) and [@BunsDev](https://github.com/BunsDev). - Browser, channel, and automation inputs are stricter: Browser tool timeouts, viewport/tab indices, Gateway ports, cron retry handling, Discord component ids, schema array refs, Telegram callback pages, and channel progress callbacks now reject malformed values earlier and preserve the intended delivery context. ([#82887](openclaw/openclaw#82887)) - Provider, media, and document coverage expands with Claude Opus 4.8, Fal Krea image schemas, NVIDIA featured models, MiniMax streaming music responses, encrypted PDF extraction, voice model catalogs, GitHub Copilot agent runtime support, and a Codex Supervisor plugin path for delegated Codex workflows. ([#87845](openclaw/openclaw#87845), [#87890](openclaw/openclaw#87890), [#80775](openclaw/openclaw#80775), [#84764](openclaw/openclaw#84764), [#87751](openclaw/openclaw#87751), [#87794](openclaw/openclaw#87794)) - CLI, auth, doctor, and provider paths fail faster and recover more clearly: malformed numeric/version options are rejected, workspace dotenv provider credentials are ignored, heartbeat defaults, OAuth/token lifetimes, and local service startup requests are bounded, agent auth health labels are clearer, legacy `api_key` auth profiles migrate to canonical form, and restart guidance is actionable. ([#87398](openclaw/openclaw#87398), [#86281](openclaw/openclaw#86281), [#87361](openclaw/openclaw#87361), [#88133](openclaw/openclaw#88133), [#83655](openclaw/openclaw#83655), [#87559](openclaw/openclaw#87559), [#88088](openclaw/openclaw#88088), [#85924](openclaw/openclaw#85924)) Thanks [@vincentkoc](https://github.com/vincentkoc) and [@giodl73-repo](https://github.com/giodl73-repo). 
- Plugin and Gateway hot paths do less repeated work while preserving cache correctness for install records, config JSON parsing, tool search catalogs, session stores, manifest model rows, auto-enabled plugin config, browser tokens, viewer assets, and release-split external plugin packages. ([#86699](openclaw/openclaw#86699)) - Release, QA, and E2E validation now bound more log, artifact, harness, and cross-OS waits so failing lanes produce proof instead of hanging or false-greening. ##### Changes - Status: show active subagent details in status output. - Diffs: split the default language pack and expand default Diffs language coverage while keeping the host floor aligned. ([#87370](openclaw/openclaw#87370), [#87372](openclaw/openclaw#87372)) Thanks [@RomneyDa](https://github.com/RomneyDa). - ClawHub: add plugin display names plus skill verification and trust surfaces. ([#87354](openclaw/openclaw#87354), [#86699](openclaw/openclaw#86699)) Thanks [@thewilloftheshadow](https://github.com/thewilloftheshadow) and [@Patrick-Erichsen](https://github.com/Patrick-Erichsen). - iOS: refresh the dev app with Pro Command, Chat, Agents, Settings, hosted push relay defaults, and realtime Talk playback wired to gateway sessions, diagnostics, chat, and realtime Talk. ([#87367](openclaw/openclaw#87367), [#88096](openclaw/openclaw#88096), [#88105](openclaw/openclaw#88105)) Thanks [@Solvely-Colin](https://github.com/Solvely-Colin) and [@ngutman](https://github.com/ngutman). - Docs: clarify Codex computer-use setup, paste-token stdin auth setup, macOS gateway sleep troubleshooting, native Codex hook relay recovery, container model auth, install deployment cards, device-token admin gating, CLI setup flow compatibility, Notte cloud browser CDP setup, and backport targets. 
([#87313](openclaw/openclaw#87313), [#63050](openclaw/openclaw#63050), [#87685](openclaw/openclaw#87685)) Thanks [@bdjben](https://github.com/bdjben), [@liaoandi](https://github.com/liaoandi), and [@thewilloftheshadow](https://github.com/thewilloftheshadow). - PDF/tools: use ClawPDF for PDF extraction, support encrypted PDF extraction, and surface MCP structured content in agent tool results. ([#87670](openclaw/openclaw#87670), [#87751](openclaw/openclaw#87751)) - Providers: add Claude Opus 4.8 support, Fal Krea image model schemas, NVIDIA featured model catalogs, MiniMax streaming music responses, and provider-backed voice model catalogs. ([#87845](openclaw/openclaw#87845), [#87890](openclaw/openclaw#87890), [#80775](openclaw/openclaw#80775), [#84764](openclaw/openclaw#84764), [#87794](openclaw/openclaw#87794)) Thanks [@eleqtrizit](https://github.com/eleqtrizit) and [@vincentkoc](https://github.com/vincentkoc). - Codex/GitHub: add the GitHub Copilot agent runtime and the Codex Supervisor plugin package. - Plugins: externalize GitHub Copilot and Tokenjuice as official install-on-demand plugins with npm and ClawHub publish metadata. - Workboard: add agent coordination tools for tracking and handing off active agent work. - Discord: show commentary in progress drafts so live Discord runs expose useful in-progress context. ([#85200](openclaw/openclaw#85200)) - Plugin SDK: add a reply payload sending hook for plugins that need to deliver channel-owned replies and flatten package types for SDK declarations. ([#82823](openclaw/openclaw#82823), [#87165](openclaw/openclaw#87165)) Thanks [@piersonr](https://github.com/piersonr) and [@RomneyDa](https://github.com/RomneyDa). - Policy: add policy comparison, ingress-channel conformance, and sandbox-posture conformance checks. 
([#85572](openclaw/openclaw#85572), [#85744](openclaw/openclaw#85744), [#86768](openclaw/openclaw#86768)) ##### Fixes - Agents: fall back to local config pruning when the optional `agents delete` Gateway probe cannot authenticate, so offline installs can still delete agents without removing shared workspaces. - Tighten phone-control mutation authorization \[AI]. ([#87150](openclaw/openclaw#87150)) Thanks [@pgondhi987](https://github.com/pgondhi987). - Clarify directive persistence authorization policy \[AI]. ([#86369](openclaw/openclaw#86369)) Thanks [@pgondhi987](https://github.com/pgondhi987). - Agents/Codex: keep spawned agent cwd/workspace state separated, forward ACP spawn attachments, keep hook context prompt-local, release session locks on timeout abort and runtime teardown without deleting live OpenClaw-owned locks during cleanup, avoid session event queue self-wait, clean up exec abort listeners, stream assistant deltas incrementally, recover raw missing-thread compaction failures, preserve rotated compaction session identity, keep compaction-timeout snapshots continuable, preserve shared app-server state across startup or helper failures, keep native hook relay alive across restarts and prune stale bridge files, close native hook relay replacement races, keep Claude live tool progress visible for watchdog recovery, suppress abandoned requester completion handoff, route workspace memory through tools, resolve Codex runtime models first, report quarantined dynamic tools, format `skills` command output, bind node auto-review to prepared plans, retry Claude CLI transcript probes, and bound compaction/steering retries. 
([#87218](openclaw/openclaw#87218), [#86875](openclaw/openclaw#86875), [#86123](openclaw/openclaw#86123), [#88129](openclaw/openclaw#88129), [#87399](openclaw/openclaw#87399), [#87375](openclaw/openclaw#87375), [#72574](openclaw/openclaw#72574), [#87383](openclaw/openclaw#87383), [#87400](openclaw/openclaw#87400), [#83022](openclaw/openclaw#83022), [#87671](openclaw/openclaw#87671), [#87738](openclaw/openclaw#87738), [#87747](openclaw/openclaw#87747), [#87706](openclaw/openclaw#87706), [#87546](openclaw/openclaw#87546), [#87541](openclaw/openclaw#87541), [#81048](openclaw/openclaw#81048)) Thanks [@mbelinky](https://github.com/mbelinky), [@Alix-007](https://github.com/Alix-007), [@luoyanglang](https://github.com/luoyanglang), [@yetval](https://github.com/yetval), [@sjf](https://github.com/sjf), [@joshavant](https://github.com/joshavant), [@benjamin1492](https://github.com/benjamin1492), [@c19354837](https://github.com/c19354837), [@fuller-stack-dev](https://github.com/fuller-stack-dev), [@pfrederiksen](https://github.com/pfrederiksen), and [@dodge1218](https://github.com/dodge1218). - Codex Supervisor: keep real-home app-server MCP session listing on the loaded state path, bound stored history scans, and close WebSocket probes cleanly. - Channels: thread canonical session keys into outbound hooks, preserve Matrix room-id case, keep fallback tool warnings mention-inert, retain delivered Slack final replies during late cleanup, continue iMessage polling after denied reactions, suppress duplicate native exec approvals, resolve Gateway message actions against the active runtime config, preserve Telegram SecretRef prompt config and polling keepalives, preserve WhatsApp profile auth roots, QR display, document filenames, and plugin hook config, suppress Discord recovered tool warnings, preserve the Discord voice outbound helper, cap Discord/Signal/Zalo channel request and container timeouts, and block untrusted Teams service URLs while keeping TeamsSDK patterns aligned. 
([#73706](openclaw/openclaw#73706), [#75670](openclaw/openclaw#75670), [#87366](openclaw/openclaw#87366), [#87451](openclaw/openclaw#87451), [#87465](openclaw/openclaw#87465), [#87334](openclaw/openclaw#87334), [#84535](openclaw/openclaw#84535), [#76262](openclaw/openclaw#76262), [#83304](openclaw/openclaw#83304), [#82492](openclaw/openclaw#82492), [#87581](openclaw/openclaw#87581), [#77114](openclaw/openclaw#77114), [#86426](openclaw/openclaw#86426), [#85529](openclaw/openclaw#85529), [#87160](openclaw/openclaw#87160)) Thanks [@zeroaltitude](https://github.com/zeroaltitude), [@lukeboyett](https://github.com/lukeboyett), [@jarvis-mns1](https://github.com/jarvis-mns1), [@xiaotian](https://github.com/xiaotian), [@funmerlin](https://github.com/funmerlin), [@joshavant](https://github.com/joshavant), [@eleqtrizit](https://github.com/eleqtrizit), [@heyitsaamir](https://github.com/heyitsaamir), [@amittell](https://github.com/amittell), [@lidge-jun](https://github.com/lidge-jun), [@liorb-mountapps](https://github.com/liorb-mountapps), [@masatohoshino](https://github.com/masatohoshino), [@bladin](https://github.com/bladin), and [@giodl73-repo](https://github.com/giodl73-repo).
- CLI/auth/doctor/providers: reject malformed numeric/timeout/subcommand-version inputs, ignore workspace dotenv provider credentials, wait for respawn child shutdown, bound heartbeat defaults plus Codex, GitHub Copilot, OpenAI, Anthropic, Google, Feishu, LM Studio, MiniMax, Xiaomi TTS, and local-provider OAuth/token/model requests, harden Codex auth probes, label auth health by agent, preserve explicit agentRuntime pins during Codex model migration, warm provider auth off the main thread, honor Codex response timeouts, stop migrating current Claude Haiku 4.5 profiles to Sonnet, bound local service startup, resolve GPT-5.5 without cached catalog, migrate legacy memory auto-provider config, rewrite non-canonical `api_key` auth profiles, and make doctor restart follow-ups actionable. ([#87398](openclaw/openclaw#87398), [#86281](openclaw/openclaw#86281), [#87361](openclaw/openclaw#87361), [#88133](openclaw/openclaw#88133), [#83655](openclaw/openclaw#83655), [#87559](openclaw/openclaw#87559), [#87719](openclaw/openclaw#87719), [#88088](openclaw/openclaw#88088), [#85924](openclaw/openclaw#85924), [#84362](openclaw/openclaw#84362)) Thanks [@Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@samzong](https://github.com/samzong), [@giodl73-repo](https://github.com/giodl73-repo), [@alkor2000](https://github.com/alkor2000), [@mmaps](https://github.com/mmaps), [@nxmxbbd](https://github.com/nxmxbbd), and [@vincentkoc](https://github.com/vincentkoc).
- Gateway/security/session state: expire browser tokens after auth rotation, scope assistant idempotency dedupe, drain probe client closes, avoid stale restart continuation reuse, preserve retry-after fallbacks and stale rate-limit cooldown probes, bound webchat image and artifact transcript scans, include seconds in inbound metadata timestamps, clear completed session active runs, clear stale chat stream buffers, and evict current plugin-state namespaces at row caps. ([#87810](openclaw/openclaw#87810), [#87833](openclaw/openclaw#87833), [#75089](openclaw/openclaw#75089)) Thanks [@joshavant](https://github.com/joshavant) and [@litang9](https://github.com/litang9).
- Config/parsing/network: reject partial numeric parsing, parse provider/Discord retry headers and dates strictly, honor IPv6 and bare IPv6 `no_proxy` entries, preserve empty plugin allowlists, canonicalize secret target array indexes, and reject malformed media content lengths, inspected TCP ports, marketplace content lengths, cron epochs, sandbox stat fields, unsafe duration values, empty config path segments, noncanonical schema array refs, unsafe Telegram callback pages, and invalid Teams attachment-fetch DNS targets. ([#87883](openclaw/openclaw#87883)) Thanks [@zhangguiping-xydt](https://github.com/zhangguiping-xydt).
- Browser/input hardening: reject invalid tab indexes, excessive viewport resizes, explicit zero CDP ports, malformed geolocation options, unsafe screenshot or permission-grant timeouts, loose response-body limits, invalid cookie expiries, and non-finite Browser tool delays/timeouts.
- Cron/automation: retry recurring jobs after transient model rate limits before waiting for the next scheduled slot, and preflight model fallbacks before skipping scheduled work. ([#82887](openclaw/openclaw#82887)) Thanks [@chen-zhang-cs-code](https://github.com/chen-zhang-cs-code).
- Auto-reply/directives: respect provider and relayed channel metadata during directive persistence so channel-originated decisions keep their intended context. ([#87683](openclaw/openclaw#87683))
- WhatsApp: resolve the auth directory from the active profile so profile-scoped WhatsApp installs do not drift to the wrong credential root. ([#82492](openclaw/openclaw#82492)) Thanks [@lidge-jun](https://github.com/lidge-jun).
- Gateway/session state: clear completed session active runs, avoid cold-loading providers for MCP inventory, cache single-session child indexes, cap handshake timers, and bound preauth, auth-guard, media, transcript, readiness, and port options.
- Channels/replies: preserve channel-owned progress callbacks when verbose output is off, keep group-room progress suppression intact, prefer external session delivery context, escape Discord component id delimiters, force final TUI chat repaints, show Slack reasoning previews, and normalize Discord/Matrix/Mattermost channel numeric options. ([#87476](openclaw/openclaw#87476), [#87423](openclaw/openclaw#87423))
- Agents/tool args: harden smart-quoted argument repair for edit arrays and exact escaped arguments so model-produced tool calls recover without corrupting valid input. ([#86611](openclaw/openclaw#86611)) Thanks [@ferminquant](https://github.com/ferminquant).
- Providers/agents: preserve seeded Anthropic signatures, preserve signed thinking payloads, concatenate signature-delta chunks, preserve DeepSeek `reasoning_content` replay across tier suffixes, apply OpenRouter strict9 ids to Mistral routes, promote Ollama plain-text tool calls, load NVIDIA featured model catalogs, stream MiniMax music generation responses, and recover empty preflight compaction. ([#87593](openclaw/openclaw#87593), [#87493](openclaw/openclaw#87493), [#80775](openclaw/openclaw#80775), [#84764](openclaw/openclaw#84764)) Thanks [@Pluviobyte](https://github.com/Pluviobyte) and [@eleqtrizit](https://github.com/eleqtrizit).
- Media/images: skip CLI image cache refs when resolving generated images, allow trusted generated HTML attachments, and bound generated video downloads so stale refs and slow providers fail cleanly. ([#87523](openclaw/openclaw#87523), [#87982](openclaw/openclaw#87982))
- File transfer: handle late tar stdin pipe errors after archive validation or unpacking has already settled.
- Performance: trust install-record caches between reloads, prefer native JSON parsing, reuse unchanged tool-search catalogs, reuse gateway session and plugin metadata paths, skip unchanged store serialization, patch single-entry session writes, add precomputed session patch writers, reduce store clone allocations, cache manifest model catalog rows and auto-enabled plugin config, avoid full session snapshots for entry reads, defer configured Slack full startup, prefer bundled plugin dist entries, and slim current metadata identity caches. ([#87760](openclaw/openclaw#87760))
- Docker/release/QA: package runtime workspace templates, stream cross-OS served artifacts, preserve sparse Crabbox run artifacts, isolate npm plugin installs per package, reject incompatible package plugin API installs, drop the leftover root Sharp dependency from package manifests after the Rastermill migration, bound OpenClaw instance logs, plugin gauntlet relay logs, MCP channel buffers, kitchen-sink scans, agent-turn assertions, QA-Lab credential broker calls, QA Matrix substrate requests, and release scenario logs, and keep release/google live guards current. ([#87647](openclaw/openclaw#87647), [#87477](openclaw/openclaw#87477)) Thanks [@rohitjavvadi](https://github.com/rohitjavvadi) and [@vincentkoc](https://github.com/vincentkoc).
- Release/CI: bound manual git fetches, ClawHub verifier responses, ClawHub owner metadata, dependency-guard error bodies, Parallels limits, startup/test/memory budget parsing, and diffs viewer build warnings so release lanes fail with useful proof instead of hanging. ([#87839](openclaw/openclaw#87839))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

- [ ] If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/759

Summary:
- The PR reorders embedded attempt cleanup to release the session write lock before session/MCP/LSP teardown, treats sessions_yield cleanup as abort-like for flush timing, and adds focused regression tests.
- PR surface: Source +14, Tests +71. Total +85 across 3 files.
- Reproducibility: yes. Source inspection shows current main releases the cleanup lock only after runtime tear ... R body’s terminal proof exercises the same ordering with production cleanup and filesystem lock primitives.

Automerge notes:
- PR branch already contained follow-up commit before automerge: Merge branch 'main' into fix/session-lock-release-before-teardown

Validation:
- ClawSweeper review passed for head 178192f.
- Required merge gates passed before the squash merge.

Prepared head SHA: 178192f
Review: openclaw#87747 (comment)

Co-authored-by: fuller-stack-dev <263060202+fuller-stack-dev@users.noreply.github.com>
Co-authored-by: Jason (Json) <263060202+fuller-stack-dev@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
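
For readers who want the reordering in concrete terms, here is a minimal sketch of the cleanup order the summary describes: flush under the lock, release the lock, then tear down runtimes, rethrowing a release failure only after disposal has run. Every name in it (`SessionLockLike`, `RuntimeLike`, `cleanupAttempt`) is hypothetical and chosen for illustration; it is not the actual `cleanupEmbeddedAttemptResources` code. Swallowing disposal errors is a simplification of this sketch, not something the PR states.

```typescript
// Hypothetical shapes for illustration only; not OpenClaw's real types.
interface SessionLockLike {
  release(): Promise<void>;
}

interface RuntimeLike {
  dispose(): Promise<void> | void;
}

async function cleanupAttempt(
  sessionLock: SessionLockLike,
  flushPending: () => Promise<void>,
  runtimes: RuntimeLike[],
): Promise<void> {
  // Work that needs the transcript lock (e.g. pending tool-result flush)
  // happens first, while the lock is still held.
  await flushPending();

  // Release the lock before any teardown. Remember a failure instead of
  // throwing right away, so disposal below still runs.
  let releaseFailed = false;
  let releaseError: unknown;
  try {
    await sessionLock.release();
  } catch (err) {
    releaseFailed = true;
    releaseError = err;
  }

  // Teardown (session, MCP, LSP in the PR's terms) runs after the release
  // attempt, so a hang here can no longer pin the transcript lock.
  for (const runtime of runtimes) {
    try {
      await runtime.dispose();
    } catch {
      // Assumption of this sketch: disposal is best-effort.
    }
  }

  // Preserve failure semantics: surface the release error after disposal.
  if (releaseFailed) {
    throw releaseError;
  }
}
```

The point of the ordering is that a writer blocked on the session lock only has to wait for the bounded flush step, not for teardown of unrelated runtimes.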

fix: release session lock before runtime teardown

4bd7fe9

openclaw-barnacle Bot added agents Agent runtime and tooling size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 28, 2026

openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 28, 2026

Merge branch 'main' into fix/session-lock-release-before-teardown

178192f

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 28, 2026

clawsweeper Bot mentioned this pull request May 28, 2026

fix: fallback when generated media handoff locks #87741

Merged

clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 28, 2026

clawsweeper Bot merged commit 0dbdaf9 into openclaw:main May 28, 2026
136 of 140 checks passed

github-actions Bot mentioned this pull request May 28, 2026

📡 Upstream Digest — 2026-05-28 21:51 UTC curtismercier/openclaw-mods#966

Open

RomneyDa mentioned this pull request Jun 1, 2026

Unclosed FileHandle on session JSONL lock crashes gateway on Node ≥24 under sustained session-store load #84820

Closed

clawsweeper[bot] merged 2 commits into openclaw:main from fuller-stack-dev:fix/session-lock-release-before-teardown
