fix(embedded-runner): preserve provider errors on cleanup takeover by clawsweeper[bot] · Pull Request #84321 · openclaw/openclaw

clawsweeper · 2026-05-19T21:11:08Z

Makes #84056 merge-ready for the ClawSweeper automerge loop.
The edit pass should inspect the live PR diff, review comments, and failing checks; rebase if needed; keep the contributor branch credited; and stop only when validation is green or an external blocker is proven.

ClawSweeper 🐠 replacement reef notes:

Cluster: automerge-openclaw-openclaw-84056
Source PRs: fix(embedded-runner): preserve provider errors on cleanup takeover #84056
Credit: Source PR: fix(embedded-runner): preserve provider errors on cleanup takeover #84056
Validation: pnpm check:changed
Replacement reason: ClawSweeper could not update the source PR branch directly, so it opened a writable replacement PR instead.
Automerge requested by: @Takhoffman

Repair fallback: GitHub rejected the repair branch push because it updates workflow files and the ClawSweeper app token does not have the workflows permission.

Co-author credit kept:

@abnershang: Co-authored-by: Abner Shang <75654486+abnershang@users.noreply.github.com>

fish notes: model gpt-5.5, reasoning high; reviewed against e7d9d8c.

clawsweeper · 2026-05-19T21:12:21Z

Codex review: passed. Reviewed May 25, 2026, 11:05 PM ET / 03:05 UTC.

Summary
The PR preserves provider-facing embedded-runner prompt errors when cleanup detects session takeover, keeps the takeover signal fatal for fallback, and adds focused regressions.

PR surface: Source +52, Tests +92. Total +144 across 5 files.

Reproducibility: yes. Source inspection shows current main can let cleanup takeover replace a prior prompt/provider error and can normalize a provider-looking takeover wrapper before fallback sees it as coordination failure.

Review metrics: none identified.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
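
The "weaker of the two" rule can be sketched as a comparison over an ordered rank list. This is a minimal illustration, not ClawSweeper's code: the `RANKS` array and the `overallRank` helper are hypothetical names, and the ordering is taken from the rank legend in this review.

```typescript
// Ranks ordered from weakest to strongest, following the legend in this review.
// "off-meta tidepool" is left out because it means the rating does not apply.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;

type Rank = (typeof RANKS)[number];

// Hypothetical helper: the overall rank is whichever input sits lower in RANKS.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality) ? proof : patchQuality;
}

// Proof is diamond lobster and patch quality is platinum hermit in this review.
console.log(overallRank("diamond lobster", "platinum hermit")); // "platinum hermit"
```

With these inputs the sketch reproduces the overall rating reported above: the strong proof cannot lift the result past the patch-quality rank.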

Rank-up moves:

none

Risk before merge

Merging intentionally makes takeover-marked provider errors abort model fallback, so configured fallback models will not be tried when the visible provider message also looks failoverable.
The proof is a runtime classifier probe plus focused embedded-runner/fallback harness coverage, not a forced live provider outage with an actual cleanup-race log.

Maintainer options:

Accept the fallback/session takeover contract (recommended)
Merge after required checks if maintainers agree that a local cleanup takeover should abort model fallback even when the provider-facing message looks failoverable.
Pause for live cleanup-race proof
Ask for a redacted runtime log or terminal artifact from an actual provider failure followed by cleanup takeover if boundary-level proof is not enough for this session-state path.

Next step before merge
No ClawSweeper repair lane is needed; the remaining path is exact-head merge gating and maintainer acceptance of the fallback/session-state risk.

Security
Cleared: The diff touches TypeScript agent runtime code and tests only; I found no concrete dependency, workflow, secret-handling, package, or supply-chain regression.

Review details

Best possible solution:

Land the narrow error-precedence and fallback-abort fix after exact-head merge gates if maintainers accept boundary-level proof for this internal session takeover race.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows current main can let cleanup takeover replace a prior prompt/provider error and can normalize a provider-looking takeover wrapper before fallback sees it as coordination failure.

Is this the best way to solve the issue?

Yes. The patch is narrow: it preserves the provider-facing message while carrying the takeover identity/cause so fallback stops, and it covers cleanup-only takeover separately.
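
To make the "visible provider message, takeover as cause" idea concrete, here is a minimal sketch. Only the name `EmbeddedAttemptSessionTakeoverError` comes from the review; the class body, `causeOf`, `wrapPromptError`, and `resolveAttemptError` are hypothetical stand-ins and are not the code in `attempt.ts`.

```typescript
// Stand-in for the takeover error named in this review. The real class lives in
// the repository and may carry more state; this one is only for illustration.
class EmbeddedAttemptSessionTakeoverError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "EmbeddedAttemptSessionTakeoverError";
  }
}

// Hypothetical helper: read `cause` without depending on ES2022 lib typings.
function causeOf(err: unknown): unknown {
  return typeof err === "object" && err !== null
    ? (err as { cause?: unknown }).cause
    : undefined;
}

// Hypothetical helper: keep the provider-facing message visible and attach the
// takeover as the cause so later code can still detect it.
function wrapPromptError(
  promptError: Error,
  takeover: EmbeddedAttemptSessionTakeoverError,
): Error {
  const wrapper = new Error(promptError.message);
  wrapper.name = promptError.name;
  (wrapper as Error & { cause?: unknown }).cause = takeover;
  return wrapper;
}

// Hypothetical precedence rule: when both errors exist and cleanup reported a
// takeover, surface the prompt error with the takeover attached. Every other
// combination keeps the cleanup-first order the review describes for main.
function resolveAttemptError(
  promptError: Error | undefined,
  cleanupError: Error | undefined,
): Error | undefined {
  if (promptError && cleanupError instanceof EmbeddedAttemptSessionTakeoverError) {
    return wrapPromptError(promptError, cleanupError);
  }
  return cleanupError ?? promptError;
}

const providerError = new Error("429: rate limit exceeded");
const takeover = new EmbeddedAttemptSessionTakeoverError("session file changed during cleanup");
const surfaced = resolveAttemptError(providerError, takeover);
console.log(surfaced?.message); // "429: rate limit exceeded"
console.log(causeOf(surfaced) instanceof EmbeddedAttemptSessionTakeoverError); // true
```

The cleanup-only case falls through to the last line of `resolveAttemptError`, so a takeover with no prior prompt error is still thrown as the takeover itself.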

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 0d23c3b4e133.

Label changes

remove status: 👀 ready for maintainer look: Current PR status label is status: 🚀 automerge armed.

Label justifications:

P2: This is a focused bug fix for an agent runtime fallback edge case with limited but real provider/session impact.
merge-risk: 🚨 auth-provider: The PR changes when provider/model fallback aborts instead of trying configured fallback candidates.
merge-risk: 🚨 session-state: The PR changes cleanup session-takeover precedence and preserves takeover state through the thrown error path.
rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane.
proof: sufficient: Contributor real behavior proof is sufficient (terminal). The source PR provides structured after-fix terminal output from a runtime fallback-classifier probe plus focused validation, and this replacement keeps that proof trail.

Evidence reviewed

PR surface:

Source +52, Tests +92. Total +144 across 5 files.

Area	Files	Added	Removed	Net
Source	3	56	4	+52
Tests	2	92	0	+92
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	5	148	4	+144

What I checked:

Current main cleanup precedence still drops the provider error: Current main emits and rejects cleanupError ahead of promptError, so a cleanup-time session takeover can replace the earlier provider/prompt failure before the caller sees it. (src/agents/pi-embedded-runner/run/attempt.ts:5176, 0d23c3b4e133)
Current main fallback can normalize takeover-marked provider-looking errors: runFallbackCandidate currently normalizes failoverable-looking errors before a local coordination check, so a takeover wrapper with a rate-limit-looking message can become provider failover. (src/agents/model-fallback.ts:257, 0d23c3b4e133)
PR preserves the prompt error while carrying cleanup takeover: The PR synthesizes or captures cleanup takeover, emits diagnostics with the prompt error when preserving it, and throws a wrapper whose visible message is the provider error while the cause remains EmbeddedAttemptSessionTakeoverError. (src/agents/pi-embedded-runner/run/attempt.ts:5199, 050c779cfa61)
PR checks coordination before failover normalization: The patch adds an early isNonProviderRuntimeCoordinationError check before coerceToFailoverError in runFallbackCandidate, preventing fallback from consuming another model after local session takeover. (src/agents/model-fallback.ts:261, 050c779cfa61)
Focused regression coverage is present: The PR adds embedded-runner tests for prompt-error preservation and cleanup-only takeover, plus model-fallback coverage proving a takeover-carrying provider error aborts after one attempt. (src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts:1317, 050c779cfa61)
Source proof and replacement context were checked: The superseded source PR contains structured after-fix terminal proof for the runtime fallback classifier plus focused Vitest, tsgo, check:changed, and diff-check commands; this bot replacement preserves that credited work. (3240d6764653)
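
The ordering change described in the checks above (coordination check first, failover classification second) can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: `SessionTakeoverError`, `isCoordinationError`, `looksFailoverable`, and `runWithFallback` are hypothetical names that only loosely mirror `isNonProviderRuntimeCoordinationError`, `coerceToFailoverError`, and `runFallbackCandidate`, and the message-matching regex is invented for the example.

```typescript
// Hypothetical stand-in for a local session takeover signal.
class SessionTakeoverError extends Error {}

// Hypothetical check: walk a bounded cause chain looking for a takeover.
function isCoordinationError(err: unknown): boolean {
  let current: unknown = err;
  for (let depth = 0; depth < 10 && current !== undefined && current !== null; depth++) {
    if (current instanceof SessionTakeoverError) return true;
    current = (current as { cause?: unknown }).cause;
  }
  return false;
}

// Hypothetical classifier: treat rate-limit-looking messages as failoverable.
function looksFailoverable(err: unknown): boolean {
  return err instanceof Error && /rate limit|429|overloaded/i.test(err.message);
}

// Try each model in order. A coordination error aborts immediately, even when
// its visible message would otherwise qualify for failover.
function runWithFallback(models: string[], attempt: (model: string) => string): string {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    try {
      return attempt(model);
    } catch (err) {
      if (isCoordinationError(err)) throw err; // checked before classification
      if (!looksFailoverable(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}

const tried: string[] = [];
try {
  runWithFallback(["primary", "backup"], (model) => {
    tried.push(model);
    const err = new Error("429: rate limit exceeded");
    (err as Error & { cause?: unknown }).cause = new SessionTakeoverError("session taken over");
    throw err;
  });
} catch {
  console.log(tried.length); // 1: the backup model is never attempted
}
```

If the two checks were swapped, the rate-limit-looking message would be classified first and the backup model would run, which is the behavior the review reports for current main.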

Likely related people:

vincentkoc: Recent commits in the embedded-runner/session-fencing area and fallback/provider-resolution path make this a strong routing candidate for the session takeover and fallback boundary. (role: recent area contributor; confidence: high; commits: 2bb00f6726d4, a122d804dda8, 3c8d101f5a85; files: src/agents/pi-embedded-runner/run/attempt.ts, src/agents/pi-embedded-runner/run/attempt.session-lock.ts, src/agents/model-fallback.ts)
steipete: Recent history for model fallback and failover classification includes multiple steipete-authored changes to fallback selection and failover signal behavior. (role: fallback and failover area contributor; confidence: medium; commits: f4ba9553c029, 8c49121ec881, 936c02e22c98; files: src/agents/model-fallback.ts, src/agents/failover-error.ts)
jalehman: Recent embedded session write-lock work in the same takeover/fence area includes jalehman as a co-author or reviewer, which makes them useful for session-state contract review. (role: adjacent session-fencing reviewer/contributor; confidence: medium; commits: 1b77145687ca, cff5244a5b25; files: src/agents/pi-embedded-runner/run/attempt.ts, src/agents/pi-embedded-runner/run/attempt.session-lock.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.
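
The command rules above amount to a small routing table. The sketch below is one way to express them, not ClawSweeper's implementation: `Commenter`, `RULES`, and `routeCommand` are hypothetical names, and it assumes maintainers may use every command, including re-review and re-run.

```typescript
type Lane = "fresh-review" | "repair-or-merge" | "explain" | "stop";

interface Commenter {
  isAuthor: boolean;
  hasWriteAccess: boolean;
  isMaintainer: boolean;
}

interface CommandRule {
  lane: Lane;
  maintainerOnly: boolean;
}

// Commands and permissions as listed in the workflow notes.
const RULES: Record<string, CommandRule> = {
  "re-review": { lane: "fresh-review", maintainerOnly: false },
  "re-run": { lane: "fresh-review", maintainerOnly: false },
  "review": { lane: "fresh-review", maintainerOnly: true },
  "autofix": { lane: "repair-or-merge", maintainerOnly: true },
  "automerge": { lane: "repair-or-merge", maintainerOnly: true },
  "fix ci": { lane: "repair-or-merge", maintainerOnly: true },
  "address review": { lane: "repair-or-merge", maintainerOnly: true },
  "explain": { lane: "explain", maintainerOnly: true },
  "stop": { lane: "stop", maintainerOnly: true },
};

// Hypothetical router: returns the lane a comment would trigger, or "ignored"
// when there is no known command or the commenter lacks permission.
function routeCommand(comment: string, who: Commenter): Lane | "ignored" {
  const match =
    /@clawsweeper\s+(re-review|re-run|review|autofix|automerge|fix ci|address review|explain|stop)\b/i.exec(comment);
  if (!match) return "ignored";
  const rule = RULES[(match[1] ?? "").toLowerCase()];
  if (!rule) return "ignored";
  const allowed =
    who.isMaintainer || (!rule.maintainerOnly && (who.isAuthor || who.hasWriteAccess));
  return allowed ? rule.lane : "ignored";
}

const author: Commenter = { isAuthor: true, hasWriteAccess: false, isMaintainer: false };
console.log(routeCommand("@clawsweeper re-review", author)); // "fresh-review"
console.log(routeCommand("@clawsweeper automerge", author)); // "ignored"
```

Keeping fresh-review commands in their own lane reflects the rule that they never start repair, autofix, rebase, CI repair, or automerge.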

clawsweeper · 2026-05-19T21:16:35Z

🦞✅
ClawSweeper is pausing this repair loop for human review.

Source: clawsweeper[bot]
Reason: No repair job is needed; the remaining action is normal automerge or maintainer handling after exact-head checks and risk acceptance. Cleared: The diff touches TypeScript agent runtime code, tests, and changelog only; I found no concrete dependency, CI, secret-handling, package, or supply-chain regression. (sha=e7d9d8cafeb2c5040e220bae5a0054a7623a0adf)

Why human review is needed:
This item has security-sensitive risk. ClawSweeper is pausing instead of making an autonomous change that could affect trust, credentials, permissions, or exposure.

Recommended next action:
Have a maintainer review the security-sensitive detail and provide an explicit safe path before asking ClawSweeper to continue.

I added clawsweeper:human-review and left the final call with a maintainer.

Takhoffman · 2026-05-25T21:20:04Z

@clawsweeper re-review

clawsweeper · 2026-05-25T21:20:07Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26420230102
Updated: 2026-05-25T21:32:31.951Z

clawsweeper · 2026-05-25T21:32:25Z

ClawSweeper PR egg

✨ Hatched: 🥚 common Gilded Patch Peep

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

Merged PRs are hatchable.
Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.
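
The three hatchability rules can be read as a single predicate. The sketch below is illustrative only: `PrSnapshot`, `liveLabels`, `durableRecordLabels`, and `isHatchable` are hypothetical names, and the split between live labels and durable-record labels is an assumption about how the rule for closed PRs would be checked.

```typescript
type PrState = "merged" | "open" | "closed-unmerged";

// Labels that make a PR hatchable, as listed in the rules above.
const HATCHABLE_LABELS = new Set<string>([
  "status: 👀 ready for maintainer look",
  "status: 🚀 automerge armed",
  "clawsweeper:automerge",
]);

interface PrSnapshot {
  state: PrState;
  liveLabels: string[]; // labels currently on the PR
  durableRecordLabels: string[]; // labels kept in the durable record (assumed shape)
}

function hasHatchableLabel(labels: string[]): boolean {
  return labels.some((label) => HATCHABLE_LABELS.has(label));
}

// Hypothetical predicate combining the three rules.
function isHatchable(pr: PrSnapshot): boolean {
  if (pr.state === "merged") return true;
  if (pr.state === "open") return hasHatchableLabel(pr.liveLabels);
  return hasHatchableLabel(pr.durableRecordLabels);
}

console.log(
  isHatchable({ state: "open", liveLabels: ["status: 🚀 automerge armed"], durableRecordLabels: [] }),
); // true
```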

Rarity: 🥚 common.
Trait: sniffs out flaky tests.
Image traits: location status garden; accessory commit compass; palette plum, gold, and soft gray; mood bright-eyed; pose pointing at a small proof artifact; shell starlit enamel shell; lighting subtle sparkle highlights; background small review tokens.
Copy: My PR egg hatched a 🥚 common Gilded Patch Peep in ClawSweeper.

What is this egg doing here?

Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

Takhoffman · 2026-05-26T02:43:31Z

@clawsweeper automerge

clawsweeper · 2026-05-26T02:43:33Z

🦞✅
ClawSweeper merged this PR after the passing review.

Source: clawsweeper[bot]
Feedback: structured ClawSweeper verdict: pass (sha=050c779cfa613efc14f6bc7713fcaedde27b0f7c)
Merge status: merged by ClawSweeper automerge
Merged at: 2026-05-26T03:09:27Z
Merge commit: 7fbca96a0cda

What merged:

The PR preserves provider-facing embedded-runner prompt errors when cleanup detects session takeover, keeps the takeover signal fatal for fallback, and adds focused regressions.
PR surface: Source +52, Tests +92. Total +144 across 5 files.
Reproducibility: yes. Source inspection shows current main can let cleanup takeover replace a prior prompt/p ... rror and can normalize a provider-looking takeover wrapper before fallback sees it as coordination failure.

Automerge notes:

PR branch already contained follow-up commit before automerge: fix(embedded-runner): preserve takeover during fallback
PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8405…

The automerge loop is complete.

Automerge progress:

2026-05-26 02:43:31 UTC review queued e7d9d8cafeb2 (queued)

2026-05-26 02:44:21 UTC repair queued e7d9d8cafeb2 (autonomous) Run: https://github.com/openclaw/clawsweeper/actions/runs/26429268020

2026-05-26 02:47:00 UTC repair started (running) in 0s Run: https://github.com/openclaw/clawsweeper/actions/runs/26429268020 automerge-openclaw-openclaw-84056

2026-05-26 02:47:16 UTC validation plan (passed) in 16s Run: https://github.com/openclaw/clawsweeper/actions/runs/26429268020 pnpm check:changed; pnpm lint; pnpm check:test-types

2026-05-26 02:47:29 UTC Codex write preflight (passed) in 29s Run: https://github.com/openclaw/clawsweeper/actions/runs/26429268020 danger-full-access

2026-05-26 02:55:08 UTC Codex edit 1 50fa573b9802 (complete) in 8m 8s Run: https://github.com/openclaw/clawsweeper/actions/runs/26429268020 exit 0

2026-05-26 02:58:44 UTC validation and review 1 050c779cfa61 (base moved) in 11m 44s Run: https://github.com/openclaw/clawsweeper/actions/runs/26429268020 rebased

2026-05-26 02:59:12 UTC repair finished 050c779cfa61 (opened) in 12m 12s Run: https://github.com/openclaw/clawsweeper/actions/runs/26429268020 open_fix_pr

2026-05-26 03:09:15 UTC review passed 050c779cfa61 (structured ClawSweeper verdict: pass (sha=050c779cfa613efc14f6bc7713fcaedde27b0...)

2026-05-26 03:09:29 UTC merged 050c779cfa61 (merged by ClawSweeper automerge)

…penclaw#84321)

Summary:
- The PR preserves provider-facing embedded-runner prompt errors when cleanup detects session takeover, keeps the takeover signal fatal for fallback, and adds focused regressions.
- PR surface: Source +52, Tests +92. Total +144 across 5 files.
- Reproducibility: yes. Source inspection shows current main can let cleanup takeover replace a prior prompt/p ... rror and can normalize a provider-looking takeover wrapper before fallback sees it as coordination failure.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(embedded-runner): preserve takeover during fallback
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8405…

Validation:
- ClawSweeper review passed for head 050c779.
- Required merge gates passed before the squash merge.

Prepared head SHA: 050c779
Review: openclaw#84321 (comment)
Co-authored-by: abnershang <abner.shang@gmail.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>

@jalehman

* fix(agents): skip fallback for session coordination errors
  Preserve provider fallback metadata when session coordination errors are nested under provider failures.
  Co-authored-by: luyao618 <364939526@qq.com>
  (cherry picked from commit 6a5a135)

* fix(agents): tolerate in-process session writes during prompt release (openclaw#84250)
  Merged via squash.
  Prepared head SHA: 33f88fe
  Co-authored-by: tianxiaochannel-oss88 <272340815+tianxiaochannel-oss88@users.noreply.github.com>
  Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
  Reviewed-by: @jalehman
  (cherry picked from commit 1b77145)

* fix(agents): bound embedded compaction write locks
  Fixes the embedded attempt session write-lock watchdog so the fallback max hold time follows the resolved compaction timeout plus the existing lock grace window, instead of inheriting the full run timeout.
  Adds regression coverage for the helper and settled-compaction lock lifecycle, plus a changelog entry thanking @luoyanglang.
  Verification:
  - `pnpm test src/agents/session-write-lock.test.ts src/agents/pi-embedded-runner/run/attempt.test.ts src/agents/pi-embedded-runner/run/attempt.session-lock.test.ts`
  - `pnpm check:changed` via Blacksmith Testbox `tbx_01ks8b6vn8se5cg1dfn3te3g47` / https://github.com/openclaw/openclaw/actions/runs/26301988670
  - Autoreview clean: `/Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode branch --base origin/main`
  - PR CI green on `79e8c5f1a637981d263c0268bf5666967ff4e778`: https://github.com/openclaw/openclaw/actions/runs/26302152844 and https://github.com/openclaw/openclaw/actions/runs/26302152798
  Co-authored-by: luoyanglang <hanwanlonga@gmail.com>
  (cherry picked from commit 46de078)

* fix(session-lock): enforce maxHoldMs in shouldReclaim during lock acquisition (openclaw#85764)
  * fix(session-lock): enforce maxHoldMs in shouldReclaim during lock acquisition
    - Adds optional maxHoldMs parameter to inspectLockPayload
    - Inspect now marks locks as stale when held longer than maxHoldMs
    - Passes maxHoldMs through inspectLockPayloadForSession
    - acquireSessionWriteLock's shouldReclaim callback now passes maxHoldMs
    This ensures that when a live process holds a lock for longer than maxHoldMs (default 5min), other processes can reclaim it during acquisition, matching the watchdog's existing enforcement. Previously shouldReclaim only used staleMs (30min default), meaning a lock held for 10+ minutes by a live PID would never be reclaimable, causing 60s timeout failures and gateway freezes.
    Closes openclaw#85762
  * fix(session-lock): add dead-PID fast-path before retry loop
    Adds a fast-path check at the top of acquireSessionWriteLock: if the lock file's owner PID is dead, remove it immediately before entering the retry loop. This saves up to timeoutMs (60s) of futile waiting when the previous lock holder has died. The shouldReclaim callback already handles this case, but only iteratively through the retry loop. The fast-path eliminates that unnecessary delay.
  * fix(session-lock): enforce max hold during acquisition
  * fix(session-lock): revalidate max hold safely
  * fix(session-lock): honor holder max-hold policy
  * fix(session-lock): keep cleanup from reclaiming live holders
  * fix(session-lock): remove stale locks only when unchanged
  * fix(session-lock): skip self-held max-hold reclaim
  * fix(ci): refresh gateway protocol checks
  ---------
  Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
  Co-authored-by: Peter Steinberger <steipete@gmail.com>
  (cherry picked from commit a1eb765)

* fix(embedded-runner): preserve provider errors on cleanup takeover (openclaw#84321)
  Summary:
  - The PR preserves provider-facing embedded-runner prompt errors when cleanup detects session takeover, keeps the takeover signal fatal for fallback, and adds focused regressions.
  - PR surface: Source +52, Tests +92. Total +144 across 5 files.
  - Reproducibility: yes. Source inspection shows current main can let cleanup takeover replace a prior prompt/p ... rror and can normalize a provider-looking takeover wrapper before fallback sees it as coordination failure.
  Automerge notes:
  - PR branch already contained follow-up commit before automerge: fix(embedded-runner): preserve takeover during fallback
  - PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8405…
  Validation:
  - ClawSweeper review passed for head 050c779.
  - Required merge gates passed before the squash merge.
  Prepared head SHA: 050c779
  Review: openclaw#84321 (comment)
  Co-authored-by: abnershang <abner.shang@gmail.com>
  Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
  Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
  Approved-by: takhoffman
  Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
  (cherry picked from commit 7fbca96)

* fix(agents): release embedded-attempt session lock on every exit path (openclaw#86427)
  * fix(agents): release embedded-attempt session lock on every exit path
    The embedded run controller acquires its session write lock eagerly at creation and released it only inside the post-run cleanup block. An exception thrown in post-prompt processing skipped that block, so the lock leaked to the live gateway process until the watchdog reclaimed it and later requests to the session failed with SessionWriteLockTimeoutError.
    Add an idempotent dispose() to the lock controller and call it from the run's outer finally so the eagerly-held lock is released on every exit path. Normal/aborted/timed-out runs still hand the lock to acquireForCleanup first, so dispose() is a no-op then (no double release).
    Fixes openclaw#86014
  * fix: keep session lock teardown comment lean
  * docs(changelog): note embedded session lock fix
  ---------
  Co-authored-by: Peter Steinberger <steipete@gmail.com>
  (cherry picked from commit 32ddfc2)

* fix(agents): fence yield abort lock release
  (cherry picked from commit 0fe7479)

* fix(agents): memoize session lock owner args
  Memoize owner process argv lookups per PID during `cleanStaleLockFiles`, and yield between lock entries so startup cleanup does not monopolize the event loop while inspecting many session locks.
  This keeps lock classification semantics unchanged while avoiding repeated synchronous process-args reads for lock clusters owned by the same PID, especially the Windows PowerShell path.
  Fixes openclaw#86509.
  Verification:
  - `git diff --check origin/main...HEAD`
  - focused TSX harness against the current-main merge result: `session-lock memo regression harness passed`
  Thanks @openperf.
  Co-authored-by: openperf <16864032@qq.com>
  (cherry picked from commit c430fcd)

* fix(diagnostics): recover orphaned session activity
  Recover idle queued sessions whose diagnostic activity retained stale ownerless model or tool calls by classifying them as recoverable session.stuck after the usual recovery gates.
  Yield the event loop before stale session-lock process inspection so sync process lookup cannot monopolize lock contention paths.
  Docs now describe the widened session.stuck telemetry contract for recoverable stale bookkeeping, including ownerless activity.
  Thanks @samuelsoaress.
  Refs openclaw#84903.
  Co-authored-by: samuelsoaress <samuelsoares177778@gmail.com>
  (cherry picked from commit 286964c)

* [FORK][openclaw#86584] gate owned-write publish on pre-append fingerprint (fixes openclaw#86572)
  Carries unmerged upstream PR openclaw#86584 (HEAD d79a3b4) onto the boon 5.18 base as the same-lane EmbeddedAttemptSessionTakeoverError fence fix for long cron turns. Fails closed: an external mutation before pi's append fails the trust gate and still trips the fence (verified by the PR's 303-line test suite incl. the mixed-interleave negative test).
  Backfills base symbols openclaw#86584 assumes (introduced upstream between 5.18 and the PR base, not carried by the 9 merged race-fix picks):
  - session-lock.ts: MAX_BENIGN_SESSION_FENCE_{ADVANCE,REWRITE,REWRITE_RESULT}_BYTES, MAX_SAFE_FILE_OFFSET, TRANSCRIPT_ONLY_OPENCLAW_ASSISTANT_MODELS, SessionFileFenceSnapshot type, fenceSnapshot state var, ActiveWriteLockState type + activeWriteLock store fix (reuse nested writes via {active:true}), node:util + string-normalization imports.
  - transcript-append.ts: wrap appendSessionTranscriptMessage in runWithOwnedSessionTranscriptWriteLock so low-level appends acquire the owned-context lock.
  - test import fixes (appendSessionTranscriptMessage, withOwned/bindOwned, __testing).
  Drop when upstream merges openclaw#86584.
  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* [FORK][openclaw#86584] wire owned-transcript-write context + typecheck cleanup
  CRITICAL: wrap promptActiveSession in withOwnedSessionTranscriptWrites and bind onBlockReply/onBlockReplyFlush to the owned context in attempt.ts. Without this, pi's own transcript appends during a prompt are NOT recorded as owned, so the fence trips on them (the exact takeover the backport is meant to prevent). This wiring is an intermediate-base feature (between 5.18 and openclaw#84250's base) the merged picks didn't carry. Tests passed before only because they set the context manually.
  Also: add releaseHeldLockForAbort to the controller type; drop incidental non-fence suppressAssistantErrorPersistence passes; remove dead async benign-rewrite cluster (sessionFence{Advance,Rewrite}IsBenign + readAppendedSessionFileText + lineMatchesLinearTranscriptMigration + helpers). Our openclaw#84250-based assertSessionFileFence uses the sync owned-write path, so the async benign-detection variants are unreachable.
  tsgo core: 0 errors. 384 tests pass.
  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* [FORK][openclaw#86584] address codex review: prefix-validate benign advance + preserve provider error
  Finding 2 (masking gap, P2): sessionFenceAdvanceIsBenignSync only validated the APPENDED bytes, so a writer that rewrote the existing prefix AND appended a benign delivery-mirror/gateway-injected line could be laundered as an owned advance, masking a genuine external takeover (silent message loss). Now fail closed unless the current prefix is byte-identical to the trusted readSessionFileFenceSnapshot text (readSessionFilePrefixSync); absent snapshot text => not benign.
  Finding 1 (provider-error masking, P2): wrappedStreamFn's finally let a reacquireAfterPrompt() takeover error mask the original provider error when the stream itself threw. Now only surface the reacquire error when the stream succeeded; otherwise preserve the original failure.
  tsgo core: 0 errors. 384 tests pass (benign-advance acceptance + external-mutation rejection both green).
  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore(release): 2026.5.18-boon.1 - session-takeover hardening (boon fleet build)
  Version bump + CHANGELOG for the fork build. Also fixes a backport test-import gap: attempt.test.ts referenced `attemptTesting` (the __testing export) without importing it.
  Full project typecheck (tsgo -b tsconfig.projects.json): 0 errors.
  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(ci): no-unsafe-finally in wrappedStreamFn + drop collateral protocol/test churn
  - wrappedStreamFn: restructure provider-error-preservation without a throw inside finally (oxlint no-unsafe-finally). Same semantics: always reacquire; prefer the original stream error over a reacquire takeover error; surface reacquire error only when the stream succeeded.
  - Revert src/gateway/server-methods/agent.test.ts + GatewayModels.swift to the 5.18 baseline: the openclaw#85764 cherry-pick conflict-resolution had pulled in openclaw#85256-era internal-session-effect tests + protocol fields whose implementation isn't in this backport, breaking checks-node-agentic-gateway-methods + checks-fast-bundled-protocol.
  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: remove vestigial onAssistantErrorMessagePersisted option decls
  Address cubic P2 review (PR #2): the option was declared on the guard and guard-wrapper option types but never forwarded or invoked, so any provided callback was silently ignored. The companion error-suppression feature (suppressAssistantErrorPersistence + the agent-runner/followup caller chain) is deliberately scoped OUT of this 5.18 backport, so the decls were dead plumbing left over from a cherry-pick. Remove them to keep the option surface honest; the load-bearing beforeMessagePersist fence checkpoint (openclaw#86572) is retained.
  Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------
Co-authored-by: Yao <364939526@qq.com>
Co-authored-by: xiaotian <tianxiaochannel@gmail.com>
Co-authored-by: 狼哥 <hanwanlonga@gmail.com>
Co-authored-by: njuboy <njuboy11@gmail.com>
Co-authored-by: njuboy11 <njuboy11@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: abnershang <abner.shang@gmail.com>
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Co-authored-by: Chunyue Wang <80630709+openperf@users.noreply.github.com>
Co-authored-by: openperf <16864032@qq.com>
Co-authored-by: Samuel Soares da Silva <samuelsoares177778@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
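
The no-unsafe-finally restructuring described in the commit message above (always reacquire, prefer the original stream error, surface the reacquire error only on success) can be sketched without a `throw` inside `finally`. This is a simplified synchronous illustration; `runThenReacquire` and its parameters are hypothetical names, and the real `wrappedStreamFn` is not reproduced here.

```typescript
// Result of the main step, captured so the reacquire step always runs next.
type Outcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

// Hypothetical helper: run the stream step, always reacquire afterwards, and
// decide which error to surface only after both steps have finished.
function runThenReacquire<T>(run: () => T, reacquire: () => void): T {
  let outcome: Outcome<T>;
  try {
    outcome = { ok: true, value: run() };
  } catch (error) {
    outcome = { ok: false, error };
  }

  let reacquireFailed = false;
  let reacquireError: unknown;
  try {
    reacquire();
  } catch (error) {
    reacquireFailed = true;
    reacquireError = error;
  }

  if (!outcome.ok) throw outcome.error; // the original failure wins
  if (reacquireFailed) throw reacquireError; // surfaced only when the run succeeded
  return outcome.value;
}

try {
  runThenReacquire(
    () => {
      throw new Error("provider failed");
    },
    () => {
      throw new Error("session takeover on reacquire");
    },
  );
} catch (err) {
  console.log((err as Error).message); // "provider failed"
}
```

Capturing both results before throwing keeps the precedence explicit, which is what a `throw` inside `finally` would otherwise obscure by silently replacing the in-flight error.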

clawsweeper Bot mentioned this pull request May 19, 2026

fix(embedded-runner): preserve provider errors on cleanup takeover #84056

Closed

openclaw-barnacle Bot removed the label "proof: supplied" (External PR includes structured after-fix real behavior proof) May 19, 2026

clawsweeper Bot removed the label "clawsweeper:human-review" (Needs maintainer review before ClawSweeper can continue) May 26, 2026

abnershang and others added 3 commits May 26, 2026 02:58

fix(embedded-runner): preserve provider errors on cleanup takeover

83dd489

fix(embedded-runner): preserve takeover during fallback

9be6184

fix(clawsweeper): address review for automerge-openclaw-openclaw-8405…

050c779

…6 (1)

clawsweeper Bot force-pushed the clawsweeper/automerge-openclaw-openclaw-84056 branch from e7d9d8c to 050c779 May 26, 2026 02:58

clawsweeper Bot added the labels "proof: supplied" (External PR includes structured after-fix real behavior proof) and "status: 👀 ready for maintainer look" (ClawSweeper has no concrete contributor-facing blocker left for this PR) May 26, 2026

openclaw-barnacle Bot removed the label "proof: supplied" (External PR includes structured after-fix real behavior proof) May 26, 2026

clawsweeper Bot removed the label "status: 👀 ready for maintainer look" (ClawSweeper has no concrete contributor-facing blocker left for this PR) May 26, 2026

clawsweeper Bot merged commit 7fbca96 into main May 26, 2026
115 of 120 checks passed

clawsweeper Bot deleted the clawsweeper/automerge-openclaw-openclaw-84056 branch May 26, 2026 03:09

github-actions Bot mentioned this pull request May 26, 2026

📡 Upstream Digest — 2026-05-26 07:37 UTC curtismercier/openclaw-mods#949

Open

solosage1 mentioned this pull request May 26, 2026

[Bug]: EmbeddedAttemptSessionTakeoverError during Discord runs: session file changed while embedded prompt lock was released #86508

Open

augusteo mentioned this pull request May 29, 2026

2026.5.18-boon.1 — session-takeover hardening getboon/openclaw#2

Merged


fix(embedded-runner): preserve provider errors on cleanup takeover #84321
clawsweeper[bot] merged 3 commits into main from clawsweeper/automerge-openclaw-openclaw-84056
