
Fix Discord session recovery abort ownership#85100

Merged
joshavant merged 2 commits into main from fix/issue-84477-discord-recovery
May 21, 2026

Conversation

@joshavant

Contributor

Summary:

  • Register dispatch-owned reply operations before hooks/model work so Discord and bound ACP turns can be stopped before the embedded agent run starts.
  • Keep abort ownership available through ACP tail dispatch and final delivery, and suppress late hook/resolver dispatcher callbacks after abort.
  • Extend /stop to abort bound ACP target work plus the source dispatch lane, including queue/lane cleanup and stuck-session recovery coverage.
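The first bullet can be pictured with a minimal sketch. It assumes a registry keyed by session key and an `AbortController` per operation; `ReplyOperationRegistry`, `dispatch`, `runHooks`, and `runModel` are hypothetical names for illustration, not the identifiers used in this PR.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the PR's actual code.
type ReplyOperation = { sessionKey: string; controller: AbortController };

class ReplyOperationRegistry {
  private readonly active = new Map<string, ReplyOperation>();

  // Called at the very start of dispatch, before any hook or model work.
  register(sessionKey: string): ReplyOperation {
    const op: ReplyOperation = { sessionKey, controller: new AbortController() };
    this.active.set(sessionKey, op);
    return op;
  }

  // Called by a stop command or stuck-session recovery.
  // Returns false when no operation is registered for the key.
  abort(sessionKey: string): boolean {
    const op = this.active.get(sessionKey);
    if (!op) return false;
    op.controller.abort();
    this.active.delete(sessionKey);
    return true;
  }

  // Removes the entry only if it still belongs to this operation.
  release(op: ReplyOperation): void {
    if (this.active.get(op.sessionKey) === op) this.active.delete(op.sessionKey);
  }
}

async function dispatch(
  registry: ReplyOperationRegistry,
  sessionKey: string,
  runHooks: (signal: AbortSignal) => Promise<void>,
  runModel: (signal: AbortSignal) => Promise<string>,
): Promise<{ handled: true; reply?: string }> {
  const op = registry.register(sessionKey); // registered before hooks run
  try {
    await runHooks(op.controller.signal);
    if (op.controller.signal.aborted) return { handled: true };
    const reply = await runModel(op.controller.signal);
    if (op.controller.signal.aborted) return { handled: true };
    return { handled: true, reply };
  } finally {
    registry.release(op);
  }
}
```

Because the operation exists before `runHooks` is awaited, a stop request that arrives during hook work finds something to abort, and the dispatch still resolves as handled without producing a reply.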

Verification:

  • codex --dangerously-bypass-approvals-and-sandbox --sandbox danger-full-access review --uncommitted
  • node scripts/run-vitest.mjs src/auto-reply/reply/dispatch-from-config.acp-abort.test.ts src/auto-reply/reply/abort.test.ts
  • node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core.tsbuildinfo
  • node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo
  • git diff --check HEAD~1..HEAD

Behavior addressed: Discord/session-key recovery now has a source-keyed dispatch operation before hooks/model dispatch so /stop and stuck-session recovery can find and abort pre-run work; bound ACP routing remains target-keyed while abort ownership stays source-keyed.
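One way to read the source-keyed versus target-keyed split is sketched below. The `Binding` shape, `planDispatch`, `lanesToStop`, and the example session keys are assumptions made for illustration; the PR's real data structures are not shown in this conversation.

```typescript
// Hypothetical sketch: all names and key formats are illustrative assumptions.
type Binding = { sourceKey: string; targetKey: string };

// Routing follows the binding, while abort ownership always stays on the source.
function planDispatch(
  sourceKey: string,
  bindings: Binding[],
): { ownerKey: string; routeKey: string } {
  const bound = bindings.find((b) => b.sourceKey === sourceKey);
  return { ownerKey: sourceKey, routeKey: bound ? bound.targetKey : sourceKey };
}

// A stop issued from the source covers the bound target plus the source lane.
function lanesToStop(sourceKey: string, bindings: Binding[]): string[] {
  const { routeKey } = planDispatch(sourceKey, bindings);
  return routeKey === sourceKey ? [sourceKey] : [routeKey, sourceKey];
}
```

With a binding from `discord:chan` to `acp:session-9`, the dispatch would route to `acp:session-9` while the abortable operation stays registered under `discord:chan`, so a stop from the Discord side can reach both lanes.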

Real environment tested: Discord live QA and ACP live bind/stop scenarios were previously run on this branch during issue proof; AWS Crabbox attempts were made but provider capacity was unavailable for the earlier runs.

Exact steps or command run after this patch: focused local proof commands listed above; earlier live proof covered Discord canary/mention/native-help, Discord status/tool-only, ACP live bind/follow-up, and stop/recovery E2E.

Evidence after fix: focused ACP/abort Vitest passed 40 tests after the final rebase resolution; core and core-test tsgo both passed; final direct Codex review exited with no accepted findings.

Observed result after fix: aborted dispatches resolve as handled without queuing late tool/block/final replies, /stop does not self-abort its own command, and bound ACP source lanes remain abortable while ACP dispatch still routes to the target session.

What was not tested: a fresh AWS Crabbox proof after the final rebase resolution, because provider capacity was unavailable during the earlier AWS Crabbox attempts; CI will provide the broad hosted proof for this pushed branch.

Fixes #84477

@openclaw-barnacle openclaw-barnacle Bot added the size: XL and maintainer (Maintainer-authored PR) labels May 21, 2026
@clawsweeper

clawsweeper Bot commented May 21, 2026

Contributor

Codex review: needs maintainer review before merge.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR registers a dispatch-owned reply operation before hook/model dispatch, threads that abort ownership through ACP tail/final delivery paths, extends /stop to cover bound ACP source and target lanes, and adds focused abort/recovery tests plus a changelog entry.

Reproducibility: yes. The linked issue includes live Discord stuck-session logs, and current-main source inspection shows dispatch work reaches hooks/resolver without a dispatch-owned reply operation that /stop or recovery can abort before the embedded run is active.

PR rating
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Summary: The patch is broad but coherent, has targeted regression coverage, and includes usable real-behavior proof with an acknowledged final transport-proof gap.

Rank-up moves:

  • Confirm hosted CI on the exact head SHA before merge.
  • Run or attach a fresh live Discord/ACP stop proof after the final rebase if maintainers want stronger transport confidence.

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Sufficient (live_output): The PR body includes structured real-behavior proof for prior Discord live QA and ACP live bind/stop scenarios on the branch, plus focused post-patch local abort tests; it also discloses the missing fresh AWS Crabbox rerun.

Mantis proof suggestion
A real Discord-visible stop proof would materially improve confidence that no late replies leak after abort. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

visual task: verify Discord /stop against a delayed pre-run reply clears the source lane and no late tool/block/final replies appear.

Risk before merge

  • The fix deliberately changes abort ownership and late-reply suppression across Discord, ACP-bound sessions, queue cleanup, and final delivery, so a wrong source-vs-target decision could abort the wrong lane or suppress a legitimate reply.
  • The PR body says fresh AWS Crabbox proof was not available after the final rebase resolution; focused local tests are strong, but the exact live Discord/ACP transport path still needs maintainer confidence before landing.

Maintainer options:

  1. Run fresh transport proof before landing (recommended)
    Ask for or run a fresh live Discord/ACP /stop proof on the final head that shows the source lane aborts, the bound target still routes correctly, and no late replies leak.
  2. Accept focused-test coverage
    Maintainers can land with the current focused abort/recovery tests and PR-body live-proof note if they are comfortable owning the remaining transport-proof gap.
  3. Pause if ownership semantics are disputed
    If maintainers are not aligned on source-keyed abort ownership for bound ACP turns, pause this PR and split the ownership decision from the implementation patch.

Next step before merge
No narrow automated repair is indicated; the remaining action is maintainer review of the protected, high-impact session/message-delivery change plus final proof/CI.

Security
Cleared: The diff does not add dependencies, CI/workflow execution, package-resolution changes, secret handling, or broader permissions; no concrete security or supply-chain concern surfaced.

Review details

Best possible solution:

Land the source-keyed dispatch operation approach after maintainers confirm the exact head has green CI and enough Discord/ACP stop proof for source-vs-target ownership and late-reply suppression.

Do we have a high-confidence way to reproduce the issue?

Yes. The linked issue includes live Discord stuck-session logs, and current-main source inspection shows dispatch work reaches hooks/resolver without a dispatch-owned reply operation that /stop or recovery can abort before the embedded run is active.

Is this the best way to solve the issue?

Yes, with maintainer validation. Registering one reply operation before hook/model dispatch and threading it into runPreparedReply, ACP tail dispatch, and /stop is the narrow maintainable fix for the missing abort owner; the main remaining question is live transport proof on the final head.

Label changes:

  • add P1: The linked bug wedges Discord/ACP sessions and blocks /stop/recovery for real channel workflows, matching urgent session/message delivery impact.
  • add merge-risk: 🚨 message-delivery: The PR changes when late hook/resolver callbacks and final deliveries are suppressed after abort, which can affect whether users see or lose replies.
  • add merge-risk: 🚨 session-state: The PR changes active reply-operation ownership, session IDs, queue cleanup, and bound ACP source-vs-target lane handling.
  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes structured real-behavior proof for prior Discord live QA and ACP live bind/stop scenarios on the branch, plus focused post-patch local abort tests; it also discloses the missing fresh AWS Crabbox rerun.
  • add rating: 🐚 platinum hermit: Current PR rating is 🐚 platinum hermit because proof is 🐚 platinum hermit, patch quality is 🐚 platinum hermit, and the patch is broad but coherent, has targeted regression coverage, and includes usable real-behavior proof with an acknowledged final transport-proof gap.
  • add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes structured real-behavior proof for prior Discord live QA and ACP live bind/stop scenarios on the branch, plus focused post-patch local abort tests; it also discloses the missing fresh AWS Crabbox rerun.

Label justifications:

  • P1: The linked bug wedges Discord/ACP sessions and blocks /stop/recovery for real channel workflows, matching urgent session/message delivery impact.
  • merge-risk: 🚨 message-delivery: The PR changes when late hook/resolver callbacks and final deliveries are suppressed after abort, which can affect whether users see or lose replies.
  • merge-risk: 🚨 session-state: The PR changes active reply-operation ownership, session IDs, queue cleanup, and bound ACP source-vs-target lane handling.
  • rating: 🐚 platinum hermit: Current PR rating is 🐚 platinum hermit because proof is 🐚 platinum hermit, patch quality is 🐚 platinum hermit, and the patch is broad but coherent, has targeted regression coverage, and includes usable real-behavior proof with an acknowledged final transport-proof gap.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes structured real-behavior proof for prior Discord live QA and ACP live bind/stop scenarios on the branch, plus focused post-patch local abort tests; it also discloses the missing fresh AWS Crabbox rerun.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes structured real-behavior proof for prior Discord live QA and ACP live bind/stop scenarios on the branch, plus focused post-patch local abort tests; it also discloses the missing fresh AWS Crabbox rerun.

What I checked:

  • Protected label: The provided GitHub context lists the PR with the protected maintainer label, so conservative cleanup must keep it open for explicit maintainer handling.
  • Patch scope: The PR commit adds the abort ownership implementation and focused regression coverage across dispatch, abort handling, get-reply-run, stuck-session recovery, and changelog files. (af1dd6e49985)
  • Dispatch operation registration: The PR version creates a source/target-aware dispatch reply operation and an abort-aware dispatcher ahead of any before_dispatch, reply_dispatch, or resolver work. (src/auto-reply/reply/dispatch-from-config.ts:856, 0e56380838c7)
  • Late delivery suppression: The PR version wraps hook dispatchers and resolver callbacks with abort checks so late tool, block, and final replies do not enqueue after the operation aborts. (src/auto-reply/reply/dispatch-from-config.ts:917, 0e56380838c7)
  • Bound ACP stop handling: The abort path now resolves bound ACP targets, aborts both target and source lanes when appropriate, and avoids persisting source-conversation cutoff metadata onto unrelated ACP target sessions. (src/auto-reply/reply/abort.ts:339, 0e56380838c7)
  • Regression coverage: The new tests cover prompt abort, tail dispatch abort, late hook/resolver suppression, native command source-vs-target ownership, and pre-run stuck-session recovery cleanup. (src/auto-reply/reply/dispatch-from-config.acp-abort.test.ts:404, 0e56380838c7)
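The late delivery suppression item can be illustrated with a small wrapper. This is a sketch under assumptions: the `Dispatcher` shape and its method names are invented here to mirror the tool, block, and final reply kinds mentioned above, and do not reproduce the code at the cited lines.

```typescript
// Hypothetical sketch: the Dispatcher interface is an assumption for illustration.
type Dispatcher = {
  sendToolResult(text: string): boolean;
  sendBlockReply(text: string): boolean;
  sendFinalReply(text: string): boolean;
};

// Wraps every send so that nothing is enqueued once the signal has aborted.
// Each wrapped call returns false when the reply was suppressed.
function withAbortGuard(inner: Dispatcher, signal: AbortSignal): Dispatcher {
  const guard =
    (send: (text: string) => boolean) =>
    (text: string): boolean =>
      signal.aborted ? false : send(text);
  return {
    sendToolResult: guard((t) => inner.sendToolResult(t)),
    sendBlockReply: guard((t) => inner.sendBlockReply(t)),
    sendFinalReply: guard((t) => inner.sendFinalReply(t)),
  };
}
```

Checking the signal at call time, rather than once when the wrapper is built, is what would let a hook or resolver callback that fires after the abort be dropped instead of queued.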

Likely related people:

  • Sally O'Malley: Current-main blame for the central dispatch, abort, and reply-run registry lines resolves mostly to the grafted current-main commit that carries these auto-reply paths in this checkout. (role: introduced current-main behavior; confidence: medium; commits: e72f60192571; files: src/auto-reply/reply/dispatch-from-config.ts, src/auto-reply/reply/abort.ts, src/auto-reply/reply/reply-run-registry.ts)
  • clawsweeper[bot]: Recent current-main work changed dispatch diagnostics in the same dispatch-from-config path that this PR extends. (role: recent area contributor; confidence: medium; commits: 5955f354f74d; files: src/auto-reply/reply/dispatch-from-config.ts)
  • steipete: The bounded history search for tryFastAbortFromMessage shows prior abort-related fixes and test isolation work by Peter Steinberger in this area. (role: prior abort-path contributor; confidence: medium; commits: 5898304fa049, 9924627f49ca, e510042870cf; files: src/auto-reply/reply/abort.ts, src/auto-reply/reply/dispatch-from-config.ts)
  • vincentkoc: The bounded history search shows prior inbound dispatch startup and seam-extraction work touching the same auto-reply dispatch surface. (role: adjacent area contributor; confidence: medium; commits: 5369ea53bee3, 7308e72fac98; files: src/auto-reply/reply/abort.ts, src/auto-reply/reply/dispatch-from-config.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 26e64bda1485.

@clawsweeper clawsweeper Bot added labels May 21, 2026:

  • proof: sufficient: ClawSweeper judged the real behavior proof convincing.
  • rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.
  • P1: High-priority user-facing bug, regression, or broken workflow.
  • merge-risk: 🚨 message-delivery: 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.
  • merge-risk: 🚨 session-state: 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state.

@clawsweeper

clawsweeper Bot commented May 21, 2026

Contributor

ClawSweeper PR egg

✨ Hatched: 🌱 uncommon Frosted Review Wisp

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🌱 uncommon.
Trait: finds missing screenshots.
Image traits: location workflow harbor; accessory miniature diff map; palette sunrise gold and clean white; mood patient; pose nestled inside a glowing shell; shell soft speckled shell; lighting clean product lighting; background soft code-shaped tiles.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@joshavant joshavant merged commit 0ab1449 into main May 21, 2026
142 of 149 checks passed
@joshavant joshavant deleted the fix/issue-84477-discord-recovery branch May 21, 2026 21:34
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 24, 2026
* fix auto-reply abort ownership

* add changelog for openclaw#85100
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
* fix auto-reply abort ownership

* add changelog for openclaw#85100
galiniliev pushed a commit to galiniliev/openclaw that referenced this pull request May 25, 2026
* fix auto-reply abort ownership

* add changelog for openclaw#85100
SebTardif pushed a commit to SebTardif/openclaw that referenced this pull request May 26, 2026
* fix auto-reply abort ownership

* add changelog for openclaw#85100
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
* fix auto-reply abort ownership

* add changelog for openclaw#85100
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
* fix auto-reply abort ownership

* add changelog for openclaw#85100
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
* fix auto-reply abort ownership

* add changelog for openclaw#85100

Labels

  • maintainer: Maintainer-authored PR
  • merge-risk: 🚨 message-delivery: 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.
  • merge-risk: 🚨 session-state: 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state.
  • P1: High-priority user-facing bug, regression, or broken workflow.
  • proof: sufficient: ClawSweeper judged the real behavior proof convincing.
  • rating: 🐚 platinum hermit: Good normal PR readiness with ordinary maintainer review expected.
  • size: XL
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR.

Development

Successfully merging this pull request may close these issues.

Discord embedded-run prep wedge before strict-agentic, recovery skips sessionId=unknown lanes

1 participant