Gateway restart can silently abort an in-flight Discord turn, with no automatic recovery message to the user #69249
Closed
Labels
P1 - High-priority user-facing bug, regression, or broken workflow.
clawsweeper:fix-shape-clear - ClawSweeper found a clear likely implementation shape for this issue.
clawsweeper:queueable-fix - ClawSweeper marked this issue as an existing queue_fix_pr work candidate.
clawsweeper:source-repro - ClawSweeper found a high-confidence source-level issue reproduction.
impact:message-loss - Channel message delivery can be lost, duplicated, or misrouted.
impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
issue-rating: 🦞 diamond lobster - Very strong issue quality with high-confidence source-level or clear reproduction.
Summary
When the OpenClaw gateway restarts during an active Discord-backed turn, the in-flight reply can abort without any visible recovery message reaching the channel. From the user's perspective, the assistant appears to ghost them until they manually ping again.
Environment
OpenClaw 2026.4.15 (041266a)
agent:main:discord:*)
openai-codex/gpt-5.4
Expected behavior
If a gateway restart interrupts an in-flight Discord turn, OpenClaw should do one of these automatically:
The user should not have to manually ping the bot just to learn that the previous turn died.
Actual behavior
Reproduction outline
I do not have a minimal isolated repo repro yet, but the behavior appears reproducible with this sequence:
Evidence from the incident
In the affected session transcript, the interrupted run recorded a prompt error with an aborted operation, and the gateway restart timestamp matched the abort window.
Representative evidence from the local incident:
openclaw:prompt-error
This operation was aborted
I am intentionally not pasting private local paths or full local logs here, but I can provide redacted excerpts if maintainers want them.
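For anyone who wants to check the same correlation on their own transcripts, here is a minimal sketch of the comparison. It is not OpenClaw code: the function name, the idea of passing in plain timestamps, and the 30-second default window are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def aborts_in_restart_window(
    abort_times: Iterable[datetime],
    restart_at: datetime,
    window: timedelta = timedelta(seconds=30),  # assumed tolerance, tune as needed
) -> List[datetime]:
    """Return the abort timestamps that fall within `window` of the restart.

    Hypothetical helper: it only compares timestamps and knows nothing
    about how OpenClaw stores transcripts or restart events.
    """
    return [t for t in abort_times if abs(t - restart_at) <= window]
```

An abort that lands inside the window is only circumstantial evidence that the restart caused it, so the result is best read as a list of candidates to inspect by hand.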
Why this matters
A gateway restart is operationally recoverable, but the current behavior makes it look like the assistant silently stopped responding. That turns an internal restart into a user-facing ghosting incident.
Suspected area
This looks like a recovery gap after an interrupted in-flight turn, not just a Discord transport issue.
The failure mode seems to be:
Suggested fixes
One or more of these would solve the user-facing problem:
Additional note
I added a local workaround that scans for aborted turns with no visible follow-up and posts a recovery message. That mitigates the symptom locally, but it feels like the platform should handle this natively.
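The workaround has roughly the following shape. This is a simplified sketch rather than the actual script: the `Turn` and `VisibleMessage` records, their field names, and the injected `post_message` callable are hypothetical stand-ins, not OpenClaw APIs.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Turn:
    """Hypothetical transcript record; field names are illustrative."""
    session_key: str
    status: str          # assumed values: "completed" or "aborted"
    ended_at: datetime


@dataclass(frozen=True)
class VisibleMessage:
    """A message the user could actually see in the channel."""
    session_key: str
    sent_at: datetime


def find_unrecovered_aborts(
    turns: Iterable[Turn], messages: Iterable[VisibleMessage]
) -> List[Turn]:
    """Return aborted turns with no visible message at or after the abort."""
    # Track the most recent visible message per session.
    last_visible: Dict[str, datetime] = {}
    for m in messages:
        prev = last_visible.get(m.session_key)
        if prev is None or m.sent_at > prev:
            last_visible[m.session_key] = m.sent_at

    unrecovered = []
    for t in turns:
        if t.status != "aborted":
            continue
        last = last_visible.get(t.session_key)
        if last is None or last < t.ended_at:
            unrecovered.append(t)
    return unrecovered


def post_recovery_messages(
    turns: Iterable[Turn],
    messages: Iterable[VisibleMessage],
    post_message: Callable[[str, str], None],  # (session_key, text), supplied by caller
) -> int:
    """Post one recovery notice per unrecovered abort; return how many were sent."""
    pending = find_unrecovered_aborts(turns, messages)
    for t in pending:
        post_message(t.session_key, "My previous reply was interrupted. Please resend or ask me to retry.")
    return len(pending)
```

A scan like this has to run repeatedly, so a real version would also need to record which turns have already been notified, otherwise a failed or delayed post could produce duplicate notices. That bookkeeping is part of why native handling in the platform seems preferable.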