Telegram isolated ingress spool can remain blocked by stale .processing claim after gateway recreate #84674
Open
Labels
- `P1`: High-priority user-facing bug, regression, or broken workflow.
- `clawsweeper:fix-shape-clear`: ClawSweeper found a clear likely implementation shape for this issue.
- `clawsweeper:queueable-fix`: ClawSweeper marked this issue as an existing queue_fix_pr work candidate.
- `clawsweeper:source-repro`: ClawSweeper found a high-confidence source-level issue reproduction.
- `impact:message-loss`: Channel message delivery can be lost, duplicated, or misrouted.
- `impact:session-state`: Session, memory, transcript, context, or agent state can drift or corrupt.
- `issue-rating: 🦞 diamond lobster`: Very strong issue quality with high-confidence source-level or clear reproduction.
Summary
A Telegram isolated polling ingress spool entry can remain stuck as `*.json.processing` and block all later Telegram updates in the same spool. In my case the stale processing claim survived a gateway recreate and was not recovered automatically. Moving the stuck processing file aside manually allowed the next pending Telegram update to be claimed and processed.
This presents as Telegram showing typing / accepting messages, but no replies being delivered. It can look like general gateway slowness or Codex latency, but the concrete failure was a blocked Telegram spool.
Environment
- `2026.5.18`
- `node dist/index.js gateway --bind lan --port 18789`
- `openai-codex/gpt-5.5`
Observed state
The Telegram spool directory contained:
The stuck file had a claim from the previous gateway process:
```json
{
  "updateId": 169729588,
  "claim": {
    "processId": "8:a2d7a994-b8c3-4a74-85ab-1c33da3d4490",
    "processPid": 8,
    "claimedAt": 1779301818735
  }
}
```
That `*.json.processing` file had been claimed at `2026-05-20T18:30:18Z` and remained present long after the turn stopped making progress. Later updates stayed queued as plain `.json` files.
A gateway recreate did not recover this claim. After recreate, the file was still present as `.json.processing`, while later updates remained pending.
Manual recovery that worked
I did not delete the file. I copied and renamed it aside:
After this, OpenClaw immediately claimed the next pending update:
That turn eventually completed, and the later pending update was processed as well. The affected Telegram topic session returned to `done`, and no plain pending `.json` files remained.
Important diagnostic detail
I initially suspected CPU or Docker overhead. A 120s Node CPU profile during the blocked period showed the gateway mostly idle, roughly 93% idle samples. This made the spool state the useful signal: Telegram ingress had accepted messages, but processing was blocked behind the stale `.processing` file.
A separate run after unblocking did consume CPU and eventually completed, so there seem to be two distinguishable cases:
- … `.processing` claim blocks the Telegram spool even though gateway is not busy;
Expected behavior
On gateway startup / polling ingress startup, OpenClaw should recover stale Telegram spool claims when the recorded claimant process is no longer valid for the current gateway instance, or when the claim is older than the configured stale timeout.
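The staleness rule described above can be illustrated with a minimal TypeScript sketch. It is not OpenClaw's implementation: the `SpoolClaim` shape is modeled on the JSON in this report, while `GatewayIdentity`, `staleReason`, and the 10-minute timeout are hypothetical names and values chosen for illustration. The sketch compares the full `processId` rather than `processPid`, on the assumption that a recreated container may well reuse the same PID.

```typescript
// Claim metadata, modeled on the JSON shown in this report.
interface SpoolClaim {
  processId: string;
  processPid: number;
  claimedAt: number; // epoch milliseconds
}

// Hypothetical description of the running gateway; not an OpenClaw API.
interface GatewayIdentity {
  processId: string;
}

type StaleReason = "foreign-process" | "timed-out" | null;

// Returns why a claim should be treated as stale, or null if it still looks live.
function staleReason(
  claim: SpoolClaim,
  current: GatewayIdentity,
  nowMs: number,
  staleTimeoutMs: number,
): StaleReason {
  // A claim written by a different gateway instance can never complete.
  if (claim.processId !== current.processId) return "foreign-process";
  // A claim by this instance that is older than the timeout is presumed stuck.
  if (nowMs - claim.claimedAt > staleTimeoutMs) return "timed-out";
  return null;
}

// The claim from this report, checked against a hypothetical new gateway instance.
const reportedClaim: SpoolClaim = {
  processId: "8:a2d7a994-b8c3-4a74-85ab-1c33da3d4490",
  processPid: 8,
  claimedAt: 1779301818735,
};
const assumedTimeoutMs = 10 * 60 * 1000; // assumption: 10 minutes

console.log(new Date(reportedClaim.claimedAt).toISOString()); // "2026-05-20T18:30:18.735Z"
console.log(
  staleReason(
    reportedClaim,
    { processId: "8:hypothetical-new-instance" },
    reportedClaim.claimedAt + 60 * 1000,
    assumedTimeoutMs,
  ),
); // "foreign-process"
```

Checking identity before age means a claim left by a previous gateway instance is flagged immediately after a recreate, without waiting for the timeout to expire.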
At minimum, a stale `*.json.processing` file should not indefinitely block later Telegram updates after a gateway recreate.
Actual behavior
The stale `.processing` file survived gateway recreate and continued to block later updates until manually moved aside.
Suggested fix
For the isolated Telegram ingress spool recovery path:
- … `*.json.processing` on provider startup and/or periodically;
- … `claim` metadata;
- … `processId` does not match the current gateway process identity, or `processPid` is not the active gateway process for this container, or `claimedAt` exceeds the configured processing timeout;
- … `.processing.stale-<timestamp>` or return them to queue if safe;
- … `.json` updates forever.
This may be related to earlier Telegram spool timeout recovery work, but this specific case is about a `*.json.processing` claim surviving gateway recreate and not being recovered automatically.
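One way the suggested recovery pass might look is sketched below in TypeScript, using only Node's built-in `fs`, `path`, and `os` modules. Everything specific here is an assumption for illustration: `recoverStaleClaims` is a hypothetical helper and not an existing OpenClaw function, the spool file names in the demo are invented, and the timeout value is arbitrary. The sketch quarantines stale claims by renaming them rather than deleting or requeueing them, which mirrors the manual recovery described in this report.

```typescript
import { mkdtempSync, readdirSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: renames stale *.json.processing files to
// *.json.processing.stale-<timestamp> and returns the new paths.
function recoverStaleClaims(
  spoolDir: string,
  currentProcessId: string,
  nowMs: number,
  staleTimeoutMs: number,
): string[] {
  const moved: string[] = [];
  for (const name of readdirSync(spoolDir)) {
    if (!name.endsWith(".json.processing")) continue;
    const path = join(spoolDir, name);
    let stale = true; // unreadable or malformed claims are treated as stale
    try {
      const claim = JSON.parse(readFileSync(path, "utf8")).claim;
      stale =
        typeof claim?.processId !== "string" ||
        typeof claim?.claimedAt !== "number" ||
        claim.processId !== currentProcessId ||
        nowMs - claim.claimedAt > staleTimeoutMs;
    } catch {
      // Keep stale = true: a claim that cannot be parsed cannot be resumed.
    }
    if (!stale) continue;
    const target = `${path}.stale-${nowMs}`;
    renameSync(path, target); // the file is kept for later inspection
    moved.push(target);
  }
  return moved;
}

// Demo in a throwaway directory; the file names are invented for illustration.
const demoSpoolDir = mkdtempSync(join(tmpdir(), "spool-demo-"));
writeFileSync(
  join(demoSpoolDir, "169729588.json.processing"),
  JSON.stringify({
    updateId: 169729588,
    claim: {
      processId: "8:a2d7a994-b8c3-4a74-85ab-1c33da3d4490",
      processPid: 8,
      claimedAt: 1779301818735,
    },
  }),
);
writeFileSync(join(demoSpoolDir, "169729589.json"), JSON.stringify({ updateId: 169729589 }));

const demoNowMs = 1779301818735 + 60 * 60 * 1000; // one hour after the claim
const movedAside = recoverStaleClaims(
  demoSpoolDir,
  "8:hypothetical-new-instance",
  demoNowMs,
  10 * 60 * 1000, // assumption: 10-minute processing timeout
);
console.log(movedAside.length, readdirSync(demoSpoolDir).sort());
```

Because a renamed file no longer ends in `.json.processing`, running the pass again leaves it alone, so the same function could be called both at provider startup and on a periodic timer. Whether a stale update should instead be returned to the queue depends on whether reprocessing it is safe, which this sketch does not attempt to decide.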