Telegram retry regex too strict: bare grammy Network request for 'X' failed! (no "after") never classified as recoverable for context: send, drops outbound messages #80362

@charlie-morrison

Summary

isRecoverableTelegramNetworkError(err, { context: "send" }) in extensions/telegram/src/network-errors.ts (compiled to dist/send-*.js) does not classify grammy's bare pre-connect form, Network request for 'sendChatAction' failed!, as recoverable. As a result, the per-request shouldRetry predicate in createTelegramRequestWithDiag returns false and the user-visible reply is silently dropped.

Root cause

The regex used to whitelist grammy network errors only matches the post-connect form:

// dist/send-BZcD66uc.js (compiled from extensions/telegram/src/network-errors.ts)
const GRAMMY_NETWORK_REQUEST_FAILED_AFTER_RE =
    /^network request(?:\s+for\s+["']?[^"']+["']?)?\s+failed\s+after\b.*[!.]?$/i;

…but grammy raises two distinct forms:

| Form | Example message | Meaning |
| --- | --- | --- |
| Pre-connect | Network request for 'sendChatAction' failed! | fetch aborted before bytes hit the wire (event-loop starvation timer, dead-pool socket synchronous error, DNS failure). Safe to retry: the message did NOT reach Telegram. |
| Post-connect | Network request for 'sendChatAction' failed after 5 attempts: ... | grammy's own retry budget exhausted. Risky to retry: the message MAY have been delivered. |

The current regex requires the literal failed\s+after\b, so the pre-connect form (the safer one to retry) is rejected, while the post-connect form (the riskier one) passes.
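The mismatch can be checked directly. The snippet below is a small sketch that uses the regex exactly as quoted in this issue against the two sample messages; the variable names are illustrative only and do not come from the openclaw source.

```typescript
// Regex copied verbatim from this issue (current behaviour).
const currentRe =
  /^network request(?:\s+for\s+["']?[^"']+["']?)?\s+failed\s+after\b.*[!.]?$/i;

// Sample messages for the two forms grammy raises.
const preConnectMsg = "Network request for 'sendChatAction' failed!";
const postConnectMsg =
  "Network request for 'sendChatAction' failed after 5 attempts: ...";

// The safer-to-retry form is rejected, the riskier one is accepted.
const preConnectMatches = currentRe.test(preConnectMsg);
const postConnectMatches = currentRe.test(postConnectMsg);

console.log({ preConnectMatches, postConnectMatches });
// → { preConnectMatches: false, postConnectMatches: true }
```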

For context: "send", allowMessageMatch=false is set deliberately to avoid retrying the post-connect form and producing duplicate user messages. That is correct for the post-connect case, but it also skips the snippet check against RECOVERABLE_MESSAGE_SNIPPETS (which DOES contain "network request"), leaving only the overly strict regex as the gatekeeper.
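To make the gating concrete, here is a deliberately simplified model of the message-based part of the classifier. Only the regex and the "network request" snippet come from this issue; the function name, its signature, and the order of checks are assumptions for illustration and are not the real isRecoverableTelegramNetworkError implementation.

```typescript
// Regex copied verbatim from this issue.
const grammyFailedAfterRe =
  /^network request(?:\s+for\s+["']?[^"']+["']?)?\s+failed\s+after\b.*[!.]?$/i;

// Assumption: stand-in for RECOVERABLE_MESSAGE_SNIPPETS. The issue only
// states that the real list contains "network request".
const recoverableSnippetsSketch: string[] = ["network request"];

// Hypothetical helper modelling the message checks only.
function isRecoverableByMessageSketch(
  message: string,
  allowMessageMatch: boolean,
): boolean {
  // The grammy regex is consulted regardless of allowMessageMatch.
  if (grammyFailedAfterRe.test(message)) return true;
  // With allowMessageMatch=false (context "send") the snippet check is skipped.
  if (!allowMessageMatch) return false;
  const lower = message.toLowerCase();
  return recoverableSnippetsSketch.some((snippet) => lower.includes(snippet));
}

const bareMessage = "Network request for 'sendChatAction' failed!";
const sendContextResult = isRecoverableByMessageSketch(bareMessage, false);
const snippetContextResult = isRecoverableByMessageSketch(bareMessage, true);
console.log({ sendContextResult, snippetContextResult });
// → { sendContextResult: false, snippetContextResult: true }
```

Under these assumptions the same message is recoverable when snippets are allowed and unrecoverable for "send", which is the behaviour reported here.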

Net effect: a single transient socket-pool failure during event-loop starvation drops the user's reply with no retry.

Reproduction

  1. Run openclaw gateway under CPU pressure on a low-resource machine (we hit it on a 2GB Asus E200HA with concurrent model calls).
  2. Observe periodic [diagnostic] liveness warning: ... eventLoopDelayMaxMs=15000 and accompanying [fetch-timeout] timer delayed 33208ms, likely event-loop starvation.
  3. Around the starvation window, watch grammy raise Network request for 'sendChatAction' failed! (no "after").
  4. Confirm [telegram] message processing failed: HttpError: Network request for 'sendMessage' failed! — single attempt, no retry.

Repro on our box (v2026.5.3-1): 2026-05-10 20:15:19-20:17:47, 39 send failures over about 148s, all single-attempt, no retry log lines.

Suggested fix

Make the post-connect after ... clause optional so that the regex matches both forms:

- const GRAMMY_NETWORK_REQUEST_FAILED_AFTER_RE =
-     /^network request(?:\s+for\s+["']?[^"']+["']?)?\s+failed\s+after\b.*[!.]?$/i;
+ const GRAMMY_NETWORK_REQUEST_FAILED_RE =
+     /^network request(?:\s+for\s+["']?[^"']+["']?)?\s+failed(?:\s+after\b.*)?[!.]?$/i;

Test cases (all should match):

Network request for 'sendChatAction' failed!              ← pre-connect, currently DROPS
Network request for 'sendChatAction' failed after 5 attempts.  ← post-connect, currently matches
Network request failed                                    ← bare, currently DROPS
Network request for getMe failed.                         ← currently DROPS

Negative cases (should NOT match):

telegram returned 401 unauthorized
ETIMEDOUT
completely unrelated error message

Verified locally — old regex matches 1/4 positive cases; new regex matches 4/4 with no false positives.
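The local check can be reproduced with a small harness along these lines. Both regexes and all sample messages are taken from this issue; the helper and variable names are illustrative.

```typescript
// Old and proposed regexes, copied from the diff in this issue.
const oldRe =
  /^network request(?:\s+for\s+["']?[^"']+["']?)?\s+failed\s+after\b.*[!.]?$/i;
const newRe =
  /^network request(?:\s+for\s+["']?[^"']+["']?)?\s+failed(?:\s+after\b.*)?[!.]?$/i;

const positives: string[] = [
  "Network request for 'sendChatAction' failed!",
  "Network request for 'sendChatAction' failed after 5 attempts.",
  "Network request failed",
  "Network request for getMe failed.",
];

const negatives: string[] = [
  "telegram returned 401 unauthorized",
  "ETIMEDOUT",
  "completely unrelated error message",
];

// Counts how many messages a regex accepts (no /g flag, so test() is stateless).
const countMatches = (re: RegExp, messages: string[]): number =>
  messages.filter((message) => re.test(message)).length;

const oldPositiveCount = countMatches(oldRe, positives);
const newPositiveCount = countMatches(newRe, positives);
const newFalsePositiveCount = countMatches(newRe, negatives);

console.log(`old: ${oldPositiveCount}/${positives.length}`); // old: 1/4
console.log(`new: ${newPositiveCount}/${positives.length}`); // new: 4/4
console.log(`new false positives: ${newFalsePositiveCount}`); // new false positives: 0
```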

Why this is safe

The pre-connect form occurs specifically when grammy's internal fetch aborts before any bytes are written, either via AbortSignal (timeout firing) or synchronously from a dead undici pool socket. In both cases the request never reached api.telegram.org, so a retry cannot create a duplicate.

Workaround

Patch dist/send-*.js post-install (we wired this into a post-openclaw-upgrade.sh hook to survive package updates).
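For anyone applying the same workaround, the text substitution itself might look like the sketch below. It is not our actual hook: the function name is hypothetical, it assumes the compiled bundle contains the regex text exactly as quoted in this issue, and reading and writing the dist/send-*.js file is left out. It swaps only the tail of the regex and keeps the existing constant name, so other references inside the bundle stay valid.

```typescript
// Tail of the old regex literal and its replacement, as plain text.
// String.raw keeps the backslashes exactly as they appear in the bundle.
const oldRegexTail = String.raw`\s+failed\s+after\b.*[!.]?$/i`;
const newRegexTail = String.raw`\s+failed(?:\s+after\b.*)?[!.]?$/i`;

// Hypothetical helper: returns the patched source and whether anything changed.
function patchCompiledRegex(source: string): { patched: string; changed: boolean } {
  if (!source.includes(oldRegexTail)) {
    // Already patched, or the upstream bundle changed: leave it untouched.
    return { patched: source, changed: false };
  }
  // split/join replaces every occurrence without regex or "$" substitution rules.
  return { patched: source.split(oldRegexTail).join(newRegexTail), changed: true };
}
```

Because the replacement text does not contain the old tail, running the patch a second time is a no-op, which matters for a hook that fires on every upgrade.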

Labels

- P1: High-priority user-facing bug, regression, or broken workflow.
- clawsweeper:linked-pr-open: ClawSweeper found an open linked pull request for this issue.
- clawsweeper:needs-product-decision: ClawSweeper marked this issue as needing a product or behavior decision.
- clawsweeper:no-new-fix-pr: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
- clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
- impact:message-loss: Channel message delivery can be lost, duplicated, or misrouted.
- issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
