
Telegram polling: self-sustaining 409 getUpdates conflict from probe + health-monitor re-triggering transport #50064

@RIPRODUCTIONS


Summary

The gateway enters a permanent 409 getUpdates conflict loop because the Telegram probe client and polling client create competing getUpdates connections. Once triggered, the loop is self-sustaining because the long-poll timeout (30s) equals the max retry interval (30s), so each retry's server-side connection overlaps with the next.

Reproduction

  1. Start the gateway with Telegram polling enabled (channels.telegram.enabled: true)
  2. Observe startup logs — two autoSelectFamily + dnsResultOrder log pairs appear (one from probe, one from polling)
  3. Within 30-60s, getUpdates conflict: 409 errors begin
  4. Errors continue indefinitely at ~30s intervals

Root Cause

Dual client creation:

Each call to resolveTelegramTransport() in fetch.ts creates a new dispatcher with no caching. The probe's TCP connection lingers in the socket pool and can race with the polling client's getUpdates call.

Self-sustaining loop:

  • createTelegramRunnerOptions() sets fetch.timeout: 30 (30s long-poll)
  • Grammy-runner's max retry interval is also 30s
  • When a 409 occurs, the retry fires a new getUpdates while the previous call's server-side connection is still alive (within the 30s window)
  • This creates a permanent overlap: each retry conflicts with the previous retry
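The overlap condition described above can be expressed as a toy timing model. This is only a sketch of the reasoning in this issue, not code from the gateway: the names `PollTiming` and `retryOverlapsPreviousPoll` are hypothetical, and it assumes the server holds a long-poll open for the full timeout and that the 409 response comes back almost immediately.

```typescript
// Toy model only: names are hypothetical and not taken from the gateway source.
interface PollTiming {
  longPollTimeoutS: number; // fetch.timeout sent with getUpdates
  retryDelayS: number;      // delay before the runner retries (steady state: the max retry interval)
}

// Returns true if the retry is sent while the previous getUpdates call
// may still be held open on the server side.
function retryOverlapsPreviousPoll(t: PollTiming): boolean {
  // Assumption: the previous call starts at t = 0 and is held until longPollTimeoutS.
  // Assumption: the 409 arrives right away, so the retry goes out at about retryDelayS.
  // Equality counts as overlap because the server-side release is not instantaneous.
  return t.retryDelayS <= t.longPollTimeoutS;
}

const reported = retryOverlapsPreviousPoll({ longPollTimeoutS: 30, retryDelayS: 30 });
const patched = retryOverlapsPreviousPoll({ longPollTimeoutS: 10, retryDelayS: 30 });
console.log({ reported, patched });
```

Under these assumptions the reported 30s/30s configuration overlaps on every retry, while a 10s long-poll timeout leaves a gap before the next retry.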

Health-monitor re-trigger:

  • Even if the initial 409 self-resolves, the health-monitor (300s interval) re-probes via probeTelegram(), creating a fresh competing connection that re-triggers the loop

Suggested Fixes

Fix A — Transport caching (primary):
Add a cache to resolveTelegramTransport() in fetch.ts, similar to the existing probeFetcherCache in probe.ts. This ensures probe and polling share the same dispatcher/connection pool.
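A minimal sketch of what such a cache could look like, assuming the transport can be keyed by its network options. The `TransportOptions` and `Transport` shapes, `createTransport()`, and `resolveTransportCached()` are placeholders for illustration; the real types in fetch.ts and the structure of probeFetcherCache may differ.

```typescript
// Hypothetical shapes; the real transport type in fetch.ts may differ.
interface TransportOptions {
  autoSelectFamily?: boolean;
  dnsResultOrder?: string;
  proxyUrl?: string;
}

interface Transport {
  id: number;
  options: TransportOptions;
}

// Stand-in for the code that builds a new dispatcher today.
let createdCount = 0;
function createTransport(options: TransportOptions): Transport {
  createdCount += 1;
  return { id: createdCount, options };
}

const transportCache = new Map<string, Transport>();

// Build a stable key from the options that affect the connection pool.
function cacheKey(options: TransportOptions): string {
  return JSON.stringify([
    options.autoSelectFamily ?? null,
    options.dnsResultOrder ?? null,
    options.proxyUrl ?? null,
  ]);
}

// Return the cached transport for these options, creating it on first use.
function resolveTransportCached(options: TransportOptions = {}): Transport {
  const key = cacheKey(options);
  let transport = transportCache.get(key);
  if (transport === undefined) {
    transport = createTransport(options);
    transportCache.set(key, transport);
  }
  return transport;
}

// Probe and polling now receive the same object when their options match.
const probeTransport = resolveTransportCached({ autoSelectFamily: true });
const pollingTransport = resolveTransportCached({ autoSelectFamily: true });
console.log(probeTransport === pollingTransport);
```

A real implementation would also need a way to evict or close cached dispatchers on shutdown or config reload, which this sketch leaves out.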

Fix B — Break the 30s=30s deadlock:
Reduce fetch.timeout in createTelegramRunnerOptions() to a value less than the max retry interval (e.g., 10-15s). This ensures the previous server-side connection expires before the next retry fires.
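One way to keep the two values from drifting back together is to derive the long-poll timeout from the retry interval. The sketch below is an assumption about how that could be done, not the current behavior of createTelegramRunnerOptions(); `RunnerTimingOptions`, `clampLongPollTimeout()`, and the 15s default margin are hypothetical.

```typescript
// Hypothetical option shape, in seconds; not the actual runner options type.
interface RunnerTimingOptions {
  fetchTimeoutS: number;      // requested long-poll timeout
  maxRetryIntervalS: number;  // runner's maximum retry interval
}

// Keep the long-poll timeout at least `marginS` seconds below the max retry
// interval, and never below 1 second.
function clampLongPollTimeout(opts: RunnerTimingOptions, marginS = 15): number {
  const ceiling = Math.max(1, opts.maxRetryIntervalS - marginS);
  return Math.min(opts.fetchTimeoutS, ceiling);
}

// The configuration from this report is clamped from 30 down to 15.
console.log(clampLongPollTimeout({ fetchTimeoutS: 30, maxRetryIntervalS: 30 }));
// A timeout that is already short enough is left unchanged.
console.log(clampLongPollTimeout({ fetchTimeoutS: 10, maxRetryIntervalS: 30 }));
```

A shorter long-poll timeout means more frequent getUpdates requests when the bot is idle, so the margin is a trade-off between recovery time and request volume.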

Fix C — Skip probe before polling:
In startAccount(), skip the probeTelegram() call when the provider is about to start polling immediately. The probe is useful for health checks but redundant right before monitorTelegramProvider().
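The decision in Fix C can be sketched as a small planning function. This is illustrative only: `StartupContext`, `planStartup()`, and the step names are hypothetical, and the real startAccount() takes different inputs and performs the calls rather than returning a plan.

```typescript
// Hypothetical inputs; the real startAccount() signature is not shown in this issue.
interface StartupContext {
  pollingEnabled: boolean; // provider will start long-polling right after startup
}

type StartupStep = "probe" | "poll";

// Decide which steps to run at account startup.
// When polling starts immediately, the probe is skipped so that only one
// client talks to the Bot API during startup.
function planStartup(ctx: StartupContext): StartupStep[] {
  if (ctx.pollingEnabled) {
    return ["poll"];
  }
  return ["probe"];
}

console.log(planStartup({ pollingEnabled: true }));
console.log(planStartup({ pollingEnabled: false }));
```

This only removes the startup probe. The periodic health-monitor probe described under Root Cause is a separate path and would still need Fix A or Fix B.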

Workaround

A runtime patch that reduces fetch.timeout from 30 to 10 in createTelegramRunnerOptions() breaks the self-sustaining loop. An initial 409 may still occur, but polling recovers within one retry cycle.

Environment

  • OpenClaw latest (0537f3e59)
  • Docker on Windows 10 (gateway runs in Linux container)
  • Single bot token, single container, no webhook
  • Telegram plugin with polling mode

🤖 Generated with Claude Code
