Summary
The gateway enters a permanent 409 `getUpdates` conflict loop because the Telegram probe client and the polling client open competing `getUpdates` connections. Once triggered, the loop is self-sustaining: the long-poll timeout (30s) equals the maximum retry interval (30s), so each retry's server-side connection overlaps with the next.
Reproduction
- Start the gateway with Telegram polling enabled (`channels.telegram.enabled: true`)
- Observe startup logs: two `autoSelectFamily` + `dnsResultOrder` log pairs appear (one from the probe client, one from the polling client)
- Within 30-60s, `getUpdates conflict: 409` errors begin
- Errors continue indefinitely at ~30s intervals
Root Cause
Dual client creation:
- `channel.ts` `startAccount()` (line ~489) calls `probeTelegram()` → `resolveTelegramTransport()` — creates client #1
- `monitor.ts` `monitorTelegramProvider()` creates a `TelegramPollingSession` → `createTelegramBot()` → `resolveTelegramTransport()` — creates client #2

Each call to `resolveTelegramTransport()` in `fetch.ts` creates a new dispatcher with no caching. The probe's TCP connection lingers in the socket pool and can race with the polling client's `getUpdates` call.
Self-sustaining loop:
- `createTelegramRunnerOptions()` sets `fetch.timeout: 30` (30s long-poll)
- grammY runner's maximum retry interval is also 30s
- When a 409 occurs, the retry fires a new `getUpdates` while the previous call's server-side connection is still alive (within the 30s window)
- This creates a permanent overlap: each retry conflicts with the previous one
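The timing condition behind this loop can be stated as a one-line predicate. This is a simplified model (it ignores network latency and jitter); the 30s values come from the report above:

```typescript
// Attempt N's server-side getUpdates connection can live up to
// `longPollTimeoutS` seconds; attempt N+1 fires after at most
// `maxRetryIntervalS` seconds. If the timeout is >= the retry interval,
// the old connection can still be open when the retry arrives, so every
// retry 409s against its predecessor.
function retryCanConflict(longPollTimeoutS: number, maxRetryIntervalS: number): boolean {
  return longPollTimeoutS >= maxRetryIntervalS;
}

// Current config: 30s long-poll vs 30s max retry interval -> permanent overlap.
console.log(retryCanConflict(30, 30)); // true
// A shorter long-poll (Fix B) breaks the condition.
console.log(retryCanConflict(10, 30)); // false
```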
Health-monitor re-trigger:
- Even if the initial 409 self-resolves, the health monitor (300s interval) re-probes via `probeTelegram()`, creating a fresh competing connection that re-triggers the loop
Suggested Fixes
Fix A — Transport caching (primary):
Add a cache to `resolveTelegramTransport()` in `fetch.ts`, similar to the existing `probeFetcherCache` in `probe.ts`, so the probe and polling clients share the same dispatcher/connection pool.
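A minimal sketch of what the cached resolver could look like. The function name and file come from this report; the `Transport` shape, cache key, and `createDispatcher()` stand-in are assumptions for illustration, not the real `fetch.ts` code:

```typescript
// Memoized transport resolution: one dispatcher per distinct configuration,
// so the probe and polling paths resolve to the same instance and share a
// single connection pool.
interface Transport {
  dispatcherId: number;
}

let dispatcherCount = 0;

function createDispatcher(): Transport {
  // Stand-in for the real dispatcher construction (agent, proxy, DNS options).
  return { dispatcherId: dispatcherCount++ };
}

const transportCache = new Map<string, Transport>();

function resolveTelegramTransport(proxyUrl?: string): Transport {
  const key = proxyUrl ?? "direct";
  let cached = transportCache.get(key);
  if (!cached) {
    cached = createDispatcher();
    transportCache.set(key, cached);
  }
  return cached;
}
```

With this in place, repeated calls return the same object, so the probe can no longer race the polling client with a second socket pool.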
Fix B — Break the 30s = 30s deadlock:
Reduce `fetch.timeout` in `createTelegramRunnerOptions()` to a value below the maximum retry interval (e.g., 10-15s), so the previous server-side connection expires before the next retry fires.
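A sketch of the guarded options builder. `createTelegramRunnerOptions()` and `fetch.timeout` are names from this report; the option shape here is simplified and the 10s default is illustrative, not the project's actual value:

```typescript
// grammY runner's maximum backoff, per the report above.
const MAX_RETRY_INTERVAL_S = 30;

function createTelegramRunnerOptions(longPollTimeoutS = 10) {
  // Enforce the invariant Fix B relies on: the server-side getUpdates
  // connection must expire before the next retry can fire.
  if (longPollTimeoutS >= MAX_RETRY_INTERVAL_S) {
    throw new Error("long-poll timeout must stay below the max retry interval");
  }
  return { fetch: { timeout: longPollTimeoutS } };
}
```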
Fix C — Skip probe before polling:
In `startAccount()`, skip the `probeTelegram()` call when the provider is about to start polling immediately. The probe is useful for health checks but redundant right before `monitorTelegramProvider()`.
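A sketch of the gating logic. `startAccount()`, `probeTelegram()`, and `monitorTelegramProvider()` are names from this report; the control flow below is a simplified assumption, not the real `channel.ts` code:

```typescript
// Gate the startup probe on whether polling begins immediately: polling
// itself exercises getUpdates and surfaces auth/network failures, so the
// probe only adds value for health-check-only starts.
function startAccount(
  willPollImmediately: boolean,
  probeTelegram: () => void,
  monitorTelegramProvider: () => void,
): string[] {
  const steps: string[] = [];
  if (!willPollImmediately) {
    probeTelegram();
    steps.push("probe");
  }
  monitorTelegramProvider();
  steps.push("poll");
  return steps;
}
```

On a normal polling start this runs only the polling path, so no second `getUpdates` connection is ever created.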
Workaround
A runtime patch reducing `fetch.timeout` from 30 to 10 in `createTelegramRunnerOptions()` breaks the self-sustaining loop. An initial 409 may still occur, but it recovers within one retry cycle.
Environment
- OpenClaw latest (`0537f3e59`)
- Docker on Windows 10 (gateway runs in a Linux container)
- Single bot token, single container, no webhook
- Telegram plugin in polling mode