Windows: Telegram unusable from .24 through .29 — event loop saturation, polling stalls, bonjour crash #75687

@vdruts

Summary

Every OpenClaw release from 2026.4.24 through 2026.4.29 is broken on Windows with multi-bot Telegram setups. Each version introduces or retains different regressions that prevent Telegram agents from functioning. Last known good: 2026.4.23.

Setup: 8 Telegram bots, 9 plugins, Windows 11 Pro N, Node 22.21.1, Pinokio launcher.


Bug 1: Event loop saturation kills all Telegram connections (.29)

Severity: Showstopper — gateway boots, HTTP/WS works, but Telegram never connects. Agents show "typing" but responses never deliver.

What happens

On startup, the event loop immediately saturates to 100% utilization and stays there permanently:

[diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu
  interval=33s eventLoopDelayP99Ms=30048 eventLoopDelayMaxMs=30048
  eventLoopUtilization=1 cpuCoreRatio=1.029 active=0 waiting=0 queued=0

Note active=0 waiting=0 queued=0 — there are zero active agent sessions. The event loop is pegged purely from infrastructure overhead (plugin loading + Telegram provider startup).
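
For anyone who wants to reproduce these numbers outside the gateway, the following is a minimal sketch of how comparable metrics can be sampled with Node's built-in `perf_hooks` module. The field names mirror the log line above, but `sampleEventLoop` is a hypothetical helper and this is not necessarily how OpenClaw's diagnostic computes them.

```javascript
const { performance, monitorEventLoopDelay } = require("node:perf_hooks");

// Histogram of event loop delay; values are reported in nanoseconds.
const delayHistogram = monitorEventLoopDelay({ resolution: 20 });
delayHistogram.enable();

let previousElu = performance.eventLoopUtilization();

// Hypothetical helper: one sample covering the time since the previous call.
function sampleEventLoop() {
  const currentElu = performance.eventLoopUtilization();
  // Passing two snapshots returns the utilization between them.
  const delta = performance.eventLoopUtilization(currentElu, previousElu);
  previousElu = currentElu;

  const sample = {
    eventLoopDelayP99Ms: delayHistogram.percentile(99) / 1e6,
    eventLoopDelayMaxMs: delayHistogram.max / 1e6,
    eventLoopUtilization: delta.utilization,
  };
  delayHistogram.reset(); // start a fresh window for the next sample
  return sample;
}
```

Calling `sampleEventLoop()` from an unref'd `setInterval` gives one reading per interval. A utilization that stays at 1 with a delay close to the interval length would match the pattern in the log above.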

All 8 Telegram providers fail simultaneously:

[fetch-timeout] fetch timeout reached; aborting operation
[telegram] [alan] starting provider
[fetch-timeout] fetch timeout reached; aborting operation
[telegram] [alex] starting provider
... (repeats for all 8 bots)

20-30 seconds later, the built-in IPv4 fallback fires but also fails because the event loop is still pegged:

[telegram] fetch fallback: enabling sticky IPv4-only dispatcher (codes=none, reason=request-timeout)
[telegram] fetch fallback: DNS-resolved IP unreachable; trying alternative Telegram API IP

Then all WS connections time out and Telegram providers die permanently:

[ws] handshake timeout conn=... peer=127.0.0.1:...->127.0.0.1:18789
[telegram] connect error: gateway closed (1000):
[telegram] failed to start native approval handler: Error: gateway closed: 1000

No recovery occurs. Auto-restarts (up to 5) all exhibit the same pattern.

Root cause theory

The .29 plugin loading phase (especially acpx runtime dep materialization of 46 specs) takes 20-30s on Windows and blocks the event loop completely. Telegram providers start during this window, their initial API calls can't execute, and they fail without retry. In .23, plugin loading was lighter and completed before Telegram providers needed the event loop.
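
The mechanism behind this theory can be shown in isolation. The sketch below is not OpenClaw code: `blockFor` stands in for the synchronous plugin loading, scaled down from 20-30s to a few hundred milliseconds, and `measureTimerLateness` is a hypothetical helper that reports how late a timer fires when the loop is blocked.

```javascript
// Busy-wait to simulate synchronous work that never yields to the event loop.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // intentionally empty
  }
}

// Schedule a timer, then block. The callback cannot run until the
// synchronous work finishes, no matter how short the timer was.
function measureTimerLateness(timerMs, blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => {
      resolve(Date.now() - scheduledAt - timerMs);
    }, timerMs);
    blockFor(blockMs);
  });
}

measureTimerLateness(10, 200).then((latenessMs) => {
  console.log(`timer fired ${latenessMs} ms late`);
});
```

Network callbacks are delayed in the same way as the timer, which would be consistent with every provider hitting its fetch timeout at once.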

Reproduction

  1. Windows 11, Node 22
  2. Configure 8 Telegram bot accounts in channels.telegram.accounts
  3. Enable 9+ plugins (default set: acpx, bonjour, browser, device-pair, file-transfer, memory-core, phone-control, talk-voice, telegram)
  4. npm i -g openclaw@2026.4.29 && openclaw gateway run
  5. Watch eventLoopUtilization hit 1.0 and stay there
  6. All Telegram providers fail, never recover

Suggested fix

Defer Telegram provider startup until event loop utilization drops below a threshold (e.g. 0.5), or stagger bot connections instead of starting all 8 simultaneously.
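
A rough sketch of both ideas, assuming each provider is an object with a `name` and an async `start()` method. Those shapes, along with `waitForIdleLoop`, `startStaggered`, and all default values, are hypothetical and not taken from the OpenClaw codebase.

```javascript
const { performance } = require("node:perf_hooks");
const { setTimeout: sleep } = require("node:timers/promises");

// Resolve true once utilization over one sampling window drops below
// `threshold`, or false after `maxWaitMs` so startup is never blocked forever.
async function waitForIdleLoop({ threshold = 0.5, windowMs = 1000, maxWaitMs = 60000 } = {}) {
  const deadline = Date.now() + maxWaitMs;
  while (Date.now() < deadline) {
    const before = performance.eventLoopUtilization();
    await sleep(windowMs);
    const { utilization } = performance.eventLoopUtilization(before);
    if (utilization < threshold) return true;
  }
  return false;
}

// Start providers one at a time with a gap between them, recording
// failures instead of letting one failed bot abort the rest.
async function startStaggered(providers, { gapMs = 2000 } = {}) {
  const results = [];
  for (let i = 0; i < providers.length; i++) {
    const provider = providers[i];
    try {
      await provider.start();
      results.push({ name: provider.name, ok: true });
    } catch (error) {
      results.push({ name: provider.name, ok: false, error });
    }
    if (i < providers.length - 1) await sleep(gapMs);
  }
  return results;
}
```

In this sketch the gateway would call `await waitForIdleLoop()` after plugin loading and then `await startStaggered(providers)`, so that the first Telegram API calls are issued only once the loop can service them.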


Bug 2: Bonjour plugin crashes gateway on first boot (.29, Windows)

The @homebridge/ciao mDNS library throws an unhandled error that kills the gateway process:

Error: Can't probe for a service which is announced already.
Received announcing for service Perplexity (OpenClaw) (2)._openclaw-gw._tcp.local.
    at Responder.probe (.../node_modules/@homebridge/ciao/src/Responder.ts:444:13)

This forces a restart on every first boot. Combined with Bug 1, the gateway crash-loops through all 5 restart attempts.

Workaround: plugins.entries.bonjour.enabled = false
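
For clarity, this is what the dotted key expands to, assuming the config is a nested object whose keys follow the dotted path. The config file's location and exact format are not covered in this report, and `setByPath` is only an illustrative helper.

```javascript
// Expand a dotted key such as "plugins.entries.bonjour.enabled" into
// nested objects, creating intermediate levels as needed.
function setByPath(config, dottedKey, value) {
  const keys = dottedKey.split(".");
  let node = config;
  for (const key of keys.slice(0, -1)) {
    if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
    node = node[key];
  }
  node[keys[keys.length - 1]] = value;
  return config;
}

const config = setByPath({}, "plugins.entries.bonjour.enabled", false);
console.log(JSON.stringify(config, null, 2));
```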


Bug 3: Telegram polling stalls and sendMessage failures (.24, .25)

Already reported separately, but included here for completeness since all versions remain broken.

[telegram] Polling stall detected (active getUpdates stuck for 138.99s); forcing restart.
[telegram] sendMessage failed: Network request for 'sendMessage' failed!

The provider enters tight restart loops and agents appear unresponsive. The first message after boot may get through, then nothing else is delivered.
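
To make the "tight restart loop" concern concrete, here is a sketch of stall detection paired with capped exponential backoff. This is one possible mitigation, not the gateway's actual logic. The function names, the 90s threshold in the usage note, and the backoff defaults are all assumptions for illustration.

```javascript
// True when the in-flight getUpdates call has been running longer than allowed.
function isPollStalled(pollStartedAtMs, nowMs, stallThresholdMs) {
  return nowMs - pollStartedAtMs > stallThresholdMs;
}

// Capped exponential backoff, so repeated stalls do not restart back to back.
function restartDelayMs(consecutiveRestarts, { baseMs = 1000, capMs = 60000 } = {}) {
  return Math.min(capMs, baseMs * 2 ** consecutiveRestarts);
}
```

With a hypothetical 90s threshold, the 138.99s poll from the log above would count as stalled, and successive restarts would wait 1s, 2s, 4s and so on up to the cap instead of retrying immediately.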


Bug 4: Model fallback cascade — 30s per invalid candidate (.24, .25)

Non-retryable 400 errors (e.g. invalid model ID) take ~30s each before fallback fires. A 3-entry chain stalls for 90+ seconds. In .23, the same chain failed over in seconds. The 30s suggests an internal retry/timeout wrapping deterministic errors.
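
The expected behavior can be sketched as follows: deterministic client errors skip retries and move straight to the next candidate. This is an illustration only. `callWithFallback`, the `error.status` property, and the set of retryable statuses are assumptions, not OpenClaw's actual fallback code.

```javascript
// Statuses treated as transient here; this set is an assumption for illustration.
const RETRYABLE_STATUSES = new Set([408, 429, 500, 502, 503, 504]);

function isRetryable(error) {
  // Errors without an HTTP status (e.g. network failures) are treated as transient.
  return error.status === undefined || RETRYABLE_STATUSES.has(error.status);
}

// Try each candidate in order. Retry transient errors up to `maxRetries`
// times, but fail over immediately on non-retryable ones such as a 400.
async function callWithFallback(candidates, callModel, { maxRetries = 2 } = {}) {
  const failures = [];
  for (const model of candidates) {
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return { model, result: await callModel(model) };
      } catch (error) {
        failures.push({ model, attempt, status: error.status });
        if (!isRetryable(error)) break; // deterministic failure: next candidate now
      }
    }
  }
  const err = new Error("all model candidates failed");
  err.failures = failures;
  throw err;
}
```

Under this scheme a 3-entry chain with two invalid model IDs costs two failed calls and no retry delay before reaching the valid entry.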


Bug 5: Windows ESM regression (.24)

The Telegram channel loader passes raw C:\... paths to Node 22's strict ESM loader, which rejects them because the drive letter is parsed as a URL scheme:

Received protocol 'c:'

All agents fail to start.
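
The usual remedy for this class of error is to convert absolute paths to `file://` URLs with `pathToFileURL` from `node:url` before handing them to `import()`. A minimal sketch follows; `toImportSpecifier` is a hypothetical helper, not the loader's actual function.

```javascript
const path = require("node:path");
const { pathToFileURL } = require("node:url");

// Convert an absolute filesystem path into a file:// URL string that
// import() accepts; leave existing URLs and bare package names untouched.
function toImportSpecifier(target) {
  if (/^(file|data|node):/.test(target)) return target;
  return path.isAbsolute(target) ? pathToFileURL(target).href : target;
}

// Usage inside an async function:
//   const mod = await import(toImportSpecifier(pluginEntryPath));
```

On Windows, an absolute path with a drive letter comes back as a `file:///C:/...` URL, which the ESM loader accepts.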


Environment

openclaw: tested .24, .25, .29 (all broken) — rolled back to .23 (working)
node: 22.21.1
os: Windows 11 Pro N (10.0.26200)
platform: win32
shell: bash (Git Bash via Pinokio)
bots: 8 Telegram accounts
plugins: acpx, bonjour, browser, device-pair, file-transfer, memory-core, phone-control, talk-voice, telegram
launcher: Pinokio (clawdbot)

Notes

  • Full gateway logs available on request for any version
  • .23 rollback immediately restores all functionality every time
  • doctor --fix migrations (TTS edge→microsoft, streaming scalar→object, nano-banana→imageGenerationModel) applied cleanly on .29 before the gateway boot failures
  • The .29 event loop issue reproduces on clean config (backup-restored .23 config) — it's not caused by config migration artifacts
  • force-ipv4.js (dns.lookup monkey-patch + --dns-result-order=ipv4first) confirmed loading in the gateway process — the fetch timeouts are from event loop starvation, not DNS
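
For reference, the general approach behind an IPv4-first setup looks roughly like the sketch below. This is not the contents of force-ipv4.js. It uses Node's documented `dns.setDefaultResultOrder`, the programmatic counterpart of the `--dns-result-order=ipv4first` flag, and `lookupIPv4` is a hypothetical helper.

```javascript
const dns = require("node:dns");

// Programmatic counterpart of the --dns-result-order=ipv4first CLI flag.
dns.setDefaultResultOrder("ipv4first");

// Hypothetical helper: resolve a hostname to an IPv4 address only.
// `family: 4` restricts the lookup to A records.
function lookupIPv4(hostname) {
  return dns.promises.lookup(hostname, { family: 4 });
}
```

Because the lookup result is also delivered through the event loop, a setup like this cannot help while the loop is saturated, which fits the conclusion in the last bullet.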
