Windows: Telegram unusable from .24 through .29 — event loop saturation, polling stalls, bonjour crash

## Summary

OpenClaw is broken on Windows with multi-bot Telegram setups in every release I tested from `2026.4.24` through `2026.4.29` (`.24`, `.25`, and `.29`). Each version introduces or retains different regressions that prevent Telegram agents from functioning. Last known good: **`2026.4.23`**.

Setup: 8 Telegram bots, 9 plugins, Windows 11 Pro N, Node 22.21.1, Pinokio launcher.

---

## Bug 1: Event loop saturation kills all Telegram connections (`.29`)

**Severity: Showstopper.** The gateway boots and HTTP/WS works, but Telegram never connects. Agents show "typing" but responses are never delivered.

### What happens

On startup, the event loop immediately saturates to **100% utilization** and stays there permanently:

```
[diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu
  interval=33s eventLoopDelayP99Ms=30048 eventLoopDelayMaxMs=30048
  eventLoopUtilization=1 cpuCoreRatio=1.029 active=0 waiting=0 queued=0
```

Note `active=0 waiting=0 queued=0` — there are zero active agent sessions. The event loop is pegged purely from infrastructure overhead (plugin loading + Telegram provider startup).
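For anyone who wants to cross-check these numbers outside the gateway, comparable metrics can be sampled with Node's built-in `perf_hooks`. This is a minimal sketch, not OpenClaw's diagnostic code: `startLoopSampler` and `LoopSample` are names I made up, and I am assuming the log fields map roughly onto `performance.eventLoopUtilization()` and `monitorEventLoopDelay()`.

```typescript
import { monitorEventLoopDelay, performance } from "node:perf_hooks";

// Hypothetical result shape, loosely mirroring the fields in the liveness warning.
export interface LoopSample {
  utilization: number; // 0..1, share of the window the loop was busy
  delayP99Ms: number;
  delayMaxMs: number;
}

// Hypothetical helper: samples event loop health between successive sample() calls.
export function startLoopSampler(resolutionMs: number = 20) {
  const histogram = monitorEventLoopDelay({ resolution: resolutionMs });
  histogram.enable();
  let last = performance.eventLoopUtilization();

  return {
    // Returns metrics accumulated since the previous call, then resets the window.
    sample(): LoopSample {
      const current = performance.eventLoopUtilization();
      const delta = performance.eventLoopUtilization(current, last);
      last = current;
      const out: LoopSample = {
        utilization: delta.utilization,
        delayP99Ms: histogram.percentile(99) / 1e6, // histogram values are nanoseconds
        delayMaxMs: histogram.max / 1e6,
      };
      histogram.reset();
      return out;
    },
    stop(): void {
      histogram.disable();
    },
  };
}
```

Calling `sample()` on an interval from inside a running process gives one reading per window. A loop that is mostly waiting on I/O reads close to 0, whereas the `.29` log above reports a constant 1.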

All 8 Telegram providers fail simultaneously:

```
[fetch-timeout] fetch timeout reached; aborting operation
[telegram] [alan] starting provider
[fetch-timeout] fetch timeout reached; aborting operation
[telegram] [alex] starting provider
... (repeats for all 8 bots)
```

20-30 seconds later, the built-in IPv4 fallback fires but also fails because the event loop is still pegged:

```
[telegram] fetch fallback: enabling sticky IPv4-only dispatcher (codes=none, reason=request-timeout)
[telegram] fetch fallback: DNS-resolved IP unreachable; trying alternative Telegram API IP
```

Then all WS connections time out and Telegram providers die permanently:

```
[ws] handshake timeout conn=... peer=127.0.0.1:...->127.0.0.1:18789
[telegram] connect error: gateway closed (1000):
[telegram] failed to start native approval handler: Error: gateway closed: 1000
```

No recovery occurs. Auto-restarts (up to 5) all exhibit the same pattern.

### Root cause theory

The `.29` plugin loading phase (especially `acpx` runtime dep materialization of 46 specs) takes 20-30s on Windows and blocks the event loop completely. Telegram providers start during this window, their initial API calls can't execute, and they fail without retry. In `.23`, plugin loading was lighter and completed before Telegram providers needed the event loop.

### Reproduction

1. Windows 11, Node 22
2. Configure 8 Telegram bot accounts in `channels.telegram.accounts`
3. Enable 9+ plugins (default set: acpx, bonjour, browser, device-pair, file-transfer, memory-core, phone-control, talk-voice, telegram)
4. `npm i -g openclaw@2026.4.29 && openclaw gateway run`
5. Watch `eventLoopUtilization` hit 1.0 and stay there
6. All Telegram providers fail, never recover

### Suggested fix

Defer Telegram provider startup until event loop utilization drops below a threshold (e.g. 0.5), or stagger bot connections instead of starting all 8 simultaneously.
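Both ideas can be sketched in a few lines. This is only an illustration under my own assumptions: `waitForLoopBelow` and `startStaggered` are hypothetical names, the `start` callback stands in for whatever the gateway uses to bring up one Telegram provider, and the window, gap, and maximum wait values are arbitrary.

```typescript
import { performance } from "node:perf_hooks";

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves true once utilization over one sampling window drops below `threshold`,
// or false if `maxWaitMs` elapses first (the caller can then decide to start anyway).
export async function waitForLoopBelow(
  threshold: number = 0.5,
  windowMs: number = 250,
  maxWaitMs: number = 60000,
): Promise<boolean> {
  const deadline = Date.now() + maxWaitMs;
  while (Date.now() < deadline) {
    const before = performance.eventLoopUtilization();
    await sleep(windowMs);
    const { utilization } = performance.eventLoopUtilization(before);
    if (utilization < threshold) return true;
  }
  return false;
}

// Starts items one at a time with a gap between them and collects failures
// instead of letting one failed start abort the rest.
export async function startStaggered<T>(
  items: T[],
  start: (item: T) => Promise<void>,
  gapMs: number = 1000,
): Promise<{ item: T; error: unknown }[]> {
  const failures: { item: T; error: unknown }[] = [];
  for (let i = 0; i < items.length; i++) {
    if (i > 0) await sleep(gapMs);
    try {
      await start(items[i]);
    } catch (error) {
      failures.push({ item: items[i], error });
    }
  }
  return failures;
}
```

A caller would await `waitForLoopBelow()` after plugin loading and then pass the 8 bot accounts to `startStaggered`. The returned failures could feed a retry pass, which would also address the "fail without retry" behaviour described above.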

---

## Bug 2: Bonjour plugin crashes gateway on first boot (`.29`, Windows)

The `@homebridge/ciao` mDNS library throws an unhandled error that kills the gateway process:

```
Error: Can't probe for a service which is announced already.
Received announcing for service Perplexity (OpenClaw) (2)._openclaw-gw._tcp.local.
    at Responder.probe (.../node_modules/@homebridge/ciao/src/Responder.ts:444:13)
```

This forces a restart on every first boot. Combined with Bug 1, the gateway crash-loops through all 5 restart attempts.

**Workaround:** `plugins.entries.bonjour.enabled = false`

---

## Bug 3: Telegram polling stalls and sendMessage failures (`.24`, `.25`)

Already reported separately; included here for completeness because every release after `.23` remains broken.

```
[telegram] Polling stall detected (active getUpdates stuck for 138.99s); forcing restart.
[telegram] sendMessage failed: Network request for 'sendMessage' failed!
```

The provider goes into a tight restart loop and agents appear unresponsive. The first message after boot may get through, then nothing is delivered.

---

## Bug 4: Model fallback cascade — 30s per invalid candidate (`.24`, `.25`)

Non-retryable `400` errors (e.g. invalid model ID) take ~30s each before fallback fires. A 3-entry chain stalls for 90+ seconds. In `.23`, the same chain failed over in seconds. The 30s suggests an internal retry/timeout wrapping deterministic errors.
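The behaviour I would expect is sketched below: deterministic client errors skip the retry loop and move straight to the next candidate. This is not OpenClaw's fallback code. `ProviderError`, `isRetryable`, and `callWithFallback` are hypothetical names, and treating only 408, 429, and 5xx as retryable is my own assumption.

```typescript
// Hypothetical error shape: an HTTP-like status attached to the thrown error.
export class ProviderError extends Error {
  readonly status: number;
  constructor(message: string, status: number) {
    super(message);
    this.status = status;
  }
}

// Assumption: a 4xx other than 408/429 is deterministic, so retrying cannot succeed.
export function isRetryable(error: unknown): boolean {
  if (!(error instanceof ProviderError)) return true; // unknown failures: assume transient
  return error.status === 408 || error.status === 429 || error.status >= 500;
}

// Tries each candidate in order; retries only errors classified as retryable.
export async function callWithFallback<T>(
  candidates: string[],
  call: (model: string) => Promise<T>,
  retriesPerCandidate: number = 2,
): Promise<T> {
  let lastError: unknown = new Error("no candidates configured");
  for (const model of candidates) {
    for (let attempt = 0; attempt <= retriesPerCandidate; attempt++) {
      try {
        return await call(model);
      } catch (error) {
        lastError = error;
        if (!isRetryable(error)) break; // fail over to the next candidate immediately
      }
    }
  }
  throw lastError;
}
```

With this classification, a 3-entry chain whose first two entries return `400` makes exactly three calls and no retries, which matches the "failed over in seconds" behaviour seen in `.23`.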

---

## Bug 5: Windows ESM regression (`.24`)

Telegram channel loader passes raw `C:\...` paths to Node 22's strict ESM loader:

```
Received protocol 'c:'
```

All agents fail to start.
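This message is what Node prints with `ERR_UNSUPPORTED_ESM_URL_SCHEME` when a Windows drive path is handed to `import()` without being converted to a `file://` URL first. The usual remedy is `pathToFileURL` from `node:url`. A minimal sketch follows; `toImportSpecifier` and `loadModule` are hypothetical names, and I have not checked how the Telegram channel loader actually resolves its paths.

```typescript
import { isAbsolute } from "node:path";
import { pathToFileURL } from "node:url";

// Converts an absolute filesystem path into a specifier that import() accepts.
// Anything else (bare package names, existing URLs) is passed through unchanged.
export function toImportSpecifier(target: string): string {
  return isAbsolute(target) ? pathToFileURL(target).href : target;
}

// Hypothetical loader wrapper; `modulePath` stands in for whatever path the
// channel loader resolves on disk.
export async function loadModule(modulePath: string): Promise<unknown> {
  return import(toImportSpecifier(modulePath));
}
```

On Windows, `isAbsolute` recognises drive-letter paths, so a `C:\...` path becomes a `file:///C:/...` URL before it reaches the ESM loader. On other platforms absolute paths are converted the same way, so the wrapper is safe to use unconditionally.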

---

## Environment

```
openclaw: tested .24, .25, .29 (all broken) — rolled back to .23 (working)
node: 22.21.1
os: Windows 11 Pro N (10.0.26200)
platform: win32
shell: bash (Git Bash via Pinokio)
bots: 8 Telegram accounts
plugins: acpx, bonjour, browser, device-pair, file-transfer, memory-core, phone-control, talk-voice, telegram
launcher: Pinokio (clawdbot)
```

## Notes

- Full gateway logs available on request for any version
- `.23` rollback immediately restores all functionality every time
- `doctor --fix` migrations (TTS edge→microsoft, streaming scalar→object, nano-banana→imageGenerationModel) applied cleanly on `.29` before the gateway boot failures
- The `.29` event loop issue reproduces on clean config (backup-restored `.23` config) — it's not caused by config migration artifacts
- `force-ipv4.js` (dns.lookup monkey-patch + `--dns-result-order=ipv4first`) confirmed loading in the gateway process — the fetch timeouts are from event loop starvation, not DNS
