Telegram long-polling stalls permanently after NAT drops idle TCP connections — undici connection pool not recycled on restart

## Description

When the openclaw gateway runs behind a NAT or cloud firewall, Telegram long-polling periodically stalls and **cannot recover** without a full process restart. The gateway correctly detects the stall ("no getUpdates response for 90s") and attempts to restart polling, but the restart reuses the same undici HTTP dispatcher, which still holds dead keep-alive connections in its pool.

## Environment

- openclaw 2026.3.13
- Node.js 22 (built-in `fetch` is backed by undici)
- VPS behind cloud NAT (idle TCP connections silently dropped after ~5-10 min)
- Telegram channel with `getUpdates` long-polling (30s timeout)

## Steps to Reproduce

1. Run openclaw gateway with Telegram channel on a VPS behind NAT/firewall
2. Wait for an idle period where no Telegram messages arrive for 5-10+ minutes
3. NAT silently drops the idle TCP connection
4. Next `getUpdates` request hangs indefinitely on the dead connection
5. Gateway detects stall after 90s and restarts polling
6. Restart reuses the same undici dispatcher → new requests go through the same dead connection pool
7. Gateway enters infinite failure loop: `Network request for 'sendChatAction' failed!` / `Network request for 'getUpdates' failed!`

## Log Pattern

```
[WARN] No getUpdates response for 90 seconds. Restarting polling...
[ERROR] Network request for 'getUpdates' failed!
[ERROR] Network request for 'sendChatAction' failed!
[ERROR] Network request for 'getUpdates' failed!
... (repeats indefinitely until process kill)
```

## Root Cause Analysis

- Node.js 22's undici maintains a keep-alive connection pool per dispatcher
- When NAT drops the underlying TCP connection, undici doesn't detect it (no SO_KEEPALIVE, no application-level ping)
- `polling-session.ts` restarts polling but reuses the same dispatcher instance, so new requests are routed through the same pool of dead connections
- **Discord doesn't have this problem** because it uses WebSocket with heartbeat frames, which detect dead connections and trigger automatic reconnection

## Suggested Fix

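When restarting Telegram polling after a stall, create a **new undici dispatcher** and explicitly tear down the old one so the connection pool is clean. Note that `close()` waits for in-flight requests to finish, which may never happen on a dead socket, so `destroy()` is likely the safer call here:

```ts
// In polling-session.ts restart logic
// (assumes: import * as undici from "undici")
if (this.dispatcher) {
  // destroy() aborts pending requests instead of waiting for them like close()
  await this.dispatcher.destroy();
}
this.dispatcher = new undici.Agent({ /* fresh pool */ });
```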
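
A slightly fuller sketch of the recycling idea follows. It is not the actual `polling-session.ts` code: `PoolLike` and `RecyclingSession` are hypothetical names, and the dispatcher is created through an injected factory so the pattern can be exercised without a network. It calls `destroy()` rather than `close()`, on the assumption that requests stuck on a dead socket will never complete on their own.

```ts
// Hypothetical minimal shape of what the session needs from a dispatcher.
// undici's Agent exposes destroy(), which returns a promise, so it fits this shape.
interface PoolLike {
  destroy(): Promise<void>;
}

// Hypothetical wrapper; the real polling session may be structured differently.
class RecyclingSession<D extends PoolLike> {
  private dispatcher: D;
  // Counts how many times the pool has been replaced (handy for logging).
  public generation = 0;

  constructor(private readonly makeDispatcher: () => D) {
    this.dispatcher = makeDispatcher();
  }

  // Dispatcher to pass to every outgoing request.
  current(): D {
    return this.dispatcher;
  }

  // Called from the stall handler: swap in a fresh pool, then tear down the old one.
  async recycle(): Promise<void> {
    const old = this.dispatcher;
    this.dispatcher = this.makeDispatcher();
    this.generation += 1;
    try {
      await old.destroy();
    } catch {
      // The old pool is already unusable; a failed teardown should not block the restart.
    }
  }
}
```

With undici this might be wired up as `new RecyclingSession(() => new Agent())`, with `session.current()` passed as the dispatcher for each request and `session.recycle()` awaited from the "no getUpdates response" handler. Swapping before destroying means new requests never see the old pool, even if teardown is slow.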

Alternatively, configure the undici pool with shorter `keepAliveTimeout` or `keepAliveMaxTimeout` to proactively evict idle connections before NAT drops them.
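
As a rough illustration of that alternative, the helper below derives keep-alive limits from the shortest idle timeout on the network path. `keepAliveBelowNatIdle`, the 0.5 safety factor, and the 4-second cap are assumptions made for this sketch; only the option names `keepAliveTimeout` and `keepAliveMaxTimeout` come from undici, where both are given in milliseconds.

```ts
// Option names match undici's Agent/Client options (values in milliseconds).
interface KeepAliveOptions {
  keepAliveTimeout: number;
  keepAliveMaxTimeout: number;
}

// Hypothetical helper: keep both limits safely below the NAT idle timeout.
// safetyFactor < 1 leaves headroom so the client drops a socket before the NAT does.
function keepAliveBelowNatIdle(natIdleMs: number, safetyFactor = 0.5): KeepAliveOptions {
  if (!(natIdleMs > 0) || !(safetyFactor > 0 && safetyFactor < 1)) {
    throw new RangeError("natIdleMs must be > 0 and safetyFactor must be in (0, 1)");
  }
  const ceiling = Math.floor(natIdleMs * safetyFactor);
  return {
    // Idle time allowed when the server sends no keep-alive hint (assumed 4 s cap).
    keepAliveTimeout: Math.min(4_000, ceiling),
    // Upper bound applied even when the server's keep-alive hint asks for longer.
    keepAliveMaxTimeout: ceiling,
  };
}

// Example: assume the NAT drops idle connections after about 5 minutes.
const keepAliveOptions = keepAliveBelowNatIdle(5 * 60_000);
// These could then be spread into the dispatcher, e.g. new Agent({ ...keepAliveOptions }).
```

These timers apply to pooled sockets that have no active request, so on their own they may not cover a connection that dies while a `getUpdates` call is still in flight.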

## Current Workaround

Adding `retry` config to the Telegram channel helps (as mentioned in #7526), but it's a band-aid — the connection pool still holds dead connections, and each retry attempt may hit the same dead socket before undici eventually opens a new one.

```json
"retry": {
  "attempts": 5,
  "minDelayMs": 1000,
  "maxDelayMs": 10000,
  "jitter": 0.3
}
```

A process-level watchdog that force-restarts the gateway is another workaround, but neither addresses the root cause.
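
To get a feel for how little time the retry settings above buy, the sketch below computes the waits they might produce. The policy is an assumption, not openclaw's documented behaviour: exponential backoff doubling from `minDelayMs`, capped at `maxDelayMs`, scaled by a random factor in `[1 - jitter, 1 + jitter]`, with `attempts` counting tries so that there are `attempts - 1` waits. `retryDelays` is a hypothetical name.

```ts
interface RetryConfig {
  attempts: number;
  minDelayMs: number;
  maxDelayMs: number;
  jitter: number;
}

// Assumed policy: capped exponential backoff with symmetric multiplicative jitter.
// The random source is injectable so the result can be made deterministic.
function retryDelays(cfg: RetryConfig, random: () => number = Math.random): number[] {
  const delays: number[] = [];
  for (let i = 0; i < cfg.attempts - 1; i++) {
    const base = Math.min(cfg.maxDelayMs, cfg.minDelayMs * 2 ** i);
    const factor = 1 + cfg.jitter * (2 * random() - 1);
    delays.push(Math.round(base * factor));
  }
  return delays;
}

const retryCfg: RetryConfig = { attempts: 5, minDelayMs: 1000, maxDelayMs: 10000, jitter: 0.3 };

// With the jitter centred (random() = 0.5) the waits are 1000, 2000, 4000, 8000 ms.
const centredDelays = retryDelays(retryCfg, () => 0.5);

// Total wait at the upper jitter bound.
const upperBoundMs = retryDelays(retryCfg, () => 1).reduce((a, b) => a + b, 0);
```

Under these assumptions the whole retry budget is about 15 to 20 seconds of waiting, so if every attempt is routed back to the same dead pool, the retries can be exhausted well before the 90-second stall detector fires again.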
