Summary
When the Telegram API becomes unreachable (network blip, DNS timeout, etc.), the typing indicator keepalive loop (createTypingKeepaliveLoop in src/channels/typing-lifecycle.ts) continues firing sendChatAction calls every 6 seconds indefinitely. Each failed call triggers up to 3 retries with exponential backoff (up to 30s). Multiple concurrent typing contexts compound this, saturating the event loop and causing the gateway to become unresponsive.
This results in:
- lane wait exceeded: waitedMs=317017 (5+ minutes blocked)
- launchd sends SIGTERM due to unresponsiveness
- Gateway crash loop (observed 3 crashes in one morning)
Reproduction
- Start the gateway with Telegram enabled
- Trigger a message that starts a typing indicator loop
- Interrupt network connectivity (e.g., disable WiFi, block api.telegram.org)
- Observe sendChatAction failed: Network request failed! every 3 seconds in gateway.err.log
- Gateway eventually becomes unresponsive and receives SIGTERM
Root Cause
In src/channels/typing-lifecycle.ts, createTypingKeepaliveLoop uses setInterval with no max-consecutive-error check:
const tick = async () => {
  if (tickInFlight) return;
  tickInFlight = true;
  try {
    await params.onTick();
  } finally {
    tickInFlight = false;
  }
};
const start = () => {
  if (params.intervalMs <= 0 || timer) return;
  timer = setInterval(() => { tick(); }, params.intervalMs);
};
The typingTtlMs (2 min) should theoretically stop the loop, but when network errors stall the event loop, the TTL timer doesn't fire cleanly.
Meanwhile, withTelegramApiErrorLogging in src/telegram/send.ts logs the error and then re-throws it. The caller (sendTypingTelegram) uses createTelegramRequestWithDiag, which retries recoverable network errors up to 3 times with a 30s maximum backoff, so each tick can block for ~90s on a dead network.
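To make the ~90s figure concrete, here is a rough sketch of the arithmetic. Both function names are hypothetical and not taken from the OpenClaw source. It assumes the worst case, where each of the 3 retries waits the full 30s, and it ignores the time spent inside each failed request, which would add to the total.

```typescript
// Hypothetical helper: upper bound on backoff wait for a single tick.
// Assumes every retry sleeps for the full maxDelayMs (worst case) and
// ignores the duration of the failed requests themselves.
function worstCaseBackoffMs(retries: number, maxDelayMs: number): number {
  return retries * maxDelayMs;
}

// Number of interval ticks that come due while one tick is still blocked.
// With the tickInFlight guard these return early instead of stacking up.
function ticksDueWhileBlocked(blockedMs: number, intervalMs: number): number {
  return Math.floor(blockedMs / intervalMs);
}

const blockedMs = worstCaseBackoffMs(3, 30_000);
const ticksDue = ticksDueWhileBlocked(blockedMs, 6_000);
console.log({ blockedMs, ticksDue }); // { blockedMs: 90000, ticksDue: 15 }
```

Under these assumptions a single typing context keeps one request chain alive for up to 90s at a time, and nothing ends the loop. Each additional concurrent typing context adds its own chain.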
Suggested Fix
Add a consecutive-error circuit breaker to createTypingKeepaliveLoop:
let consecutiveErrors = 0;
const MAX_CONSECUTIVE_ERRORS = 3;
const tick = async () => {
  if (tickInFlight) return;
  tickInFlight = true;
  try {
    await params.onTick();
    consecutiveErrors = 0;
  } catch {
    consecutiveErrors++;
    if (consecutiveErrors >= MAX_CONSECUTIVE_ERRORS) {
      stop();
      params.onCircuitBreak?.();
    }
  } finally {
    tickInFlight = false;
  }
};
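For anyone who wants to try the idea in isolation, the following is a minimal self-contained sketch of a keepalive loop with the breaker wired in. It is not the actual contents of src/channels/typing-lifecycle.ts. The KeepaliveParams shape, maxConsecutiveErrors, onCircuitBreak, and the isRunning helper are assumptions made for illustration.

```typescript
// Sketch only: intervalMs and onTick mirror the snippets in this issue;
// onCircuitBreak and maxConsecutiveErrors are proposed, not existing, options.
interface KeepaliveParams {
  intervalMs: number;
  onTick: () => Promise<void>;
  onCircuitBreak?: () => void;
  maxConsecutiveErrors?: number;
}

function createTypingKeepaliveLoop(params: KeepaliveParams) {
  let timer: ReturnType<typeof setInterval> | undefined;
  let tickInFlight = false;
  let consecutiveErrors = 0;
  const maxErrors = params.maxConsecutiveErrors ?? 3;

  const stop = () => {
    if (timer) clearInterval(timer);
    timer = undefined;
  };

  const tick = async () => {
    if (tickInFlight) return; // never overlap two ticks
    tickInFlight = true;
    try {
      await params.onTick();
      consecutiveErrors = 0; // any success resets the failure count
    } catch {
      consecutiveErrors++;
      if (consecutiveErrors >= maxErrors) {
        stop(); // give up on the indicator instead of retrying forever
        params.onCircuitBreak?.();
      }
    } finally {
      tickInFlight = false;
    }
  };

  const start = () => {
    if (params.intervalMs <= 0 || timer) return;
    consecutiveErrors = 0;
    timer = setInterval(() => { void tick(); }, params.intervalMs);
  };

  return { start, stop, tick, isRunning: () => timer !== undefined };
}
```

With an onTick that always rejects, the third consecutive failure clears the interval and calls onCircuitBreak once. Catching inside tick also avoids an unhandled promise rejection from the setInterval callback.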
Alternatively, typing indicator calls specifically could use attempts: 1 and a short timeout (5s) since they're purely cosmetic — a failed typing indicator should never block real message processing.
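A sketch of that alternative, assuming the typing call can be wrapped by the caller: one attempt, a short timeout, and no re-throw. withTimeout and sendTypingBestEffort are hypothetical names, and the send callback stands in for whatever actually issues sendChatAction.

```typescript
// Hypothetical helper: rejects if `work` does not settle within timeoutMs.
// Note that this does not cancel the underlying request; it only stops waiting.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  return Promise.race([work, timeout]).finally(() => {
    if (timer) clearTimeout(timer);
  });
}

// Best-effort typing indicator: single attempt, short timeout, never throws.
async function sendTypingBestEffort(
  send: () => Promise<void>,
  timeoutMs = 5_000,
): Promise<boolean> {
  try {
    await withTimeout(send(), timeoutMs);
    return true;
  } catch {
    return false; // cosmetic failure: report it, do not retry or rethrow
  }
}
```

Because this helper swallows errors, a keepalive loop that also counts consecutive failures would need to treat a false result as a failure rather than rely on a thrown exception.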
Workaround
Setting agents.defaults.typingMode: "never" in openclaw.json eliminates the crash vector entirely. Additionally, reducing channels.telegram.retry.attempts to 1 and timeoutSeconds to 5 limits the blast radius.
Environment
- OpenClaw v2026.3.11
- macOS (launchd managed gateway)
- Telegram channel with multiple group topics