Bug type
Behavior bug (incorrect output/state without crash)
Beta release blocker
No
Summary
channels.telegram.pollingStallThresholdMs is silently ignored when the isolated polling ingress is active (the default since 2026.5.12). The setting is read, validated, clamped, stored on TelegramPollingSession, and then never consulted in the isolated code path — only #runPollingCycle uses it. On a hosted gateway this manifests as a configured "5-minute polling stall watchdog" that never fires.
Steps to reproduce
Source-level, against main (extensions/telegram/src/polling-session.ts, ~SHA 8bd24ad):
- Set channels.telegram.pollingStallThresholdMs: 300000 in openclaw.json.
- Start the gateway with default Telegram polling (no webhook, no explicit isolatedIngress.enabled = false; no such option is exposed in config or CLI today, see MonitorTelegramOpts and the monitorTelegramProvider call site in extensions/telegram/src/channel.ts / probe.ts).
TelegramPollingSession.runUntilAbort reaches #runIsolatedIngressCycle(bot) because this.opts.isolatedIngress?.enabled defaults to true in probe.ts:
    isolatedIngress: {
      enabled: opts.isolatedIngress?.enabled ?? true,
      // ...
    },
- Grep #runIsolatedIngressCycle for this.#stallThresholdMs: it is not referenced anywhere in the isolated cycle. The watchdog setInterval(...) that calls liveness.detectStall({ thresholdMs: this.#stallThresholdMs }) lives only in #runPollingCycle.
Live(ish) corroboration on 2026.5.12 (f066dd2): after an upstream IPv4-only sticky fetch dispatcher swap recovered, our isolated ingress went silent for ~11 minutes (no poll-success, no poll-error, no log line). update-offset-default.json mtime stayed frozen at the moment of the swap; pending_update_count from getWebhookInfo rose from 0 → 1 when the user sent a fresh DM; no stall log was emitted; systemctl --user restart openclaw-gateway recovered it instantly. The 11-min wedge here may itself be #83272 / #83505 (drain side, fixed in 2026.5.19) rather than the worker, so I am intentionally not filing this as the same incident — this issue is the source-level fact that the configured watchdog setting is silently ignored regardless of which side wedged.
Expected behavior
One of:
- channels.telegram.pollingStallThresholdMs is honored by the isolated path, e.g. a parent-side timer on the worker.task() await that forces a worker.stop() + cycle restart when no poll-start / poll-success / poll-error / spooled message has arrived from the worker for stallThresholdMs, or
- The setting is documented to apply only when isolated ingress is disabled, and channels.telegram.isolatedIngress.enabled (or equivalent) is exposed in config / CLI / MonitorTelegramOpts so operators can opt out and recover the existing watchdog semantics from #runPollingCycle.
Actual behavior
The TelegramPollingSession constructor runs:
    this.#stallThresholdMs = resolvePollingStallThresholdMs(opts.stallThresholdMs);
The value is clamped to [30_000, 600_000]. Then, in runUntilAbort:
    if (
      (this.opts.isolatedIngress?.enabled
        ? await this.#runIsolatedIngressCycle(bot)
        : await this.#runPollingCycle(bot)) === "exit"
    ) return;
Only the embedded branch installs the watchdog:
    const watchdog = setInterval(() => {
      if (this.opts.abortSignal?.aborted) return;
      const stall = liveness.detectStall({ thresholdMs: this.#stallThresholdMs });
      // ...
    }, POLL_WATCHDOG_INTERVAL_MS);
The isolated branch awaits worker.task() with no equivalent timer. Even after #83505 (drain-side tombstones in 2026.5.19), no code path consumes this.#stallThresholdMs if the worker side ever becomes silently unresponsive (e.g. event-loop starvation in the worker, OS-level socket wedge, or any future inactivity mode that bot.api.config.use style instrumentation does not cover).
Net effect: an operator who sets pollingStallThresholdMs: 300000 (or any other value) gets the same behavior as if the field were absent. No warning, no doctor diagnostic, no log line. The TUI / Web UI and openclaw status likewise present no signal that the setting is inert.
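To make the net effect concrete, here is a toy model of the dispatch described above. It is not the real polling-session.ts: clampStallThresholdMs, ModelOpts and effectiveWatchdogThresholdMs are hypothetical names, and only the [30_000, 600_000] bounds and the branch structure come from this report.

```typescript
// Toy model only; names below are hypothetical and not taken from the source tree.
const MIN_STALL_MS = 30_000;
const MAX_STALL_MS = 600_000;

// Clamp a configured threshold into the [30_000, 600_000] range from this report.
function clampStallThresholdMs(value: number): number {
  return Math.min(MAX_STALL_MS, Math.max(MIN_STALL_MS, value));
}

interface ModelOpts {
  stallThresholdMs: number;
  isolatedIngressEnabled: boolean;
}

// Returns the threshold the selected branch would arm a watchdog with,
// or undefined when that branch never reads the stored value.
function effectiveWatchdogThresholdMs(opts: ModelOpts): number | undefined {
  // The value is always computed and stored, regardless of the branch taken.
  const stored = clampStallThresholdMs(opts.stallThresholdMs);
  // Only the embedded (non-isolated) branch consumes it.
  return opts.isolatedIngressEnabled ? undefined : stored;
}
```

In this model, every configured value yields the same result (undefined) on the isolated branch, which is the behavior reported here.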
OpenClaw version
2026.5.12 (f066dd2) (install observed). Source review against main at commit 8bd24ad confirms the gap is still present in 2026.5.19 / 5.20-betas — the recently-shipped #83505 fix added drain-side recovery but did not wire this.#stallThresholdMs into #runIsolatedIngressCycle.
Operating system
Linux 6.17 (Ubuntu user systemd)
Install method
npm i -g openclaw
Node.js version
(Whatever the global install of 2026.5.12 runs against — not relevant; this is a source-level claim.)
Channel(s) affected
- Telegram (polling mode, isolated ingress = default-on)
Logs
The relevant evidence is structural: grep -n "this.#stallThresholdMs" extensions/telegram/src/polling-session.ts returns exactly two hits — the constructor assignment and the #runPollingCycle watchdog. No isolated-cycle hit. That is the whole bug.
Proposed fix shape
Minimal, low-risk: in #runIsolatedIngressCycle, track lastWorkerMessageAt on each poll-start/poll-success/poll-error/spooled event; install a setInterval(POLL_WATCHDOG_INTERVAL_MS) that calls worker.stop() and returns "continue" from the cycle when Date.now() - lastWorkerMessageAt > this.#stallThresholdMs (mirroring the embedded path's stalledRestart handling and forceCycleTimer grace).
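A rough sketch of that shape, under stated assumptions: IngressWorker, WorkerLivenessTracker and runIsolatedCycleWithWatchdog are hypothetical names for illustration, the real worker handle and event plumbing may look different, and abort-signal handling and the forceCycleTimer-style grace period are left out for brevity.

```typescript
// Event names as listed in this report; the wiring to the real worker is assumed.
type WorkerEventKind = "poll-start" | "poll-success" | "poll-error" | "spooled";
type CycleResult = "continue" | "exit";

// Hypothetical minimal worker surface.
interface IngressWorker {
  task(): Promise<void>;
  stop(): Promise<void>;
}

// Tracks the time of the last message received from the worker.
class WorkerLivenessTracker {
  private lastWorkerMessageAt: number;

  constructor(private readonly now: () => number = Date.now) {
    this.lastWorkerMessageAt = now();
  }

  noteWorkerMessage(_kind: WorkerEventKind): void {
    this.lastWorkerMessageAt = this.now();
  }

  isStalled(thresholdMs: number): boolean {
    return this.now() - this.lastWorkerMessageAt > thresholdMs;
  }
}

// Races the worker task against a parent-side inactivity watchdog.
async function runIsolatedCycleWithWatchdog(
  worker: IngressWorker,
  tracker: WorkerLivenessTracker,
  stallThresholdMs: number,
  watchdogIntervalMs: number,
): Promise<CycleResult> {
  let signalStall: () => void = () => {};
  const stalled = new Promise<"stalled">((resolve) => {
    signalStall = () => resolve("stalled");
  });
  const watchdog = setInterval(() => {
    if (tracker.isStalled(stallThresholdMs)) signalStall();
  }, watchdogIntervalMs);

  const task = worker.task().then(() => "done" as const);
  task.catch(() => {}); // avoid an unhandled rejection if the stall branch wins

  try {
    const outcome = await Promise.race([task, stalled]);
    // Assumption: a normally settled task ends the cycle; real code may decide otherwise.
    if (outcome === "done") return "exit";
    await worker.stop().catch(() => {}); // best effort; a hung stop() would need its own grace timer
    return "continue";
  } finally {
    clearInterval(watchdog);
  }
}
```

The caller would invoke tracker.noteWorkerMessage(...) from whatever handler receives worker events, and pass the already-stored threshold and POLL_WATCHDOG_INTERVAL_MS as the last two arguments. The injectable clock is there only so the stall check can be unit-tested without real timers.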
Belt-and-suspenders: also expose channels.telegram.isolatedIngress.enabled in the schema and monitorTelegramProvider call chain so the embedded path remains accessible to operators who prefer the older, single-threaded behavior.
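For the opt-out, the resolution step could be as small as the sketch below. The config shape is hypothetical (channels.telegram.isolatedIngress is not exposed today, per this report), as are the names TelegramChannelConfig and resolveIsolatedIngress; the only behavior carried over from the source is the `?? true` default.

```typescript
// Hypothetical config shape for illustration; not the current schema.
interface TelegramChannelConfig {
  pollingStallThresholdMs?: number;
  isolatedIngress?: { enabled?: boolean };
}

interface ResolvedIsolatedIngress {
  enabled: boolean;
}

// Keeps today's default (isolated ingress on) unless the operator explicitly opts out.
function resolveIsolatedIngress(
  cfg: TelegramChannelConfig | undefined,
): ResolvedIsolatedIngress {
  return { enabled: cfg?.isolatedIngress?.enabled ?? true };
}
```

Keeping the default at true means existing installs would see no behavior change; only an explicit enabled: false would route back to #runPollingCycle and its watchdog.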
Additional context
A local external systemd timer + getWebhookInfo + update-offset-default.json mtime check is a workable mitigation today and is what I've deployed; happy to share that script if useful, but having the gateway itself honor the already-documented setting is the right durable fix.
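The core of such an external check might look like the following. This is a sketch, not the deployed script: isPollingWedged, checkOnce and the ProbeDeps shape are hypothetical names, and the staleness tolerance is an operator choice. The getWebhookInfo call and its pending_update_count field are part of the Telegram Bot API.

```typescript
interface HealthInputs {
  pendingUpdateCount: number; // from getWebhookInfo
  offsetFileMtimeMs: number;  // mtime of update-offset-default.json
  nowMs: number;
  maxStalenessMs: number;     // hypothetical tolerance chosen by the operator
}

// Pure decision: updates are queued on Telegram's side, yet the local offset
// file has not been rewritten within the tolerated window.
function isPollingWedged(i: HealthInputs): boolean {
  const offsetAgeMs = i.nowMs - i.offsetFileMtimeMs;
  return i.pendingUpdateCount > 0 && offsetAgeMs > i.maxStalenessMs;
}

// Calls the Bot API getWebhookInfo method and returns pending_update_count.
async function fetchPendingUpdateCount(botToken: string): Promise<number> {
  const res = await fetch(`https://api.telegram.org/bot${botToken}/getWebhookInfo`);
  if (!res.ok) throw new Error(`getWebhookInfo failed: HTTP ${res.status}`);
  const body = (await res.json()) as {
    ok: boolean;
    result?: { pending_update_count: number };
  };
  if (!body.ok || !body.result) throw new Error("getWebhookInfo returned ok=false");
  return body.result.pending_update_count;
}

// Injected probes keep the check testable without network or filesystem access.
interface ProbeDeps {
  getPendingUpdateCount(): Promise<number>;
  getOffsetFileMtimeMs(): Promise<number>;
  now(): number;
}

async function checkOnce(deps: ProbeDeps, maxStalenessMs: number): Promise<boolean> {
  const [pendingUpdateCount, offsetFileMtimeMs] = await Promise.all([
    deps.getPendingUpdateCount(),
    deps.getOffsetFileMtimeMs(),
  ]);
  return isPollingWedged({
    pendingUpdateCount,
    offsetFileMtimeMs,
    nowMs: deps.now(),
    maxStalenessMs,
  });
}
```

In practice getOffsetFileMtimeMs would wrap stat() from node:fs/promises and return mtimeMs for wherever update-offset-default.json lives on the host. Because an update that arrived a moment ago can briefly show up as pending on a quiet bot, it is probably safer to restart only after two consecutive runs report a wedge.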