Fix Telegram isolated polling stall watchdog #84861
Codex review: needs maintainer review before merge.
Workflow note: Future ClawSweeper reviews update this same comment in place.

Summary
Reproducibility: yes. Current main clearly passes pollingStallThresholdMs while defaulting isolated ingress on, but the threshold is only consumed by the non-isolated polling watchdog path.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Review details
Best possible solution: Land the worker-liveness watchdog with the focused regression coverage once maintainers accept the session-level proof or obtain a live Telegram smoke for the transport path.
Do we have a high-confidence way to reproduce the issue? Yes. Current main clearly passes pollingStallThresholdMs while defaulting isolated ingress on, but the threshold is only consumed by the non-isolated polling watchdog path.
Is this the best way to solve the issue? Yes. Reusing the existing liveness tracker for isolated worker messages is a narrow fix for the documented config contract and avoids adding a new opt-out setting or changing the public config surface.

Codex review notes: model gpt-5.5, reasoning high; reviewed against e42726204490.
* fix(telegram): watch isolated polling stalls
* docs(changelog): note telegram polling watchdog fix
Summary
Honor `pollingStallThresholdMs` in isolated polling by wiring isolated worker activity into the liveness tracker and restarting silent workers.

Fixes #83950
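The idea in this summary can be illustrated with a minimal sketch. It is not the code in this PR: `LivenessTracker`, `IsolatedWorkerWatchdog`, `WorkerHandle`, `WorkerFactory`, and `check()` are hypothetical names chosen for illustration, and the clock is injected so that the behavior can be exercised without real timers. Every worker message refreshes the tracker, and a periodic check stops and recreates a worker that has been silent for the configured threshold.

```typescript
// Hypothetical sketch only: these names are illustrative, not the PR's API.
type Clock = () => number;

interface WorkerHandle {
  stop(): void;
}

// The factory receives a callback the worker invokes on every message it posts.
type WorkerFactory = (onMessage: () => void) => WorkerHandle;

class LivenessTracker {
  private readonly thresholdMs: number;
  private readonly now: Clock;
  private lastActivityAt: number;

  constructor(thresholdMs: number, now: Clock) {
    this.thresholdMs = thresholdMs;
    this.now = now;
    this.lastActivityAt = now();
  }

  // Record that the worker showed a sign of life.
  noteActivity(): void {
    this.lastActivityAt = this.now();
  }

  // True once the worker has been silent for at least thresholdMs.
  isStalled(): boolean {
    return this.now() - this.lastActivityAt >= this.thresholdMs;
  }
}

class IsolatedWorkerWatchdog {
  private readonly createWorker: WorkerFactory;
  private readonly thresholdMs: number;
  private readonly now: Clock;
  private tracker: LivenessTracker;
  private worker: WorkerHandle;

  constructor(createWorker: WorkerFactory, thresholdMs: number, now: Clock = Date.now) {
    this.createWorker = createWorker;
    this.thresholdMs = thresholdMs;
    this.now = now;
    const started = this.start();
    this.tracker = started.tracker;
    this.worker = started.worker;
  }

  // Each worker gets its own tracker, so a stale worker cannot refresh a new one.
  private start(): { tracker: LivenessTracker; worker: WorkerHandle } {
    const tracker = new LivenessTracker(this.thresholdMs, this.now);
    const worker = this.createWorker(() => tracker.noteActivity());
    return { tracker, worker };
  }

  // Meant to be called periodically. Returns true if a restart happened.
  check(): boolean {
    if (!this.tracker.isStalled()) return false;
    this.worker.stop();
    const restarted = this.start();
    this.tracker = restarted.tracker;
    this.worker = restarted.worker;
    return true;
  }
}
```

In this sketch a caller would invoke `check()` from a timer. The actual check interval and restart policy in the PR are not shown here and may differ.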
Verification
* `node scripts/run-vitest.mjs extensions/telegram/src/polling-session.test.ts -- --reporter=verbose`
* `node scripts/run-vitest.mjs extensions/telegram/src/polling-liveness.test.ts -- --reporter=verbose`
* `AUTOREVIEW_AUTO_TESTS=0 .agents/skills/autoreview/scripts/autoreview --mode local`
* `telegram`

Real behavior proof
Behavior addressed: Telegram isolated polling now honors `pollingStallThresholdMs` and restarts a silent isolated ingress worker instead of wedging indefinitely.

Real environment tested: Direct AWS Crabbox Linux. Runtime-level fix proof used provider=aws, leaseId=cbx_a81230ba78bf, run=run_0d6294cf4a6e. Credentialed Telegram proof used provider=aws, leaseId=cbx_d0af75929598, run=run_bc0c4ba2da67, with a Convex-leased `telegram` bot credential.

Exact steps or command run after this patch: First ran a no-credential `TelegramPollingSession` harness on AWS with isolated ingress enabled, `pollingStallThresholdMs=30000`, a fake Telegram API fetch, and a silent worker task, then observed for 65s before explicit abort. Then ran a credentialed AWS harness that leased `telegram` from Convex, validated the SUT bot token against the real Telegram Bot API with `getMe`, `deleteWebhook`, and `getUpdates`, forwarded one live `getUpdates` through a local API proxy, induced a silent hung `getUpdates`, and observed the isolated polling watchdog restart the ingress.

Evidence after fix: `OPENCLAW_ISSUE_83950_FIX_PROOF` beforeAbort showed `stopCount=1`, `workerFactoryCalls=2`, and stall log lines including `Polling stall detected...` plus the isolated ingress restart log. `OPENCLAW_ISSUE_83950_CREDENTIALED_LIVE_FIX_PROOF` showed `tokenValidated=true`, real API preflight methods `getMe`, `deleteWebhook`, and `getUpdates`, `forwardedGetUpdates=1`, `hungGetUpdates=2`, `workerFactoryCalls=2`, `stopCount=1` before abort, and stall/restart logs.

Observed result after fix: Silent worker was stopped and restarted before explicit abort in both proofs; the credentialed live proof exited 0 after the watchdog detected `active getUpdates stuck` and restarted isolated ingress.

What was not tested: A naturally occurring Telegram production outage was not waited on, and no `telegram-user` human/Desktop credential was used. The credentialed proof used the available Convex `telegram` bot credential and a controlled local proxy-induced `getUpdates` hang after validating the real bot token against Telegram.
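For readers who want to reproduce a silent `getUpdates` hang locally, one possible test double is sketched below. It is an assumption-laden illustration, not the harness or proxy used for the proofs above: `createHangingTelegramFetch`, `FakeTelegramResponse`, `FakeFetchInit`, and `hungCount` are hypothetical names. Calls whose URL ends in `/getUpdates` never resolve and only reject if the caller aborts. Every other method resolves immediately with a Bot API style `{ ok, result }` payload.

```typescript
// Hypothetical test double; not the harness or proxy used in the PR's proofs.
interface FakeTelegramResponse {
  ok: boolean;
  result: unknown;
}

interface FakeFetchInit {
  signal?: AbortSignal;
}

function createHangingTelegramFetch() {
  let hungGetUpdates = 0;

  function fakeFetch(url: string, init: FakeFetchInit = {}): Promise<FakeTelegramResponse> {
    // Any method other than getUpdates answers right away.
    if (!url.endsWith("/getUpdates")) {
      return Promise.resolve({ ok: true, result: true });
    }

    // getUpdates never resolves, which simulates a silently stuck long poll.
    hungGetUpdates += 1;
    return new Promise<FakeTelegramResponse>((_resolve, reject) => {
      const signal = init.signal;
      if (!signal) return; // no abort signal: the promise stays pending forever
      const fail = () => reject(new Error("getUpdates aborted"));
      if (signal.aborted) fail();
      else signal.addEventListener("abort", fail, { once: true });
    });
  }

  return { fakeFetch, hungCount: () => hungGetUpdates };
}
```

Passing an `AbortSignal` lets a watchdog under test cancel the stuck request when it restarts the worker, while `hungCount()` gives the test a counter to compare against the number of restarts it expects.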