Telegram: stop bot on polling teardown #31141

Merged
steipete merged 1 commit into openclaw:main from liuxiaopai-ai:codex/fix-telegram-restart-poll-leak
Mar 2, 2026

Conversation

@liuxiaopai-ai
Contributor

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: in-process SIGUSR1 restarts could leave the old Telegram long-poll internals alive long enough to conflict with the next polling cycle, producing storms of getUpdates 409 (Conflict) responses.
  • Why it matters: overlapping poll loops can cause unstable inbound routing and repeated conflict backoff after config-triggered restart.
  • What changed: when a polling cycle exits, monitor teardown now awaits both runner.stop() and bot.stop() before continuing/restarting.
  • What did NOT change (scope boundary): no Telegram routing logic changes, no retry-policy changes, and no restart scheduler changes.
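
The teardown change described above can be sketched as follows. This is a minimal illustration, not the code from src/telegram/monitor.ts: the `Stoppable` interface, `runPollingCycle`, and the `poll` callback are hypothetical names introduced here, and the real runner and bot objects are assumed to expose an awaitable `stop()`.

```typescript
// Hypothetical minimal shape shared by the runner and the bot in this sketch.
interface Stoppable {
  stop(): Promise<void>;
}

// Runs one polling cycle. Teardown lives in `finally`, so it executes on
// normal exit, on abort, and on error alike.
async function runPollingCycle(
  runner: Stoppable,
  bot: Stoppable,
  poll: () => Promise<void>,
): Promise<void> {
  try {
    await poll();
  } finally {
    // Before the fix: only the runner stop was awaited.
    await runner.stop();
    // After the fix: the bot is stopped as well, which is intended to keep
    // this cycle's long-poll from overlapping with the next cycle's.
    await bot.stop();
  }
}
```

Because both stops are awaited inside `finally`, a caller that restarts polling after `runPollingCycle` settles only does so once teardown has finished.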

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • Telegram polling restart/teardown is stricter: bot instance is explicitly stopped at polling-cycle exit, reducing post-restart getUpdates conflicts.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS (dev)
  • Runtime/container: Node 22+, pnpm
  • Model/provider: n/a
  • Integration/channel (if any): Telegram polling monitor
  • Relevant config (redacted): n/a

Steps

  1. Start Telegram monitor polling loop.
  2. Trigger restart/abort path for the polling cycle.
  3. Verify bot teardown occurs before next cycle progression.

Expected

  • Polling cycle exit always stops both runner and bot instance.

Actual

  • Before fix: only runner stop was awaited.
  • After fix: runner + bot stop both awaited during polling teardown.

Evidence

Attach at least one:

  • Failing test/log before + passing after

  • Trace/log snippets

  • Screenshot/recording

  • Perf numbers (if relevant)

  • pnpm test src/telegram/monitor.test.ts

  • pnpm check

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios:
    • Added regression test that monitor teardown calls bot stop() when polling cycle exits.
    • Existing monitor restart tests still pass, including recoverable-error restart behavior.
  • Edge cases checked:
    • Teardown keeps fail-safe behavior when stop paths are already completed.
  • What you did not verify:
    • Live Telegram Bot API end-to-end on a production token under repeated config.patch restarts.
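
A rough idea of what such a regression check looks like, written with a hand-rolled spy so it runs without a test framework (the actual test lives in src/telegram/monitor.test.ts and runs under vitest). `FakeBot`, `monitorForCycles`, and `checkTeardownStopsEveryBot` are hypothetical stand-ins, and creating a fresh bot per cycle is an assumption of this sketch.

```typescript
// Hand-rolled spy standing in for a real bot; it only counts stop() calls.
class FakeBot {
  stopCalls = 0;
  async stop(): Promise<void> {
    this.stopCalls += 1;
  }
}

// Hypothetical stand-in for the monitor loop: each cycle gets a fresh bot,
// and a cycle that throws is treated as recoverable and restarted.
async function monitorForCycles(
  cycles: number,
  pollOnce: (cycle: number) => Promise<void>,
): Promise<FakeBot[]> {
  const bots: FakeBot[] = [];
  for (let cycle = 0; cycle < cycles; cycle += 1) {
    const bot = new FakeBot();
    bots.push(bot);
    try {
      await pollOnce(cycle);
    } catch {
      // Recoverable error: fall through to teardown, then start the next cycle.
    } finally {
      await bot.stop();
    }
  }
  return bots;
}

// Regression check: every cycle's bot is stopped exactly once,
// including the cycle that failed.
async function checkTeardownStopsEveryBot(): Promise<boolean> {
  const bots = await monitorForCycles(3, async (cycle) => {
    if (cycle === 1) throw new Error("simulated recoverable polling error");
  });
  return bots.length === 3 && bots.every((bot) => bot.stopCalls === 1);
}
```

The same assertion covers both verified scenarios: the bot is stopped when a cycle exits, and a recoverable error still leads to a restart.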

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly:
    • Revert this commit.
  • Files/config to restore:
    • src/telegram/monitor.ts
    • src/telegram/monitor.test.ts
  • Known bad symptoms reviewers should watch for:
    • Telegram polling not restarting after recoverable failures.

Risks and Mitigations

  • Risk:
    • Extra teardown step could expose latent bot.stop implementation quirks in edge environments.
    • Mitigation:
      • bot.stop() is wrapped fail-safe and does not crash teardown if already stopped.
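
One way to picture the fail-safe wrapping mentioned in the mitigation is a helper that swallows stop errors. This is a sketch under assumptions: `stopSafely`, its `label` parameter, and the optional `log` callback are hypothetical and not taken from the PR.

```typescript
// Hypothetical helper: await a stop function and swallow any error, so one
// failing stop (for example, "already stopped") cannot abort the rest of
// teardown. Returns true when the stop succeeded, false when it threw.
async function stopSafely(
  label: string,
  stop: () => Promise<void>,
  log: (message: string) => void = () => {},
): Promise<boolean> {
  try {
    await stop();
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    log(`${label} stop failed (ignored): ${reason}`);
    return false;
  }
}
```

With a helper like this, teardown can call `await stopSafely("runner", ...)` and then `await stopSafely("bot", ...)`, and the second call still runs when the first one fails.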

@openclaw-barnacle (bot) added the labels channel: telegram and size: XS on Mar 2, 2026
@greptile-apps
Contributor

greptile-apps Bot commented Mar 2, 2026

Greptile Summary

This PR fixes a polling teardown race condition where in-process restarts (SIGUSR1) could leave old Telegram long-poll internals alive, causing getUpdates 409 conflicts. The fix ensures bot.stop() is called alongside runner.stop() in the polling cycle's finally block. The implementation follows the existing stopRunner() pattern with proper error handling and includes regression test coverage.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The change is minimal, well-isolated, and follows existing teardown patterns. Error handling properly catches cases where the bot is already stopped. The test coverage verifies the new behavior, and the fix directly addresses the described 409 conflict issue without modifying any routing or retry logic.
  • No files require special attention

Last reviewed commit: 2ea44da

@steipete force-pushed the codex/fix-telegram-restart-poll-leak branch from 2ea44da to 53f2035 on March 2, 2026 02:09
@steipete merged commit 042d06a into openclaw:main on Mar 2, 2026
9 checks passed
@steipete
Contributor

steipete commented Mar 2, 2026

Landed via temp rebase onto main.

  • Gate: targeted tests (speed mode): pnpm vitest run src/telegram/monitor.test.ts
  • Land commit: 042d06a
  • Merge commit: 042d06a

Thanks @liuxiaopai-ai!

Labels

channel: telegram, size: XS

Development

Successfully merging this pull request may close these issues.

Telegram provider leaks polling loop on SIGUSR1/graceful restart