fix(gateway): drain stale httpx polling connections on Telegram reconnect by kshitijk4poor · Pull Request #17015 · NousResearch/hermes-agent

kshitijk4poor · 2026-04-28T13:21:46Z

Summary

Salvage of #16466 by @Mirac1eSky: drains stale httpx connections during Telegram polling reconnect to prevent connection-pool exhaustion after proxy-related network errors.

What the original PR identified

When Telegram polling drops through a proxy (e.g. sing-box), updater.stop() + start_polling() leaves the underlying httpx connections in a half-closed state. After enough cycles the default 256-connection pool fills up, causing:

Pool timeout: All connections in the connection pool are occupied.

What changed from the original

The original PR called bot.shutdown() + bot.initialize() which cycles both httpx connection pools:

_request[0] — getUpdates (polling only)
_request[1] — general (send_message, edit_message, etc.)

This creates a race condition: any concurrent send_message/edit_message call hitting _request[1] between shutdown and re-initialize gets RuntimeError("This HTTPXRequest is not initialized!"). Additionally, bot.initialize() calls get_me() (a network round-trip) which is likely to fail during network error recovery.
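The race can be reproduced in isolation. The following is a minimal sketch, not PTB code: `FakeRequest`, `cycle_both`, `cycle_polling_only`, and `send_during` are hypothetical stand-ins, and only the error message is taken from the description above.

```python
import asyncio


class FakeRequest:
    """Hypothetical stand-in for an HTTPXRequest-like object."""

    def __init__(self):
        self.initialized = True

    async def shutdown(self):
        self.initialized = False

    async def initialize(self):
        await asyncio.sleep(0.01)  # simulated delay while the client is rebuilt
        self.initialized = True

    async def post(self, method):
        if not self.initialized:
            raise RuntimeError("This HTTPXRequest is not initialized!")
        return f"{method}: ok"


async def cycle_both(requests):
    """Mimics cycling every request object, as bot.shutdown() + bot.initialize() would."""
    for request in requests:
        await request.shutdown()
    for request in requests:
        await request.initialize()


async def cycle_polling_only(requests):
    """Cycles only index 0 (polling); index 1 (general) is left alone."""
    await requests[0].shutdown()
    await requests[0].initialize()


async def send_during(requests, cycle):
    """Start a cycle, then attempt a send on the general request mid-cycle."""
    task = asyncio.create_task(cycle(requests))
    await asyncio.sleep(0)  # let the cycle run up to its first real suspension
    try:
        result = await requests[1].post("send_message")
    except RuntimeError as exc:
        result = f"failed: {exc}"
    await task
    return result
```

With `cycle_both`, the send lands in the window where the general request is shut down and fails; with `cycle_polling_only`, the same send succeeds because index 1 is never touched.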

Our fix targets only _request[0] (the polling request) via HTTPXRequest.shutdown() + HTTPXRequest.initialize() directly. The general request is never touched, so concurrent message sends are safe. No get_me() call is made.
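A rough sketch of what such a helper could look like, assuming the `_request` tuple layout described above. This is not the code from the diff: `drain_polling_connections` is written here as a free function taking a hypothetical `app` argument, and `MockRequest` is an illustrative stand-in that only records calls.

```python
import asyncio
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


async def drain_polling_connections(app):
    """Cycle only the polling request object; never raise."""
    if app is None:
        return  # nothing to drain
    # Assumption: the bot keeps (polling_request, general_request) in a
    # private tuple, as the PR describes for PTB 22.x.
    polling_request = app.bot._request[0]
    try:
        await polling_request.shutdown()
    except Exception:
        logger.debug("polling request shutdown failed", exc_info=True)
    # Separate try block: initialize is attempted even if shutdown failed.
    try:
        await polling_request.initialize()
    except Exception:
        logger.debug("polling request initialize failed", exc_info=True)


class MockRequest:
    """Illustrative mock that records which lifecycle methods were awaited."""

    def __init__(self, fail_shutdown=False):
        self.calls = []
        self.fail_shutdown = fail_shutdown

    async def shutdown(self):
        self.calls.append("shutdown")
        if self.fail_shutdown:
            raise OSError("simulated half-closed connection")

    async def initialize(self):
        self.calls.append("initialize")


polling, general = MockRequest(fail_shutdown=True), MockRequest()
app = SimpleNamespace(bot=SimpleNamespace(_request=(polling, general)))
asyncio.run(drain_polling_connections(app))
```

After the run, `polling.calls` holds both lifecycle calls despite the simulated shutdown failure, and `general.calls` stays empty.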

Additional improvements over original

| Aspect | Original PR | This salvage |
| --- | --- | --- |
| Scope | `bot.shutdown()`: kills all connections | `_request[0]` only: polling connections only |
| Race condition | Yes: concurrent sends fail with RuntimeError | None: general request untouched |
| Network call | `get_me()` during initialize (likely fails) | No network call, just rebuilds httpx client |
| Coverage | `_handle_polling_network_error` only | Both `_handle_polling_network_error` and `_handle_polling_conflict` |
| Code structure | Inline in one method | Shared `_drain_polling_connections()` helper |
| Fault isolation | Single try block: shutdown failure skips initialize | Separate try blocks: initialize always attempted even if shutdown fails |
| Diagnosability | Silent `except: pass` | DEBUG-level logging with exc_info on failures |
| PTB coupling | Undocumented | Docstring notes PTB 22.x `_request` tuple structure, flags for PTB 23+ review |

Tests

9 tests pass (4 existing + 5 new):

test_reconnect_drains_polling_request_only — verifies only _request[0] is cycled, _request[1] untouched
test_reconnect_continues_if_drain_fails — both shutdown + initialize fail, reconnect still proceeds
test_initialize_still_runs_when_shutdown_fails — shutdown raises but initialize is still called (separate try blocks)
test_conflict_retry_also_drains_polling_connections — conflict path also drains
test_drain_helper_noop_without_app — graceful no-op when app is None
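The shape of two of these checks can be sketched with `unittest.mock.AsyncMock`. This is an assumption about how such tests might be written, not the contents of the actual test file; the `drain` function is a hypothetical, condensed stand-in for the real helper, and the test bodies are illustrative.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock


async def drain(app):
    """Hypothetical stand-in for the helper under test."""
    if app is None:
        return
    polling = app.bot._request[0]
    for step in (polling.shutdown, polling.initialize):
        try:
            await step()  # each step is isolated from the other's failure
        except Exception:
            pass  # the real helper logs at DEBUG level instead


def make_app(shutdown_error=None):
    """Build a fake app whose bot exposes a (polling, general) request tuple."""
    polling = SimpleNamespace(
        shutdown=AsyncMock(side_effect=shutdown_error),
        initialize=AsyncMock(),
    )
    general = SimpleNamespace(shutdown=AsyncMock(), initialize=AsyncMock())
    return SimpleNamespace(bot=SimpleNamespace(_request=(polling, general)))


def test_initialize_still_runs_when_shutdown_fails():
    app = make_app(shutdown_error=OSError("boom"))
    asyncio.run(drain(app))
    polling, general = app.bot._request
    polling.shutdown.assert_awaited_once()
    polling.initialize.assert_awaited_once()
    general.shutdown.assert_not_awaited()
    general.initialize.assert_not_awaited()


def test_drain_helper_noop_without_app():
    asyncio.run(drain(None))  # must return without raising
```

Both functions follow pytest naming, so they would be collected by a pytest run, but they can also be called directly.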

E2E validated with realistic MockHTTPXRequest objects tracking shutdown/initialize state.

Full gateway suite: 3844 passed, 61 skipped, 13 failed (all 13 failures pre-existing on main).

Files changed

gateway/platforms/telegram.py — new _drain_polling_connections() helper + calls in both reconnect paths (+40 lines)
tests/gateway/test_telegram_network_reconnect.py — 5 new tests (+120 lines)
scripts/release.py — add Mirac1eSky to AUTHOR_MAP (+1 line)

alt-glitch · 2026-04-28T13:25:37Z

Improved salvage of #16466 — targets only the polling request object to avoid race conditions with concurrent sends. Supersedes #16466.

@Mirac1eSky

…nect

Network errors through proxies (e.g. sing-box) can leave httpx connections in a half-closed state occupying pool slots. After enough reconnect cycles the 256-connection default fills up entirely, causing Pool timeout: All connections in the connection pool are occupied.

Fix: cycle only the getUpdates request object (_request[0]) via shut-down + re-initialize before restarting polling. This drains stale connections without touching the general request (_request[1]) that concurrent send_message / edit_message calls rely on.

The drain is applied to both _handle_polling_network_error and _handle_polling_conflict reconnect paths via a shared _drain_polling_connections() helper. Failures in the drain are swallowed so reconnect always proceeds.

Based on #16466 by @Mirac1eSky.

alt-glitch added the type/bug (Something isn't working), P1 (High: major feature broken, no workaround), comp/gateway (Gateway runner, session dispatch, delivery), and platform/telegram (Telegram bot adapter) labels on Apr 28, 2026

Mirac1eSky and others added 2 commits April 28, 2026 19:05

chore: add Mirac1eSky to AUTHOR_MAP

8f0e2b8

kshitijk4poor force-pushed the salvage/telegram-pool-drain branch from 2bf8240 to 8f0e2b8 Compare April 28, 2026 13:35

kshitijk4poor merged commit b5905f0 into main Apr 28, 2026
11 of 12 checks passed

kshitijk4poor deleted the salvage/telegram-pool-drain branch April 28, 2026 13:37

kshitijk4poor mentioned this pull request Apr 28, 2026

fix(gateway): drain stale httpx connections on Telegram polling reconnect #16466

Closed

JustinHuber mentioned this pull request May 24, 2026

Telegram adapter leaks httpx general-pool connections through HTTP proxy (CLOSED sockets accumulate, fd limit hit after ~2 days) #31599

Open

kshitijk4poor merged 2 commits into main from salvage/telegram-pool-drain
