Bug Summary
When platform adapters lose network connectivity and exhaust their internal reconnection retries, they silently stop without notifying the gateway's fatal-error handling mechanism. This leaves the gateway process alive but the platform permanently unresponsive.
Observed Behavior
06:38 DNS resolution fails
06:38 Telegram starts retrying (1/10, 2/10, ...)
06:38 QQBot starts retrying (1/100, 2/100, ...)
07:33 Both adapters exhaust retries and stop
07:33 No more platform activity; gateway process still alive
Root Cause
File: gateway/platforms/qqbot/adapter.py
At lines ~608 and ~620, when attempt >= MAX_RECONNECT_ATTEMPTS:
# Current (buggy):
self._mark_disconnected()
# Should be:
self._set_fatal_error(f"QQBot reconnect exhausted after {MAX_RECONNECT_ATTEMPTS} attempts")
await self._notify_fatal_error()
Without _notify_fatal_error(), the gateway's _platform_reconnect_watcher() never attempts to restart the QQBot adapter.
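To make the proposed change concrete, here is a minimal, self-contained sketch of a reconnect loop that escalates to a fatal error once its retries run out. It is not the real adapter code: `SketchAdapter`, `_connect_once()`, `reconnect_loop()`, the `on_fatal` callback and the small `MAX_RECONNECT_ATTEMPTS` value are all hypothetical stand-ins, and only the two calls in the exhaustion branch mirror the fix suggested above.

```python
import asyncio

# Demo value only; the real constant lives in the adapter (the log shows N/100).
MAX_RECONNECT_ATTEMPTS = 3


class SketchAdapter:
    """Hypothetical stand-in for the QQBot adapter, not the real class."""

    def __init__(self, on_fatal):
        self.fatal_error = None
        # Assumed hook through which the gateway learns about fatal errors.
        self._on_fatal = on_fatal

    def _set_fatal_error(self, message):
        # Record why the adapter gave up.
        self.fatal_error = message

    async def _notify_fatal_error(self):
        # Hand the recorded error to the gateway so it can schedule a restart.
        await self._on_fatal(self.fatal_error)

    async def _connect_once(self):
        # Simulates the outage: every attempt fails as if DNS were down.
        raise ConnectionError("DNS resolution failed")

    async def reconnect_loop(self):
        """Return True on reconnect, False after reporting exhaustion."""
        for attempt in range(1, MAX_RECONNECT_ATTEMPTS + 1):
            try:
                await self._connect_once()
                return True
            except ConnectionError:
                # Placeholder for the real backoff delay.
                await asyncio.sleep(0)
        # Retries exhausted: report the failure instead of stopping silently.
        self._set_fatal_error(
            f"QQBot reconnect exhausted after {MAX_RECONNECT_ATTEMPTS} attempts"
        )
        await self._notify_fatal_error()
        return False


async def demo():
    received = []

    async def on_fatal(message):
        # Stand-in for whatever the gateway does with a fatal error.
        received.append(message)

    adapter = SketchAdapter(on_fatal)
    reconnected = await adapter.reconnect_loop()
    return reconnected, received
```

Running `asyncio.run(demo())` in this sketch delivers the exhaustion message to the callback, which is the signal the gateway currently never receives from the QQBot adapter.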
File: gateway/platforms/telegram.py
While _handle_polling_network_error() works through retries 1-10, the runtime status still reads connected; the fatal error is set only at attempt 11. Consider writing a retrying state during the retry window so external monitors can detect degraded connectivity earlier.
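One possible shape for the retrying state is sketched below. It assumes a simple status record that external monitors can read; `RuntimeStatus`, its `write()` method, the `"retrying"` and `"fatal"` state strings and the standalone `handle_polling_network_error()` function are illustrative names, not the existing Telegram adapter API.

```python
from dataclasses import dataclass, field

MAX_RETRIES = 10  # matches the 1/10 ... 10/10 counter in the log above


@dataclass
class RuntimeStatus:
    """Hypothetical status record that an external monitor could poll."""

    state: str = "connected"
    detail: str = ""
    history: list = field(default_factory=list)

    def write(self, state, detail=""):
        # Update the visible state and keep a trail of transitions.
        self.state = state
        self.detail = detail
        self.history.append(state)


def handle_polling_network_error(status, attempt):
    """Record the retry state; return True if polling should retry."""
    if attempt > MAX_RETRIES:
        # Attempt 11: the retry budget is spent, so escalate to fatal.
        status.write("fatal", f"Telegram polling failed after {MAX_RETRIES} retries")
        return False
    # Attempts 1-10: surface the degraded state instead of leaving "connected".
    status.write("retrying", f"network error, retry {attempt}/{MAX_RETRIES}")
    return True
```

With this shape, a monitor would see `retrying` from the first failed poll onward rather than `connected` for the whole retry window.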
Suggested Fixes
| Priority | File | Change |
|---|---|---|
| P0 | qqbot/adapter.py | On reconnect exhaustion: _set_fatal_error() + _notify_fatal_error() |
| P1 | telegram.py | Write retrying state during network error retries |
Environment
- hermes-agent version: latest main (May 20, 2026)
- OS: macOS 26.5
- Python: 3.12