Description
When the QQ Bot WebSocket gateway connection drops and cannot be re-established within MAX_RECONNECT_ATTEMPTS (100 attempts), _listen_loop() in gateway/platforms/qqbot/adapter.py returns silently and does not notify the GatewayRunner that the adapter has died.
This causes the gateway to believe the QQ platform is still "connected" (the last _mark_connected() state persists), while the adapter is actually dead. The gateway process keeps running, but no messages are received or sent via QQ.
The systemd Restart=on-failure policy never triggers because the process does not exit.
Steps to reproduce
- Run Hermes gateway with QQ bot enabled
- QQ WebSocket disconnects (network block or server-side disconnect)
- Reconnect attempts exhaust after ~100 × 60s backoff loop (~100 minutes)
- _listen_loop() returns without error: the adapter is dead but the gateway does not know
- Gateway shows qqbot: connected in gateway_state.json despite the adapter being dead
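For reference, the "~100 minutes" figure can be checked with a quick calculation. This is only a sketch: it assumes the delay index clamps at the last entry of the backoff list once the list is exhausted, and it counts sleep time only (connection timeouts would add to it). The function name is made up for illustration.

```python
# Backoff schedule and attempt limit as listed in the Environment section.
RECONNECT_BACKOFF = [2, 5, 10, 30, 60]  # seconds
MAX_RECONNECT_ATTEMPTS = 100


def total_backoff_seconds(schedule, attempts):
    """Sum the sleep time over `attempts` retries.

    Assumption: once the schedule is exhausted, every further retry
    reuses the last (largest) delay.
    """
    last = len(schedule) - 1
    return sum(schedule[min(i, last)] for i in range(attempts))


total = total_backoff_seconds(RECONNECT_BACKOFF, MAX_RECONNECT_ATTEMPTS)
print(f"{total} s, about {total / 60:.0f} minutes")
```

Under those assumptions the adapter goes silent a little over an hour and a half after the initial disconnect.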
Expected behavior
When the QQ adapter reconnect loop exhausts, it should call self._set_fatal_error() or otherwise notify the GatewayRunner so that:
- The platform state is marked as disconnected or fatal
- The gateway runner marks the adapter as dead
- (Optionally) the gateway exits so systemd can restart it
Root cause
In gateway/platforms/qqbot/adapter.py, _listen_loop():
    if backoff_idx >= MAX_RECONNECT_ATTEMPTS:
        logger.error("[%s] Max reconnect attempts reached", self._log_tag)
        return  # ← Silent return, gateway runner not notified
No call to _set_fatal_error() or equivalent to propagate the failure upward.
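A possible shape for the fix, as a self-contained sketch rather than a patch against the real adapter. SketchAdapter, the injected connect/sleep callables, the "reconnect_exhausted" code, and the (code, message) signature of _set_fatal_error() are all assumptions made for illustration; the real helper and the real loop body may look different.

```python
import asyncio
import logging

logger = logging.getLogger("qqbot-sketch")

MAX_RECONNECT_ATTEMPTS = 100
RECONNECT_BACKOFF = [2, 5, 10, 30, 60]  # seconds


class SketchAdapter:
    """Hypothetical stand-in for the QQ adapter, not the real Hermes class."""

    def __init__(self, connect, sleep=asyncio.sleep,
                 max_attempts=MAX_RECONNECT_ATTEMPTS):
        self._connect = connect        # coroutine function, raises on failure
        self._sleep = sleep            # injectable so a test need not wait
        self._max_attempts = max_attempts
        self._log_tag = "qqbot"
        self.fatal_error = None        # what a runner could inspect

    def _set_fatal_error(self, code, message):
        # Assumed signature; here it only records the failure.
        self.fatal_error = (code, message)

    async def _listen_loop(self):
        backoff_idx = 0
        while True:
            try:
                await self._connect()
                return  # connected; the real loop would read frames here
            except ConnectionError as exc:
                logger.warning("[%s] Connect failed: %s", self._log_tag, exc)

            if backoff_idx >= self._max_attempts:
                logger.error("[%s] Max reconnect attempts reached",
                             self._log_tag)
                # The change: report the failure before returning.
                self._set_fatal_error(
                    "reconnect_exhausted",
                    "QQ WebSocket reconnect attempts exhausted",
                )
                return

            # Clamp to the last delay once the schedule is exhausted.
            delay = RECONNECT_BACKOFF[min(backoff_idx,
                                          len(RECONNECT_BACKOFF) - 1)]
            backoff_idx += 1
            await self._sleep(delay)
```

With something like this in place, the runner has a recorded failure to act on (mark the platform as fatal, or exit so that systemd restarts the process) instead of a stale connected state.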
Environment
- Hermes Agent version: main (latest, b7e71fb)
- QQ Bot adapter version: 1.1.0
- Platform: Linux, systemd user service
- QQ Bot WebSocket reconnect backoff: [2, 5, 10, 30, 60] seconds
- MAX_RECONNECT_ATTEMPTS: 100