
fix(gateway): retry startup auto-resume when a failed platform reconnects #39018

Merged
teknium1 merged 1 commit into main from hermes/hermes-4652f229 on Jun 4, 2026

@teknium1 teknium1 commented Jun 4, 2026

Contributor

Summary

A messaging platform that was offline at gateway startup now gets its restart-interrupted sessions auto-resumed the moment it reconnects, instead of waiting for a manual user message.

Root cause: the documented startup auto-resume (_schedule_resume_pending_sessions()) runs once at startup and silently skips any session whose adapter isn't connected yet. Those sessions were never rescheduled, so a late-connecting platform's interrupted sessions only recovered when the user sent a fresh message.
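The failure mode described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the actual gateway code: the `ToyGateway` class, its fields, and the platform names `telegram` and `discord` are all hypothetical and exist only to show why a one-shot pass strands sessions.

```python
# Hypothetical, simplified model of the pre-fix behaviour (not the real gateway/run.py).
from dataclasses import dataclass, field


@dataclass
class ToyGateway:
    # platform name -> True if its adapter is currently connected (assumed shape)
    connected: dict = field(default_factory=dict)
    # session_key -> platform, for sessions interrupted by the restart (assumed shape)
    pending: dict = field(default_factory=dict)
    # session keys that were auto-resumed, in order
    resumed: list = field(default_factory=list)

    def schedule_resume_pending_sessions(self) -> None:
        """One-shot startup pass: resume reachable sessions, skip the rest."""
        for session_key, platform in list(self.pending.items()):
            if not self.connected.get(platform, False):
                # Silently skipped, and nothing reschedules it later.
                continue
            self.resumed.append(session_key)
            del self.pending[session_key]


gw = ToyGateway(
    connected={"telegram": True, "discord": False},
    pending={"tg:1": "telegram", "dc:1": "discord"},
)
gw.schedule_resume_pending_sessions()  # runs once at startup
gw.connected["discord"] = True         # the platform reconnects later
# No second pass is triggered, so "dc:1" is still waiting in gw.pending.
```

In this model the only way `dc:1` leaves `pending` is some other trigger, which matches the PR's observation that recovery depended on a fresh user message.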

Changes

  • gateway/run.py: _schedule_resume_pending_sessions() gains an optional platform filter and an in-flight guard (session_key in self._running_agents) so a session can't be resumed twice.
  • gateway/run.py: the reconnect-watcher success path re-runs the auto-resume scoped to the reconnected platform (wrapped in try/except so a reschedule failure can't break reconnect).
  • tests/gateway/: 3 new regression tests (late reconnect reschedules, platform-scoping, running-agent skip).
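The shape of these changes can be sketched as follows. This is an illustrative sketch only: `_schedule_resume_pending_sessions` and `_running_agents` are named in the PR, but the `ToyGateway` class, the `on_platform_reconnected` hook, the data layout, and the platform names are assumptions made up for this example and may not match the real code in gateway/run.py.

```python
# Hypothetical sketch of the fix; only _schedule_resume_pending_sessions and
# _running_agents come from the PR text, everything else is invented here.
import logging
from typing import Optional

logger = logging.getLogger(__name__)


class ToyGateway:
    def __init__(self, connected, pending):
        self.connected = dict(connected)  # platform -> bool (assumed shape)
        self.pending = dict(pending)      # session_key -> platform (assumed shape)
        self._running_agents = {}         # session_key -> agent handle
        self.resumed = []

    def _schedule_resume_pending_sessions(self, platform: Optional[str] = None) -> None:
        for session_key, session_platform in list(self.pending.items()):
            if platform is not None and session_platform != platform:
                continue  # platform filter: only touch the reconnected platform
            if not self.connected.get(session_platform, False):
                continue  # adapter still offline, try again on a later reconnect
            if session_key in self._running_agents:
                continue  # in-flight guard: never resume a session twice
            self._running_agents[session_key] = object()
            self.resumed.append(session_key)
            del self.pending[session_key]

    def on_platform_reconnected(self, platform: str) -> None:
        """Stand-in for the reconnect-watcher success path."""
        self.connected[platform] = True
        try:
            self._schedule_resume_pending_sessions(platform=platform)
        except Exception:
            # A reschedule failure must not break the reconnect itself.
            logger.exception("auto-resume reschedule failed for %s", platform)


gw = ToyGateway(
    connected={"telegram": True, "discord": False},
    pending={"tg:1": "telegram", "dc:1": "discord", "dc:2": "discord"},
)
gw._running_agents["dc:2"] = object()   # already being handled elsewhere
gw._schedule_resume_pending_sessions()  # startup pass: resumes tg:1 only
gw.on_platform_reconnected("discord")   # scoped pass: resumes dc:1, skips dc:2
```

The three cases exercised here mirror the three regression tests the PR lists: a late reconnect reschedules, the pass is scoped to one platform, and a session with a running agent is skipped.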

Validation

  • Targeted suite: 98/98 pass (tests/gateway/test_restart_resume_pending.py, tests/gateway/test_platform_reconnect.py).
  • Premise confirmed against current main: startup pass is one-shot and skips disconnected adapters.

Salvage of #37669 by @Frowtek — cherry-picked onto current main, authorship preserved. Closes #37669.

Infographic

[Image: gateway-auto-resume-on-reconnect]


github-actions Bot commented Jun 4, 2026

Contributor

🔎 Lint report: hermes/hermes-4652f229 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9801 on HEAD, 9792 on base (🆕 +9)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 5085 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

teknium1 merged commit 71a9f44 into main on Jun 4, 2026
23 checks passed
teknium1 deleted the hermes/hermes-4652f229 branch on June 4, 2026 12:56
alt-glitch added the labels type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/gateway (Gateway runner, session dispatch, delivery) on Jun 4, 2026