fix(gateway): cancel active runs during shutdown #1427

Merged
teknium1 merged 1 commit into main from fix/1414-gateway-shutdown-restart
Mar 15, 2026
teknium1 (Contributor) commented Mar 15, 2026

Summary

  • track background message-processing tasks spawned by platform adapters
  • interrupt running agents and cancel adapter background tasks during gateway shutdown before adapters disconnect
  • clear shutdown-time pending session state and add regression coverage for restart/shutdown behavior

What this addresses

Issue #1414 reports that after stopping a busy gateway and restarting with hermes gateway run --replace, the old task can appear to keep going, task/progress labels can flicker, and the restarted gateway can fall into a bad state while the previous in-flight work is still unwinding.

I did not reproduce the exact OpenRouter 502 sequence deterministically, but I did isolate a concrete shutdown bug on current main:

  • platform adapters spawn background message-processing tasks and do not track them
  • GatewayRunner.stop() disconnects adapters but does not cancel those tasks
  • GatewayRunner.stop() also does not interrupt agents already recorded in _running_agents

That means an old gateway instance can keep working on in-flight message tasks during shutdown/replacement instead of being cleanly quiesced first.
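The shutdown ordering described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SketchAdapter`, `SketchAgent`, `SketchRunner`, `spawn()`, and `cancel_background_tasks()` are hypothetical names, and only `_running_agents` and the idea of a `stop()` method come from the description. It assumes adapters start their message handlers with `asyncio.create_task`.

```python
import asyncio


class SketchAdapter:
    """Hypothetical platform adapter that tracks its background tasks."""

    def __init__(self):
        self._background_tasks = set()
        self.connected = True

    def spawn(self, coro):
        # Keep a reference to every task so shutdown can find and cancel it.
        task = asyncio.create_task(coro)
        self._background_tasks.add(task)
        # Drop the reference automatically once the task finishes.
        task.add_done_callback(self._background_tasks.discard)
        return task

    async def cancel_background_tasks(self):
        tasks = list(self._background_tasks)
        for task in tasks:
            task.cancel()
        # Wait for the cancellations to be delivered; swallow CancelledError.
        await asyncio.gather(*tasks, return_exceptions=True)
        self._background_tasks.clear()

    async def disconnect(self):
        self.connected = False


class SketchAgent:
    """Hypothetical agent exposing a cooperative interrupt flag."""

    def __init__(self):
        self.interrupted = False

    def interrupt(self):
        self.interrupted = True


class SketchRunner:
    """Hypothetical stand-in for the gateway runner."""

    def __init__(self, adapters):
        self.adapters = adapters
        self._running_agents = {}

    async def stop(self):
        # 1. Interrupt agents that are still working on a session.
        for agent in list(self._running_agents.values()):
            agent.interrupt()
        # 2. Cancel in-flight message tasks, 3. only then disconnect.
        for adapter in self.adapters:
            await adapter.cancel_background_tasks()
            await adapter.disconnect()
        # 4. Clear leftover per-session state.
        self._running_agents.clear()


async def demo():
    adapter = SketchAdapter()
    runner = SketchRunner([adapter])
    agent = SketchAgent()
    runner._running_agents["session-1"] = agent
    task = adapter.spawn(asyncio.sleep(3600))  # simulated long-running handler
    await asyncio.sleep(0)  # let the task start
    await runner.stop()
    return task.cancelled(), agent.interrupted, adapter.connected
```

Running `asyncio.run(demo())` in this sketch shows the simulated handler cancelled and the agent interrupted before the adapter is marked disconnected, which is the quiescing order the fix aims for.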

Test plan

  • source .venv/bin/activate && python -m pytest tests/gateway/test_gateway_shutdown.py -n0 -q
  • source .venv/bin/activate && python -m pytest tests/gateway/test_gateway_shutdown.py tests/gateway/test_interrupt_key_match.py tests/gateway/test_telegram_documents.py tests/gateway/test_telegram_photo_interrupts.py -n0 -q
  • source .venv/bin/activate && python -m pytest tests/gateway/ tests/hermes_cli/test_gateway.py -n0 -q

Track adapter background message-processing tasks, cancel them during gateway shutdown, and interrupt running agents before disconnecting adapters. This prevents old gateway instances from continuing in-flight work after stop/replace, which was contributing to the restart-time task continuation/flicker behavior reported in #1414. Adds regression coverage for adapter task cancellation and shutdown interrupts.
teknium1 merged commit 5254d0b into main on Mar 15, 2026
1 check passed