Fixes the Docker/s6 gateway auto-start regression from #42675.
When the gateway receives an unexpected external SIGTERM, start_gateway() already exits non-zero so a service manager can revive it. However, the gateway teardown still persisted gateway_state="stopped". In Docker s6 deployments, container_boot only auto-starts profile gateway services whose previous state is running, so a container restart or image upgrade could leave messaging channels dark even though the gateway had been running before the supervisor stopped it.
This PR preserves gateway_state="running" for unexpected signal-initiated shutdowns while keeping intentional stops and planned restarts as stopped.
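To make the failure mode concrete, here is a minimal sketch of the boot-time reconciliation described above. The function name `services_to_autostart` and the dictionary shape are assumptions for illustration only; the real `container_boot` logic is not shown in this PR and may differ.

```python
from typing import Dict, List


def services_to_autostart(previous_states: Dict[str, str]) -> List[str]:
    """Return the profiles whose gateway should be started on container boot.

    Hypothetical stand-in for the container_boot reconciliation: only
    profiles whose persisted gateway_state is "running" are revived.
    """
    return sorted(
        name for name, state in previous_states.items() if state == "running"
    )


# Before the fix: an external SIGTERM persisted "stopped", so nothing is revived.
before_fix = {"default": "stopped"}

# After the fix: the unexpected signal leaves "running" in place.
after_fix = {"default": "running", "archived": "stopped"}

print(services_to_autostart(before_fix))  # []
print(services_to_autostart(after_fix))   # ['default']
```

Under these assumptions, a gateway that was running before the supervisor stopped it is only revived if its last persisted state is still `running`, which is what this PR changes.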
Type of Change
🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
🔒 Security fix
📝 Documentation update
✅ Tests (adding or improving test coverage)
♻️ Refactor (no behavior change)
🎯 New skill (bundled or hub)
Changes Made
gateway/run.py
Track unexpected signal-initiated shutdowns separately from planned stops/restarts.
Preserve the persisted runtime gateway state as running for those external signal shutdowns so Docker/s6 reconciliation can bring the gateway back up on the next container boot.
Keep planned stop/restart behavior unchanged.
tests/gateway/test_gateway_shutdown.py
Add regression coverage that signal-preserved shutdowns persist running.
tests/gateway/restart_test_helpers.py
Initialize the new runner test flag in the object-constructed test helper.
How to Test
Reproduce the bug condition: a running Docker/s6 gateway receives an external SIGTERM during container restart/upgrade; prior behavior writes gateway_state="stopped", so container_boot registers the service down on the next boot.
Verify the fix: unexpected signal shutdown now sets the preserve flag and final runtime status remains running; planned stops/restarts still use the existing paths.
✅ Verified — signal-initiated shutdown preserves running state correctly
Reviewed the _preserve_gateway_running_state_on_stop flag flow across shutdown_signal_handler → _stop_impl.
Flag initialization: __init__ defaults to False; restart_test_helpers.py mirror updated. Confirmed in gateway/run.py:1956.
Signal-only activation: The flag is set to True only in the else branch of shutdown_signal_handler (non-restart signal path), correctly excluding restart-initiated shutdowns where _restart_requested=True already controls the state.
State logic: _update_runtime_status receives "running" only when the flag is set AND _restart_requested is False. The not self._restart_requested guard prevents the flag from interfering with normal restart flows.
Test coverage: test_unexpected_signal_shutdown_preserves_running_state_for_service_revival verifies the "running" status is written and the shutdown event is still set (process actually stops).
The fix is correct: with "running" persisted after an unexpected SIGTERM/SIGINT, Docker/s6 container_boot will auto-start the gateway on the next boot, and the non-zero exit still lets a service manager (launchd, systemd) revive the process. No issues found.
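The flag flow reviewed above can be sketched as a toy model. The attribute and method names are taken from the review; the class itself, the `restart` parameter, the `stop()` method, and the direct call from the handler into `_stop_impl` are simplifications assumed for illustration and are not the real `gateway/run.py` code.

```python
class GatewayRunnerSketch:
    """Toy model of the shutdown flag flow; not the real gateway runner."""

    def __init__(self) -> None:
        self._restart_requested = False
        # New flag from this PR; defaults to False.
        self._preserve_gateway_running_state_on_stop = False
        self.persisted_state = "running"

    def shutdown_signal_handler(self, restart: bool = False) -> None:
        # `restart` is a hypothetical parameter standing in for however the
        # real handler tells a restart request from a bare external signal.
        if restart:
            self._restart_requested = True
        else:
            # Only the non-restart signal path sets the preserve flag.
            self._preserve_gateway_running_state_on_stop = True
        self._stop_impl()

    def stop(self) -> None:
        # Intentional stop: the preserve flag is never set on this path.
        self._stop_impl()

    def _stop_impl(self) -> None:
        preserve = (
            self._preserve_gateway_running_state_on_stop
            and not self._restart_requested
        )
        self._update_runtime_status("running" if preserve else "stopped")

    def _update_runtime_status(self, state: str) -> None:
        self.persisted_state = state


unexpected = GatewayRunnerSketch()
unexpected.shutdown_signal_handler()

planned_restart = GatewayRunnerSketch()
planned_restart.shutdown_signal_handler(restart=True)

intentional = GatewayRunnerSketch()
intentional.stop()

print(unexpected.persisted_state)       # running
print(planned_restart.persisted_state)  # stopped
print(intentional.persisted_state)      # stopped
```

In this sketch only the bare-signal path ends with `running` persisted; the planned restart and the intentional stop both write `stopped`, matching the behavior the PR description states.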
Superseded by #43236 (merged), which fixes #42675 by persisting running on a signal-initiated shutdown — the same end goal as this PR (_preserve_gateway_running_state_on_stop), but gated on the planned-stop marker classification rather than a 'was it a signal' flag set in the handler.
The distinction matters for the cases in #42517's table: an external kill/OOM under systemd is also a bare signal, and the marker is what separates those from an intentional stop. #43236 additionally wires the marker into the s6 stop() path so in-container hermes gateway stop is classified correctly, and verifies the whole thing against a real docker restart (fails on a pre-fix image, passes on the fixed one). Appreciate the work here.
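For contrast, a marker-based classification might look roughly like the sketch below. This is not the code from #43236: the marker being a file, its name `planned_stop`, and the function `state_to_persist` are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def state_to_persist(planned_stop_marker: Path) -> str:
    """Classify a shutdown by whether a planned-stop marker is present.

    Hypothetical rule: an intentional stop writes the marker first, so any
    shutdown without it (external kill, OOM, supervisor SIGTERM) keeps
    "running" as the persisted state.
    """
    return "stopped" if planned_stop_marker.exists() else "running"


with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "planned_stop"  # hypothetical marker name

    # No marker: the signal was not requested by the operator.
    external_kill = state_to_persist(marker)

    # An intentional stop drops the marker before signalling the gateway.
    marker.touch()
    intentional_stop = state_to_persist(marker)

print(external_kill, intentional_stop)  # running stopped
```

The design point is that the decision depends on what was planned rather than on how the process was told to exit, so a bare signal from an external kill is not mistaken for an intentional stop.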
Labels:
area/docker: Docker image, Compose, packaging
backend/docker: Docker container execution
comp/gateway: Gateway runner, session dispatch, delivery
P1: High, major feature broken, no workaround
type/bug: Something isn't working
Related Issue
Fixes #42675
How to Test
Commands run, with the results reported in the PR:
.venv/bin/python -m pytest -o 'addopts=' tests/gateway/test_gateway_shutdown.py -q → 15 passed in 35.06s
.venv/bin/python -m compileall -q gateway/run.py tests/gateway/test_gateway_shutdown.py tests/gateway/restart_test_helpers.py
.venv/bin/python -m ruff check gateway/run.py tests/gateway/test_gateway_shutdown.py tests/gateway/restart_test_helpers.py → All checks passed!
git diff --check