
fix(gateway): verify launchd restart relaunches #27691

Open
gbondy2 wants to merge 1 commit into NousResearch:main from gbondy2:fix/launchd-restart-verification

gbondy2 commented May 18, 2026

Summary

  • Verify macOS launchd gateway self-restart actually relaunches with a new PID
  • Fall back to launchctl kickstart/bootstrap when the gateway does not come back
  • Add regression coverage for unloaded/missing launchd job recovery and host-independent systemd tests
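The verify-then-fall-back flow in the summary could be sketched roughly as follows. This is an illustration only, not the code in this PR: `wait_for_new_pid`, `restart_with_fallback`, and every callable passed in (`get_pid`, `request_self_restart`, `fallback`) are hypothetical names, and the timeout and polling interval are arbitrary assumptions.

```python
import time
from typing import Callable, Optional


def wait_for_new_pid(
    old_pid: int,
    get_pid: Callable[[], Optional[int]],
    timeout: float = 10.0,
    interval: float = 0.25,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Poll until the gateway reports a PID different from old_pid.

    Returns the new PID, or None if the deadline passes first.
    get_pid is a hypothetical probe (for example, reading a PID file).
    """
    deadline = clock() + timeout
    while True:
        pid = get_pid()
        if pid is not None and pid != old_pid:
            return pid  # a different process is now running
        if clock() >= deadline:
            return None  # gateway never came back under a new PID
        sleep(interval)


def restart_with_fallback(
    old_pid: int,
    get_pid: Callable[[], Optional[int]],
    request_self_restart: Callable[[int], bool],
    fallback: Callable[[], bool],
    **wait_kwargs,
) -> bool:
    """Ask the gateway to restart itself, then verify it actually relaunched.

    fallback stands in for the launchctl kickstart/bootstrap path and is
    only invoked when the self-restart is refused or cannot be verified.
    """
    if request_self_restart(old_pid):
        if wait_for_new_pid(old_pid, get_pid, **wait_kwargs) is not None:
            return True
    return fallback()
```

Injecting `clock` and `sleep` keeps the polling loop testable without real delays, which is one way to keep such tests independent of the host.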

Test Plan

  • python -m pytest tests/hermes_cli/test_gateway_service.py tests/hermes_cli/test_update_gateway_restart.py -q -o 'addopts='
  • Result: 180 passed

alt-glitch added the labels type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/gateway (Gateway runner, session dispatch, delivery), and comp/cli (CLI entry point, hermes_cli/, setup wizard) on May 18, 2026

teknium1 (Contributor)

Thanks for the focused launchd restart fix. I verified the premise still exists on current main: launchd_restart() returns immediately after _request_gateway_self_restart(pid) succeeds at hermes_cli/gateway.py:3667-3669, and the current test still asserts launchctl should not run in that path at tests/hermes_cli/test_gateway_service.py:697-718.

Problems

  • The new fallback helper in this PR only treats launchctl exit codes 3 and 113 as an unloaded job. Current main now intentionally treats 125 as unloaded too, and falls back to a detached background launch when launchd cannot manage the domain (exit code 5, or a persistent 125) at hermes_cli/gateway.py:3209-3218 and hermes_cli/gateway.py:3683-3706. A salvage should preserve that newer handling while adding the relaunch verification.
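To make the exit-code concern concrete, here is a minimal sketch of the decision described above. It is one reading of the comment, not the logic in hermes_cli/gateway.py: the names `Recovery` and `next_step` are hypothetical, and treating "persistent 125" as "125 again after a bootstrap attempt" is an assumption.

```python
from enum import Enum
from typing import Optional

# Codes named in the review: 3 and 113 (this PR), plus 125 (current main).
UNLOADED_CODES = frozenset({3, 113, 125})
# Code the review associates with launchd being unable to manage the domain.
DOMAIN_UNSUPPORTED_CODE = 5


class Recovery(Enum):
    DONE = "done"            # nothing further to do
    BOOTSTRAP = "bootstrap"  # job looks unloaded; try loading it again
    DETACHED = "detached"    # launchd cannot help; launch in the background
    FAIL = "fail"            # unrecognised error; surface it


def next_step(kickstart_rc: int, bootstrap_rc: Optional[int] = None) -> Recovery:
    """Pick the next recovery action from launchctl exit codes.

    bootstrap_rc is None until a bootstrap has actually been attempted.
    """
    if kickstart_rc == 0:
        return Recovery.DONE
    if kickstart_rc == DOMAIN_UNSUPPORTED_CODE:
        return Recovery.DETACHED
    if kickstart_rc not in UNLOADED_CODES:
        return Recovery.FAIL
    if bootstrap_rc is None:
        return Recovery.BOOTSTRAP
    if bootstrap_rc == 0:
        return Recovery.DONE
    # Assumption: a 5 or a repeated 125 here means the domain is unusable.
    if bootstrap_rc in (DOMAIN_UNSUPPORTED_CODE, 125):
        return Recovery.DETACHED
    return Recovery.FAIL
```

With a hard-coded `{3, 113}` set, `next_step(125)` would return `Recovery.FAIL` instead of `Recovery.BOOTSTRAP`, which is the regression the review warns about.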

Suggested changes

  • Integrate the relaunch/new-PID wait into current main’s launchd_restart(), but reuse _launchd_error_indicates_unloaded() and _launchctl_domain_unsupported() rather than hard-coding {3, 113}.
  • Update tests/hermes_cli/test_gateway_service.py:697-718 so the successful self-restart path verifies relaunch behavior instead of asserting launchctl is never invoked, and keep coverage for the current 5/125 detached fallback.
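The second suggestion might translate into tests shaped like the sketch below. The `restart` function is a hypothetical stand-in written only so the example runs on its own; it is not `launchd_restart()`, and the service target `gui/501/example.gateway` is a made-up placeholder.

```python
from unittest.mock import Mock


def restart(pid, request_self_restart, wait_for_relaunch, run_launchctl):
    """Hypothetical stand-in for the restart flow (not the hermes_cli code)."""
    if request_self_restart(pid) and wait_for_relaunch(pid):
        return True
    # Fall back to launchctl only when the relaunch could not be verified.
    return run_launchctl(["kickstart", "-k", "gui/501/example.gateway"]) == 0


def test_successful_self_restart_is_verified():
    wait = Mock(return_value=True)
    launchctl = Mock()
    assert restart(42, Mock(return_value=True), wait, launchctl)
    # The key assertion is that relaunch was checked, not merely that
    # launchctl stayed idle.
    wait.assert_called_once_with(42)
    launchctl.assert_not_called()


def test_unverified_relaunch_falls_back_to_launchctl():
    wait = Mock(return_value=False)
    launchctl = Mock(return_value=0)
    assert restart(42, Mock(return_value=True), wait, launchctl)
    launchctl.assert_called_once()
```

Mocking the relaunch check and the launchctl runner keeps both paths independent of the machine running the suite.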

This is an automated hermes-sweeper review.

