fix(gateway): exit code 75 on service restart so launchd relaunches #28200

Closed
zccyman wants to merge 1 commit into NousResearch:main from atyou2happy:fix/launchd-restart-exit-code-28135

Conversation

@zccyman

@zccyman zccyman commented May 18, 2026

Contributor

Problem

When the gateway receives SIGUSR1 (graceful restart via launchd_restart), the handler calls request_restart(via_service=True) and the gateway shuts down cleanly with exit code 0.

However, the generated launchd plist uses KeepAlive → SuccessfulExit → false, meaning launchd only relaunches on non-zero exit codes. A clean exit(0) is treated as "successful, don't restart", so the gateway stays down after /restart, /update, or SIGUSR1.
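The restart policy described above can be sketched in Python. This is only an illustration: the `Label` and `ProgramArguments` values are placeholders, not the ones the project generates, and `launchd_would_relaunch` is a hypothetical helper that models the `SuccessfulExit` rule alone (it ignores crashes, signals, and throttling).

```python
import plistlib

# Hypothetical job definition; only the KeepAlive block reflects the PR text.
job = {
    "Label": "com.example.gateway",         # assumed label, for illustration
    "ProgramArguments": ["/usr/bin/true"],  # placeholder command
    "KeepAlive": {"SuccessfulExit": False},
}

# Serialize to the XML plist format that launchd reads.
plist_xml = plistlib.dumps(job).decode("utf-8")


def launchd_would_relaunch(exit_code: int, keep_alive: dict) -> bool:
    """Simplified model of KeepAlive.SuccessfulExit.

    With SuccessfulExit false, the job is relaunched only after a
    non-zero exit; with SuccessfulExit true, only after a zero exit.
    """
    exited_successfully = exit_code == 0
    return exited_successfully == keep_alive["SuccessfulExit"]


if __name__ == "__main__":
    print(plist_xml)
    for code in (0, 75):
        print(code, launchd_would_relaunch(code, job["KeepAlive"]))
```

Under this model, exit code 0 leaves the job stopped while exit code 75 triggers a relaunch, which is the behavior the PR relies on.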

The systemd unit template already uses RestartForceExitStatus=75 for the same scenario — the launchd side has no equivalent.

Closes #28135

Fix

When _restart_via_service is True, raise SystemExit(75) instead of returning True. This mirrors the systemd RestartForceExitStatus=75 convention so launchd's SuccessfulExit=false policy triggers a relaunch.
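A minimal sketch of the guard, assuming a shutdown routine that otherwise returns `True`. The function name `finish_shutdown` and the constant `RESTART_EXIT_CODE` are hypothetical; the actual change lives in `gateway/run.py` and may be structured differently.

```python
# Hypothetical constant; the value matches systemd's RestartForceExitStatus=75.
RESTART_EXIT_CODE = 75


def finish_shutdown(restart_via_service: bool) -> bool:
    """Final step of a clean shutdown (illustrative only).

    When the restart was requested through the service manager, exit
    with a non-zero code so that a SuccessfulExit=false policy relaunches
    the process. Otherwise report a normal, successful shutdown.
    """
    if restart_via_service:
        raise SystemExit(RESTART_EXIT_CODE)
    return True


if __name__ == "__main__":
    print(finish_shutdown(False))  # normal path: no exit raised
    finish_shutdown(True)          # process exits with status 75
```

Raising `SystemExit` rather than calling `os._exit` lets `finally` blocks and `atexit` handlers run, so the shutdown stays clean while the exit status still signals "restart me" to the service manager.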

Files Changed

  • gateway/run.py — +13 lines: add exit-code-75 guard before the final return True

Testing

  • 132/132 tests passed in tests/hermes_cli/test_gateway_service.py
  • The launchd_restart and exit-code-75 paths are already covered by existing tests

@alt-glitch added the labels type/bug (Something isn't working), comp/gateway (Gateway runner, session dispatch, delivery), and P2 (Medium: degraded but workaround exists) on May 18, 2026
@alt-glitch

Collaborator

Duplicate of #24993 — both fix the launchd SIGUSR1 restart path that exits 0 instead of non-zero, causing KeepAlive.SuccessfulExit:false to not relaunch. Different approach (exit code 75 vs kickstart), but same root cause as #11932.

@outsourc-e outsourc-e left a comment

Contributor

Reviewed locally. The touched change is narrow and matches the existing systemd restart-code convention. py_compile passed, the launchd/restart-focused status tests passed, and the broad tests/hermes_cli/test_gateway_service.py failures reproduce on main in this macOS environment, so they are baseline noise rather than introduced by this PR.

@teknium1

Contributor

Merged via PR #28341 (cherry-picked onto current main with your authorship preserved via rebase-merge — commit 5987b24). Thanks for the contribution!

@teknium1 teknium1 closed this May 19, 2026
Development

Successfully merging this pull request may close these issues.

hermes update via gateway leaves launchd service unrestarted on macOS

4 participants