Summary
On a launchd-managed gateway (macOS), the /restart slash command (and hermes gateway restart via the same path) stops the gateway but never relaunches it — the gateway exits 0, and the generated plist's KeepAlive.SuccessfulExit=false treats a clean exit as success, so launchd does not revive it. The agent goes silently unreachable until a manual launchctl kickstart.
Environment
- Hermes v0.16.0 (2026.6.5), upstream 49dd776d
- macOS 26.5.1 (launchd), gateway run as a per-user LaunchAgent (KeepAlive.SuccessfulExit=false, RunAtLoad=true)
Reproduction
- Profile gateway managed by launchd (the default macOS install).
- In Discord (or any platform): send /restart.
- Gateway stops; nothing relaunches it. /research-ops, /restart, etc. get no reply. launchctl list shows pid = -.
- Recovery: launchctl kickstart -k gui/$(id -u)/ai.hermes.gateway-<profile>.
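The recovery step can be scripted. The sketch below only builds the launchctl argument list, using the label pattern ai.hermes.gateway-<profile> from the recovery command above. The function names gateway_label and kickstart_command are hypothetical helpers for illustration, not part of Hermes.

```python
import os
from typing import List, Optional


def gateway_label(profile: str) -> str:
    """Return the launchd label, following the pattern in the recovery command."""
    return f"ai.hermes.gateway-{profile}"


def kickstart_command(profile: str, uid: Optional[int] = None) -> List[str]:
    """Build (but do not run) the launchctl kickstart invocation."""
    if uid is None:
        uid = os.getuid()  # same value the shell gets from $(id -u)
    return ["launchctl", "kickstart", "-k", f"gui/{uid}/{gateway_label(profile)}"]
```

On macOS the resulting list could be passed to subprocess.run(cmd, check=True); the helper itself has no side effects.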
Log evidence (profile gateway.log)
[Discord] slash '/restart' invoked by user=…
gateway.run: Stopping gateway for restart...
gateway.run: Gateway stopped (total teardown 0.08s)
← nothing. dead until a manual kickstart 12 min later.
Contrast a signal shutdown the same day, which exits non-zero and is revived:
Exiting with code 1 (signal-initiated shutdown without restart request)
so systemd Restart=on-failure can revive the gateway.
The /restart path logs no "Exiting with code …" line — it returns success.
Root cause (gateway/run.py, start_gateway() exit logic)
The exit decision after a planned restart:
if _signal_initiated_shutdown and not runner._restart_requested:
    ...
    return False  # → sys.exit(1) (signal path: revived)
if runner._restart_via_service:
    raise SystemExit(75)  # systemd path: revived
return True  # ← /restart on launchd lands HERE → exit 0
A slash-command /restart sets _restart_requested=True but sets neither _signal_initiated_shutdown nor _restart_via_service (the latter is only set on systemd installs). It therefore falls through to return True → main() sees success → process exits 0.
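The fall-through can be checked with a small standalone model of the excerpt. This is a simplified sketch, not the real start_gateway(): the function name gateway_exit_code is hypothetical, and the three booleans stand in for the flags named above.

```python
def gateway_exit_code(signal_initiated: bool,
                      restart_requested: bool,
                      restart_via_service: bool) -> int:
    """Model of the quoted exit decision; returns the process exit status."""
    if signal_initiated and not restart_requested:
        return 1   # start_gateway() returns False -> sys.exit(1)
    if restart_via_service:
        return 75  # raise SystemExit(75), the systemd path
    return 0       # start_gateway() returns True -> clean exit


# Slash-command /restart on a launchd install: only restart_requested is set.
print(gateway_exit_code(False, True, False))  # prints 0
```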
The restart dispatch itself (gateway/run.py, the _restart_via_service branch) already documents the gap:
if self._restart_requested and self._restart_via_service:
    self._launch_systemd_restart_shortcut()
# ... launchd's KeepAlive.SuccessfulExit=false needs a non-zero exit to [revive]
…but there is no launchd-equivalent branch for the non-systemd case, so launchd-managed /restart exits clean and launchd (correctly, per SuccessfulExit=false) declines to revive.
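Why launchd declines to revive the job can be modelled in a few lines. This is a deliberately simplified sketch of the KeepAlive.SuccessfulExit rule only: it ignores throttling, death by signal, and the other KeepAlive keys, and the function name launchd_relaunches is hypothetical.

```python
def launchd_relaunches(exit_status: int, successful_exit: bool = False) -> bool:
    """Simplified KeepAlive.SuccessfulExit rule.

    successful_exit=True  -> relaunch only after a zero exit status.
    successful_exit=False -> relaunch only after a non-zero exit status.
    """
    exited_cleanly = exit_status == 0
    return exited_cleanly == successful_exit


# With the generated plist (SuccessfulExit=false):
for status in (0, 1, 75):
    print(status, launchd_relaunches(status))
```

Under this model the exit statuses 1 and 75 lead to a relaunch, while the status 0 produced by /restart does not.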
Impact
- Severity: high. A documented operator command (/restart) takes the agent offline until someone manually kickstarts it, with no error surfaced to the user. Reproduced repeatedly. The agent simply stops responding.
Suggested fix (any one)
- On launchd-managed gateways, a planned /restart should exit non-zero (e.g. reuse SystemExit(75)) so KeepAlive.SuccessfulExit=false relaunches it.
- …or route the launchd /restart through the detached respawn watcher (launch_detached_profile_gateway_restart) the way hermes update does.
- …or launchctl kickstart -k <label> the job directly (the verb that works on macOS 15+, unlike bootout/bootstrap which return exit 5).
Detection of "managed by launchd" already exists in hermes_cli/gateway.py; the exit/respawn path just needs a launchd arm equivalent to the systemd one.
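A minimal sketch of the first option, assuming the exit decision can see whether the gateway is launchd-managed. The managed_by_launchd parameter is a hypothetical stand-in for the existing detection in hermes_cli/gateway.py, and the function is a standalone model rather than a patch against start_gateway().

```python
def exit_code_with_launchd_arm(signal_initiated: bool,
                               restart_requested: bool,
                               restart_via_service: bool,
                               managed_by_launchd: bool) -> int:
    """Exit decision with an added launchd arm (illustrative only)."""
    if signal_initiated and not restart_requested:
        return 1   # existing signal path
    if restart_via_service:
        return 75  # existing systemd path
    if restart_requested and managed_by_launchd:
        # Hypothetical new arm: exit non-zero so that
        # KeepAlive.SuccessfulExit=false makes launchd relaunch the job.
        return 75
    return 0       # ordinary clean shutdown stays a clean exit
```

Reusing 75 keeps one exit status for every planned restart; a plain stop without a restart request still exits 0, so launchd leaves the job down.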