Summary
When running multiple Hermes profiles concurrently on the same machine (each with its own HERMES_HOME and launchd service), several operations are not profile-safe and silently kill other profiles' gateways.
Reproduction
- Install two profiles (e.g. default + hermes-m4-sf-sales), each with its own launchd plist and Telegram bot token.
- Confirm both are running via launchctl list.
- Run hermes gateway stop, hermes gateway restart, or hermes update under any profile.
- Observe that all gateway processes across all profiles receive SIGTERM.
Bug 1: find_gateway_pids() finds ALL gateway processes
File: hermes_cli/gateway.py — find_gateway_pids()
The function runs ps aux and matches any process whose command line contains "hermes_cli.main gateway". It does not filter by HERMES_HOME. This means:
- hermes gateway stop kills every gateway on the machine
- hermes gateway restart kills every gateway, then only restarts its own
- hermes update kills every gateway, then only restarts its own
- Any agent-initiated hermes gateway stop (e.g. from a running session) kills all profiles
Fix: Filter PIDs by their HERMES_HOME environment variable. On macOS the environment can be read with ps -E; on Linux, from /proc/PID/environ. Only include PIDs whose HERMES_HOME matches the current profile.
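A minimal sketch of that filter, assuming the candidate PIDs have already been collected the way find_gateway_pids() does today. The helper names (parse_environ, read_linux_environ, same_home, filter_pids_by_home) are hypothetical and not part of Hermes. Only the Linux reader is shown; a macOS reader would have to parse ps -E output and could be passed in through read_env. Processes with no HERMES_HOME in their environment are skipped here, so how the default profile is identified is left open.

```python
import os
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Optional

EnvReader = Callable[[int], Optional[Dict[str, str]]]


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse a NUL-separated KEY=VALUE block (the /proc/<pid>/environ format)."""
    env: Dict[str, str] = {}
    for entry in raw.split(b"\0"):
        key, sep, value = entry.partition(b"=")
        if sep and key:
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def read_linux_environ(pid: int) -> Optional[Dict[str, str]]:
    """Return the environment of `pid`, or None if it cannot be read."""
    try:
        return parse_environ(Path(f"/proc/{pid}/environ").read_bytes())
    except OSError:
        return None


def same_home(a: str, b: str) -> bool:
    """Compare two HERMES_HOME values after normalising both paths."""
    def norm(p: str) -> str:
        return os.path.realpath(os.path.expanduser(p))
    return norm(a) == norm(b)


def filter_pids_by_home(
    pids: Iterable[int],
    current_home: str,
    read_env: EnvReader = read_linux_environ,
) -> List[int]:
    """Keep only the PIDs whose HERMES_HOME matches the current profile."""
    kept: List[int] = []
    for pid in pids:
        env = read_env(pid)
        if env is None:
            continue  # environment unreadable: never signal this process
        home = env.get("HERMES_HOME")
        if home is not None and same_home(home, current_home):
            kept.append(pid)
    return kept
```

Skipping unreadable processes is the conservative choice here: a gateway that cannot be attributed to the current profile is left running rather than killed.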
Bug 2: release_all_scoped_locks() nukes all lock files
File: gateway/status.py — release_all_scoped_locks()
Called during --replace to clean up stale locks. It deletes every .lock file in ~/.local/state/hermes/gateway-locks/, including locks actively held by other profiles. This can cause a second profile to lose its Telegram token lock, leading to duplicate polling or fatal errors.
Fix: Release only the locks owned by the calling process's PID or scoped to the calling profile's HERMES_HOME.
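A sketch of a scoped release, under a loud assumption: each .lock file is taken to be a JSON object with "pid" and "hermes_home" fields. The real lock format is not described in this report, so the field names and the function release_own_locks are hypothetical.

```python
import json
from pathlib import Path
from typing import List


def release_own_locks(lock_dir: Path, hermes_home: str, pid: int) -> List[Path]:
    """Delete only the lock files owned by this PID or this profile."""
    removed: List[Path] = []
    for lock_path in sorted(lock_dir.glob("*.lock")):
        try:
            record = json.loads(lock_path.read_text())
        except (OSError, ValueError):
            continue  # unreadable or not JSON: ownership unknown, leave it
        if not isinstance(record, dict):
            continue
        # Assumed (hypothetical) fields recorded by the lock's creator.
        owned_by_pid = record.get("pid") == pid
        owned_by_profile = record.get("hermes_home") == hermes_home
        if owned_by_pid or owned_by_profile:
            try:
                lock_path.unlink()
            except FileNotFoundError:
                continue  # another process removed it first
            removed.append(lock_path)
    return removed
```

Locks whose ownership cannot be determined are left in place, so another profile's Telegram token lock survives a --replace run by this profile.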
Bug 3 (latent): KeepAlive SuccessfulExit=false + clean SIGTERM = permanent death
File: hermes_cli/gateway.py — generate_launchd_plist()
The generated plist uses:
<key>KeepAlive</key>
<dict>
<key>SuccessfulExit</key>
<false/>
</dict>
When the gateway receives SIGTERM (from bug 1 above, or any source), it shuts down cleanly and exits with code 0. launchd treats exit(0) as "successful" and does not restart the service. The profile stays dead until manually kickstarted.
This is latent (not a problem on its own) but becomes catastrophic when combined with bug 1 — any profile's gateway stop/update/restart permanently kills all other profiles.
Fix: Use <key>KeepAlive</key><true/> instead. A gateway daemon should always be restarted regardless of exit code.
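For illustration, a plist with an unconditional KeepAlive could be produced with Python's standard plistlib. This is a sketch only: the function name build_gateway_plist, its parameters, and the extra keys are assumptions, not the actual generate_launchd_plist() implementation.

```python
import plistlib
from typing import List


def build_gateway_plist(label: str, program_arguments: List[str], hermes_home: str) -> bytes:
    """Build a launchd plist whose job is restarted regardless of exit code."""
    plist = {
        "Label": label,
        "ProgramArguments": program_arguments,
        "EnvironmentVariables": {"HERMES_HOME": hermes_home},
        "RunAtLoad": True,
        # Boolean true instead of {"SuccessfulExit": False}: launchd then
        # restarts the job even after a clean exit(0).
        "KeepAlive": True,
    }
    return plistlib.dumps(plist)  # XML format by default
```

One consequence to weigh: with an unconditional KeepAlive, sending SIGTERM alone no longer stops the service, because launchd will start it again. An intentional stop would likely need to unload the job from launchd instead.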
Impact
Any multi-profile setup is fragile. A single hermes update on the default profile will silently and permanently kill all other profile gateways. The only recovery is manual launchctl kickstart.
Environment
- macOS Sequoia, launchd service management
- Multiple profiles under ~/.hermes/profiles/
- Hermes version: latest (installed via hermes update)