Bug Description
Hermes Version: 0.8.0
Environment: WSL2 Linux, systemd user service, HTTP proxy required for external access
When the agent running inside a Telegram gateway session executes a gateway restart, the gateway process dies and never comes back up. The restart mechanism does not properly use systemd and instead tries to manage the process directly, causing a PID race condition.
Steps to Reproduce
- Start gateway as systemd user service: systemctl --user start hermes-gateway
- Send a message to the bot via Telegram asking it to restart the gateway
- The agent executes a restart command (likely "hermes gateway restart" or "hermes gateway run --replace")
- Gateway stops and never recovers
Expected Behavior
Gateway restart triggered from any platform (Telegram, Discord, CLI) should use "systemctl --user restart hermes-gateway" to properly manage the lifecycle, ensuring:
- No PID race condition
- Service file customizations (like proxy Environment overrides) are preserved
- The gateway reliably comes back up after restart
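A minimal sketch of the requested behavior, not the actual Hermes code: pick the restart command based on whether the process was started by systemd. The helper names (running_under_systemd, restart_command, restart_gateway) are hypothetical. The check relies on the INVOCATION_ID environment variable that systemd sets for processes it starts, which is a heuristic and an assumption about how Hermes could detect the service context.

```python
import os
import subprocess

SERVICE_NAME = "hermes-gateway"  # unit name as used in this report


def running_under_systemd(env=None) -> bool:
    """Heuristic: systemd sets INVOCATION_ID for processes it launches."""
    env = os.environ if env is None else env
    return "INVOCATION_ID" in env


def restart_command(env=None) -> list[str]:
    """Return the command a restart request should run (hypothetical helper)."""
    if running_under_systemd(env):
        # Let systemd stop the old process and start the new one.
        return ["systemctl", "--user", "restart", SERVICE_NAME]
    # Outside systemd, fall back to the CLI restart named in the report.
    return ["hermes", "gateway", "restart"]


def restart_gateway(env=None) -> None:
    """Run the selected restart command and raise if it fails."""
    subprocess.run(restart_command(env), check=True)


print(restart_command({"INVOCATION_ID": "example"}))
print(restart_command({}))
```

Because the caller lives inside the unit being restarted, adding systemctl's --no-block option may be worth considering so the request is queued without waiting for the job to finish.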
Actual Behavior
The gateway shuts down and does not come back up. The replacement process exits with "Gateway already running (PID 18664)", and the service stays down until it is started manually with: systemctl --user start hermes-gateway
Affected Component
Gateway (Telegram/Discord/Slack/WhatsApp)
Messaging Platform (if gateway-related)
Telegram, Discord
Operating System
WSL2 Linux (systemd user service; HTTP proxy required for external access)
Python Version
3.11.9
Hermes Version
0.8.0
Relevant Logs / Traceback
[Event 1 - 21:14:45] Agent triggered restart from Telegram session:
gateway.platforms.telegram: Telegram button resolved 1 approval(s) for session agent:main:telegram:dm:5004002140
gateway.run: Stopping gateway...
gateway.run: ✓ discord disconnected
gateway.run: ✓ telegram disconnected
gateway.run: Gateway stopped
gateway.run: Cron ticker stopped
[After restart attempt]:
❌ Gateway already running (PID 18664).
Use 'hermes gateway restart' to replace it,
or 'hermes gateway stop' to kill it first.
Or use 'hermes gateway run --replace' to auto-replace.
[Event 2 - 22:16:18] Same issue repeated:
gateway.run: Stopping gateway...
gateway.run: ✓ telegram disconnected
gateway.run: Gateway stopped
(service exits, never restarts)
ROOT CAUSE ANALYSIS:
- When restart is triggered from within the gateway (via Telegram agent session), the agent runs a command like "hermes gateway restart" or "hermes gateway run --replace"
- This kills the current gateway process (which is also hosting the agent session)
- The restart command tries to start a new process using nohup/direct execution instead of systemd
- PID race condition: the new process detects the old PID still exists and exits
- The old process then exits too, leaving no gateway running
- When running as systemd service, "hermes gateway restart" regenerates the service file, which can also strip custom Environment overrides (e.g., HTTP proxy settings)
WORKAROUND:
After a failed restart, start the service manually: systemctl --user start hermes-gateway
A systemd override file must also be maintained at:
~/.config/systemd/user/hermes-gateway.service.d/override.conf
to preserve proxy environment variables across restarts.
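For reference, a sketch of what such an override file might contain, generated from Python so the content can be checked. The proxy URL is a placeholder, the helper names are hypothetical, and the variable names HTTP_PROXY and HTTPS_PROXY are an assumption about what the gateway reads.

```python
from pathlib import Path

# Drop-in location taken from this report.
DROPIN_PATH = Path.home() / ".config/systemd/user/hermes-gateway.service.d/override.conf"


def render_proxy_dropin(proxy_url: str) -> str:
    """Build the text of a systemd drop-in that sets proxy variables."""
    lines = [
        "[Service]",
        f'Environment="HTTP_PROXY={proxy_url}"',
        f'Environment="HTTPS_PROXY={proxy_url}"',
    ]
    return "\n".join(lines) + "\n"


def write_proxy_dropin(proxy_url: str, path: Path = DROPIN_PATH) -> None:
    """Write the drop-in, creating the .service.d directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_proxy_dropin(proxy_url))


# Placeholder proxy address; substitute the real one.
print(render_proxy_dropin("http://127.0.0.1:8080"))
```

After the file is written, systemctl --user daemon-reload is needed for systemd to pick up the change.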
Proposed Fix (optional)
No response
Are you willing to submit a PR for this?