Skip to content

fix: prevent agents from starting gateway outside systemd management#2617

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-1b7b3ffb
Mar 23, 2026
Merged

fix: prevent agents from starting gateway outside systemd management#2617
teknium1 merged 1 commit into
mainfrom
hermes/hermes-1b7b3ffb

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Problem

An agent session on Telegram was asked to restart the gateway for DNS recovery. It ran:

kill 1605 && cd ~/.hermes/hermes-agent && source venv/bin/activate && python -m hermes_cli.main gateway run --replace &disown

This killed the systemd-managed gateway process and started a replacement with &disown, completely outside systemd's management. The systemd service saw a clean exit (code 0) and with Restart=on-failure, didn't restart. The orphaned gateway ran for ~7 hours until it received SIGTERM, at which point nothing restarted it.

Root Causes

  1. Restart=on-failure in systemd service — clean SIGTERM shutdown exits with code 0, which isn't a 'failure', so systemd never restarts
  2. Agent started gateway with &disown — took it out of systemd management entirely

Fixes

Code changes (this PR)

  • Add dangerous command patterns to tools/approval.py detecting:
    • gateway run with &, disown, or setsid (backgrounding)
    • nohup ... gateway run (detaching from terminal)
  • When detected, the approval message tells the agent to use systemctl --user restart hermes-gateway instead
  • 6 new tests covering all variants

Already applied directly (not in this PR)

  • Fixed systemd service: Restart=on-failureRestart=always, RestartSec=10
  • Applied the approval.py patterns to main repo immediately
  • Restarted gateway via systemd — Telegram, WhatsApp, API server all reconnected

Test plan

  • python -m pytest tests/tools/test_approval.py -n0 -q — 74 passed
  • Full suite: 5948 passed, 0 failures

An agent session killed the systemd-managed gateway (PID 1605) and restarted
it with '&disown', taking it outside systemd's Restart= management. When the
orphaned process later received SIGTERM, nothing restarted it.

Add dangerous command patterns to detect:
- 'gateway run' with & (background), disown, nohup, or setsid
- These should use 'systemctl --user restart hermes-gateway' instead

Also applied directly to main repo and fixed the systemd service:
- Changed Restart=on-failure to Restart=always (clean SIGTERM = exit 0 = not
  a 'failure', so on-failure never triggered)
- RestartSec=10 for reasonable restart delay
@teknium1 teknium1 merged commit 93dc5de into main Mar 23, 2026
1 check passed
InB4DevOps pushed a commit to InB4DevOps/hermes-agent that referenced this pull request Mar 25, 2026
…ousResearch#2617)

An agent session killed the systemd-managed gateway (PID 1605) and restarted
it with '&disown', taking it outside systemd's Restart= management. When the
orphaned process later received SIGTERM, nothing restarted it.

Add dangerous command patterns to detect:
- 'gateway run' with & (background), disown, nohup, or setsid
- These should use 'systemctl --user restart hermes-gateway' instead

Also applied directly to main repo and fixed the systemd service:
- Changed Restart=on-failure to Restart=always (clean SIGTERM = exit 0 = not
  a 'failure', so on-failure never triggered)
- RestartSec=10 for reasonable restart delay
outsourc-e pushed a commit to outsourc-e/hermes-agent that referenced this pull request Mar 26, 2026
…ousResearch#2617)

An agent session killed the systemd-managed gateway (PID 1605) and restarted
it with '&disown', taking it outside systemd's Restart= management. When the
orphaned process later received SIGTERM, nothing restarted it.

Add dangerous command patterns to detect:
- 'gateway run' with & (background), disown, nohup, or setsid
- These should use 'systemctl --user restart hermes-gateway' instead

Also applied directly to main repo and fixed the systemd service:
- Changed Restart=on-failure to Restart=always (clean SIGTERM = exit 0 = not
  a 'failure', so on-failure never triggered)
- RestartSec=10 for reasonable restart delay
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…ousResearch#2617)

An agent session killed the systemd-managed gateway (PID 1605) and restarted
it with '&disown', taking it outside systemd's Restart= management. When the
orphaned process later received SIGTERM, nothing restarted it.

Add dangerous command patterns to detect:
- 'gateway run' with & (background), disown, nohup, or setsid
- These should use 'systemctl --user restart hermes-gateway' instead

Also applied directly to main repo and fixed the systemd service:
- Changed Restart=on-failure to Restart=always (clean SIGTERM = exit 0 = not
  a 'failure', so on-failure never triggered)
- RestartSec=10 for reasonable restart delay
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…ousResearch#2617)

An agent session killed the systemd-managed gateway (PID 1605) and restarted
it with '&disown', taking it outside systemd's Restart= management. When the
orphaned process later received SIGTERM, nothing restarted it.

Add dangerous command patterns to detect:
- 'gateway run' with & (background), disown, nohup, or setsid
- These should use 'systemctl --user restart hermes-gateway' instead

Also applied directly to main repo and fixed the systemd service:
- Changed Restart=on-failure to Restart=always (clean SIGTERM = exit 0 = not
  a 'failure', so on-failure never triggered)
- RestartSec=10 for reasonable restart delay
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…ousResearch#2617)

An agent session killed the systemd-managed gateway (PID 1605) and restarted
it with '&disown', taking it outside systemd's Restart= management. When the
orphaned process later received SIGTERM, nothing restarted it.

Add dangerous command patterns to detect:
- 'gateway run' with & (background), disown, nohup, or setsid
- These should use 'systemctl --user restart hermes-gateway' instead

Also applied directly to main repo and fixed the systemd service:
- Changed Restart=on-failure to Restart=always (clean SIGTERM = exit 0 = not
  a 'failure', so on-failure never triggered)
- RestartSec=10 for reasonable restart delay
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…ousResearch#2617)

An agent session killed the systemd-managed gateway (PID 1605) and restarted
it with '&disown', taking it outside systemd's Restart= management. When the
orphaned process later received SIGTERM, nothing restarted it.

Add dangerous command patterns to detect:
- 'gateway run' with & (background), disown, nohup, or setsid
- These should use 'systemctl --user restart hermes-gateway' instead

Also applied directly to main repo and fixed the systemd service:
- Changed Restart=on-failure to Restart=always (clean SIGTERM = exit 0 = not
  a 'failure', so on-failure never triggered)
- RestartSec=10 for reasonable restart delay
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…ousResearch#2617)

An agent session killed the systemd-managed gateway (PID 1605) and restarted
it with '&disown', taking it outside systemd's Restart= management. When the
orphaned process later received SIGTERM, nothing restarted it.

Add dangerous command patterns to detect:
- 'gateway run' with & (background), disown, nohup, or setsid
- These should use 'systemctl --user restart hermes-gateway' instead

Also applied directly to main repo and fixed the systemd service:
- Changed Restart=on-failure to Restart=always (clean SIGTERM = exit 0 = not
  a 'failure', so on-failure never triggered)
- RestartSec=10 for reasonable restart delay
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant