Skip to content

test(conftest): plug every gateway-kill leak path#23486

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-03fe30f4
May 11, 2026
Merged

test(conftest): plug every gateway-kill leak path#23486
teknium1 merged 1 commit into
mainfrom
hermes/hermes-03fe30f4

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

Tests can no longer kill the developer's live hermes-gateway. PR #23397's _live_system_guard blocked os.kill / os.killpg and a narrow subprocess regex; the gateway was still being SIGTERMed today because the guard had structural holes (asyncio subprocesses, os.system/os.popen, pty.spawn, subprocess.getoutput/getstatusoutput, bash -c "systemctl …", sudo systemctl …, pkill -f hermes). All plugged.

Changes

  • tests/conftest.py — guard now wraps every kill primitive in the audit:
    • os.system, os.popen
    • subprocess.getoutput, subprocess.getstatusoutput
    • pty.spawn
    • asyncio.create_subprocess_exec, asyncio.create_subprocess_shell
    • Subprocess command inspection looks at the whole command string, not just tokens[0]. Catches sudo systemctl …, env systemctl …, bash -c "systemctl …", setsid systemctl …, /usr/bin/systemctl …, string shell=True form.
    • New process-killer block: pkill / killall / taskkill / fuser / skill targeting hermes/python patterns.
    • os.kill(0, …) allowed (own group); os.kill(-1, …) refused (every-process broadcast).
    • subprocess.Popen wrapper preserves __class_getitem__ so third-party packages that annotate Popen[bytes] still import (mcp package was breaking).
  • tests/test_live_system_guard_self_test.py (NEW) — 34 self-tests that exercise every primitive against a guaranteed-foreign PID and assert the guard fires. Adding a new kill primitive without updating the guard fails CI.
  • scripts/run_tests.sh — force-loads ~/.hermes/pytest_live_guard.py when present (developer-machine convenience). Lets stale worktrees that predate this PR get the protection through the canonical test wrapper. No-op on CI / fresh machines.

Validation

Before After
Self-test (test_live_system_guard_self_test.py) n/a 34/34 pass
Subprocess regression (test_gateway, test_update_gateway_restart, test_status, test_gateway_shutdown, tests/tools/) 5013 pass + 5 flakes (3 pre-existing on main, 2 xdist ordering) identical
Tests with local monkeypatch.setattr("os.kill", …) (test_update_stale_dashboard, test_mcp_stability, test_local_background_child_hang, test_pty_bridge, test_kanban_db, test_update_gateway_restart) n/a 137/137 pass
mcp package import (was broken by Popen wrapper) broken works
Live hermes-gateway survived all test runs killed 5+ times in 3 days up 36+ minutes

Pre-existing failures NOT touched by this PR (already broken on main):

Companion (not in this PR)

A user-level pytest plugin lives at ~/.hermes/pytest_live_guard.py on developer machines. It's a parallel implementation that's loaded by every pytest invocation (regardless of which worktree is checked out) via:

  • ~/.config/environment.d/50-hermes-live-guard.conf (systemd-user env)
  • BASH_ENV=~/.hermes/.bash_env_test_guard.sh (non-interactive bash subshells)
  • The scripts/run_tests.sh change in this PR

This belt-and-suspenders setup means even stale worktrees from before PR #23397 are protected on this developer's machine. The plugin code is intentionally not in the repo (it's a dev-machine concern, not a CI one).

The existing _live_system_guard (PR #23397) blocked os.kill / os.killpg
and a narrow subset of subprocess invocations. Tests still SIGTERMed the
live gateway today (May 10) because the guard had structural holes.

Plug them all:
- subprocess: also wrap getoutput, getstatusoutput
- os.system, os.popen - completely unwrapped before
- pty.spawn - completely unwrapped before
- asyncio.create_subprocess_exec / create_subprocess_shell - bypassed
  the subprocess module entirely; now wrapped
- Subprocess command inspection now looks at the WHOLE command string,
  not just tokens[0]. Catches sudo systemctl, env systemctl, bash -c
  'systemctl', setsid systemctl, /usr/bin/systemctl, etc.
- New process-killer block: pkill / killall / taskkill / fuser
  targeting hermes/python patterns is now refused
- os.kill PID 0 (own group) allowed; PID -1 (every process we can
  signal) refused
- subprocess.Popen wrapper preserves __class_getitem__ so third-party
  packages that use Popen[bytes] as a type annotation still import

Coverage is locked in by tests/test_live_system_guard_self_test.py -
exercises every primitive against a guaranteed-foreign PID and asserts
the guard fires. Adding a new kill primitive without updating the guard
breaks CI.

scripts/run_tests.sh now also force-loads ~/.hermes/pytest_live_guard.py
when present (developer-machine convenience), so even worktrees that
predate this commit get the protection on subsequent test runs through
the canonical wrapper.
@github-actions

Copy link
Copy Markdown
Contributor

🔎 Lint report: hermes/hermes-03fe30f4 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8123 on HEAD, 8121 on base (🆕 +2)

🆕 New issues (5):

Rule Count
invalid-argument-type 3
unresolved-import 1
unresolved-attribute 1
First entries
run_agent.py:7160: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
tests/test_live_system_guard_self_test.py:24: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/conftest.py:869: [unresolved-attribute] unresolved-attribute: Unresolved attribute `__class_getitem__` on type `def _guarded(cmd, *args, **kwargs) -> Unknown`
run_agent.py:13284: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:13287: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`

✅ Fixed issues (3):

Rule Count
invalid-argument-type 3
First entries
run_agent.py:7160: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:13284: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:13287: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown, Unknown] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`

Unchanged: 4263 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@teknium1 teknium1 merged commit 771b8c4 into main May 11, 2026
15 of 18 checks passed
@teknium1 teknium1 deleted the hermes/hermes-03fe30f4 branch May 11, 2026 01:55
@alt-glitch alt-glitch added type/test Test coverage or test infrastructure P3 Low — cosmetic, nice to have comp/agent Core agent loop, run_agent.py, prompt builder labels May 11, 2026
JZKK720 pushed a commit to JZKK720/hermes-agent that referenced this pull request May 11, 2026
The existing _live_system_guard (PR NousResearch#23397) blocked os.kill / os.killpg
and a narrow subset of subprocess invocations. Tests still SIGTERMed the
live gateway today (May 10) because the guard had structural holes.

Plug them all:
- subprocess: also wrap getoutput, getstatusoutput
- os.system, os.popen - completely unwrapped before
- pty.spawn - completely unwrapped before
- asyncio.create_subprocess_exec / create_subprocess_shell - bypassed
  the subprocess module entirely; now wrapped
- Subprocess command inspection now looks at the WHOLE command string,
  not just tokens[0]. Catches sudo systemctl, env systemctl, bash -c
  'systemctl', setsid systemctl, /usr/bin/systemctl, etc.
- New process-killer block: pkill / killall / taskkill / fuser
  targeting hermes/python patterns is now refused
- os.kill PID 0 (own group) allowed; PID -1 (every process we can
  signal) refused
- subprocess.Popen wrapper preserves __class_getitem__ so third-party
  packages that use Popen[bytes] as a type annotation still import

Coverage is locked in by tests/test_live_system_guard_self_test.py -
exercises every primitive against a guaranteed-foreign PID and asserts
the guard fires. Adding a new kill primitive without updating the guard
breaks CI.

scripts/run_tests.sh now also force-loads ~/.hermes/pytest_live_guard.py
when present (developer-machine convenience), so even worktrees that
predate this commit get the protection on subsequent test runs through
the canonical wrapper.
rmulligan pushed a commit to rmulligan/hermes-agent that referenced this pull request May 11, 2026
The existing _live_system_guard (PR NousResearch#23397) blocked os.kill / os.killpg
and a narrow subset of subprocess invocations. Tests still SIGTERMed the
live gateway today (May 10) because the guard had structural holes.

Plug them all:
- subprocess: also wrap getoutput, getstatusoutput
- os.system, os.popen - completely unwrapped before
- pty.spawn - completely unwrapped before
- asyncio.create_subprocess_exec / create_subprocess_shell - bypassed
  the subprocess module entirely; now wrapped
- Subprocess command inspection now looks at the WHOLE command string,
  not just tokens[0]. Catches sudo systemctl, env systemctl, bash -c
  'systemctl', setsid systemctl, /usr/bin/systemctl, etc.
- New process-killer block: pkill / killall / taskkill / fuser
  targeting hermes/python patterns is now refused
- os.kill PID 0 (own group) allowed; PID -1 (every process we can
  signal) refused
- subprocess.Popen wrapper preserves __class_getitem__ so third-party
  packages that use Popen[bytes] as a type annotation still import

Coverage is locked in by tests/test_live_system_guard_self_test.py -
exercises every primitive against a guaranteed-foreign PID and asserts
the guard fires. Adding a new kill primitive without updating the guard
breaks CI.

scripts/run_tests.sh now also force-loads ~/.hermes/pytest_live_guard.py
when present (developer-machine convenience), so even worktrees that
predate this commit get the protection on subsequent test runs through
the canonical wrapper.
JinyuID pushed a commit to JinyuID/hermes-agent that referenced this pull request May 11, 2026
The existing _live_system_guard (PR NousResearch#23397) blocked os.kill / os.killpg
and a narrow subset of subprocess invocations. Tests still SIGTERMed the
live gateway today (May 10) because the guard had structural holes.

Plug them all:
- subprocess: also wrap getoutput, getstatusoutput
- os.system, os.popen - completely unwrapped before
- pty.spawn - completely unwrapped before
- asyncio.create_subprocess_exec / create_subprocess_shell - bypassed
  the subprocess module entirely; now wrapped
- Subprocess command inspection now looks at the WHOLE command string,
  not just tokens[0]. Catches sudo systemctl, env systemctl, bash -c
  'systemctl', setsid systemctl, /usr/bin/systemctl, etc.
- New process-killer block: pkill / killall / taskkill / fuser
  targeting hermes/python patterns is now refused
- os.kill PID 0 (own group) allowed; PID -1 (every process we can
  signal) refused
- subprocess.Popen wrapper preserves __class_getitem__ so third-party
  packages that use Popen[bytes] as a type annotation still import

Coverage is locked in by tests/test_live_system_guard_self_test.py -
exercises every primitive against a guaranteed-foreign PID and asserts
the guard fires. Adding a new kill primitive without updating the guard
breaks CI.

scripts/run_tests.sh now also force-loads ~/.hermes/pytest_live_guard.py
when present (developer-machine convenience), so even worktrees that
predate this commit get the protection on subsequent test runs through
the canonical wrapper.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
The existing _live_system_guard (PR NousResearch#23397) blocked os.kill / os.killpg
and a narrow subset of subprocess invocations. Tests still SIGTERMed the
live gateway today (May 10) because the guard had structural holes.

Plug them all:
- subprocess: also wrap getoutput, getstatusoutput
- os.system, os.popen - completely unwrapped before
- pty.spawn - completely unwrapped before
- asyncio.create_subprocess_exec / create_subprocess_shell - bypassed
  the subprocess module entirely; now wrapped
- Subprocess command inspection now looks at the WHOLE command string,
  not just tokens[0]. Catches sudo systemctl, env systemctl, bash -c
  'systemctl', setsid systemctl, /usr/bin/systemctl, etc.
- New process-killer block: pkill / killall / taskkill / fuser
  targeting hermes/python patterns is now refused
- os.kill PID 0 (own group) allowed; PID -1 (every process we can
  signal) refused
- subprocess.Popen wrapper preserves __class_getitem__ so third-party
  packages that use Popen[bytes] as a type annotation still import

Coverage is locked in by tests/test_live_system_guard_self_test.py -
exercises every primitive against a guaranteed-foreign PID and asserts
the guard fires. Adding a new kill primitive without updating the guard
breaks CI.

scripts/run_tests.sh now also force-loads ~/.hermes/pytest_live_guard.py
when present (developer-machine convenience), so even worktrees that
predate this commit get the protection on subsequent test runs through
the canonical wrapper.
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
The existing _live_system_guard (PR NousResearch#23397) blocked os.kill / os.killpg
and a narrow subset of subprocess invocations. Tests still SIGTERMed the
live gateway today (May 10) because the guard had structural holes.

Plug them all:
- subprocess: also wrap getoutput, getstatusoutput
- os.system, os.popen - completely unwrapped before
- pty.spawn - completely unwrapped before
- asyncio.create_subprocess_exec / create_subprocess_shell - bypassed
  the subprocess module entirely; now wrapped
- Subprocess command inspection now looks at the WHOLE command string,
  not just tokens[0]. Catches sudo systemctl, env systemctl, bash -c
  'systemctl', setsid systemctl, /usr/bin/systemctl, etc.
- New process-killer block: pkill / killall / taskkill / fuser
  targeting hermes/python patterns is now refused
- os.kill PID 0 (own group) allowed; PID -1 (every process we can
  signal) refused
- subprocess.Popen wrapper preserves __class_getitem__ so third-party
  packages that use Popen[bytes] as a type annotation still import

Coverage is locked in by tests/test_live_system_guard_self_test.py -
exercises every primitive against a guaranteed-foreign PID and asserts
the guard fires. Adding a new kill primitive without updating the guard
breaks CI.

scripts/run_tests.sh now also force-loads ~/.hermes/pytest_live_guard.py
when present (developer-machine convenience), so even worktrees that
predate this commit get the protection on subsequent test runs through
the canonical wrapper.
AlexFoxD pushed a commit to AlexFoxD/hermes-agent that referenced this pull request May 21, 2026
The existing _live_system_guard (PR NousResearch#23397) blocked os.kill / os.killpg
and a narrow subset of subprocess invocations. Tests still SIGTERMed the
live gateway today (May 10) because the guard had structural holes.

Plug them all:
- subprocess: also wrap getoutput, getstatusoutput
- os.system, os.popen - completely unwrapped before
- pty.spawn - completely unwrapped before
- asyncio.create_subprocess_exec / create_subprocess_shell - bypassed
  the subprocess module entirely; now wrapped
- Subprocess command inspection now looks at the WHOLE command string,
  not just tokens[0]. Catches sudo systemctl, env systemctl, bash -c
  'systemctl', setsid systemctl, /usr/bin/systemctl, etc.
- New process-killer block: pkill / killall / taskkill / fuser
  targeting hermes/python patterns is now refused
- os.kill PID 0 (own group) allowed; PID -1 (every process we can
  signal) refused
- subprocess.Popen wrapper preserves __class_getitem__ so third-party
  packages that use Popen[bytes] as a type annotation still import

Coverage is locked in by tests/test_live_system_guard_self_test.py -
exercises every primitive against a guaranteed-foreign PID and asserts
the guard fires. Adding a new kill primitive without updating the guard
breaks CI.

scripts/run_tests.sh now also force-loads ~/.hermes/pytest_live_guard.py
when present (developer-machine convenience), so even worktrees that
predate this commit get the protection on subsequent test runs through
the canonical wrapper.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
The existing _live_system_guard (PR NousResearch#23397) blocked os.kill / os.killpg
and a narrow subset of subprocess invocations. Tests still SIGTERMed the
live gateway today (May 10) because the guard had structural holes.

Plug them all:
- subprocess: also wrap getoutput, getstatusoutput
- os.system, os.popen - completely unwrapped before
- pty.spawn - completely unwrapped before
- asyncio.create_subprocess_exec / create_subprocess_shell - bypassed
  the subprocess module entirely; now wrapped
- Subprocess command inspection now looks at the WHOLE command string,
  not just tokens[0]. Catches sudo systemctl, env systemctl, bash -c
  'systemctl', setsid systemctl, /usr/bin/systemctl, etc.
- New process-killer block: pkill / killall / taskkill / fuser
  targeting hermes/python patterns is now refused
- os.kill PID 0 (own group) allowed; PID -1 (every process we can
  signal) refused
- subprocess.Popen wrapper preserves __class_getitem__ so third-party
  packages that use Popen[bytes] as a type annotation still import

Coverage is locked in by tests/test_live_system_guard_self_test.py -
exercises every primitive against a guaranteed-foreign PID and asserts
the guard fires. Adding a new kill primitive without updating the guard
breaks CI.

scripts/run_tests.sh now also force-loads ~/.hermes/pytest_live_guard.py
when present (developer-machine convenience), so even worktrees that
predate this commit get the protection on subsequent test runs through
the canonical wrapper.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P3 Low — cosmetic, nice to have type/test Test coverage or test infrastructure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants