
feat(egress): host hardening survey — follow-up to #35149 #4

Closed
Bartok9 wants to merge 1 commit into feat/iron-proxy-doctor-audit-anthropic from feat/iron-proxy-harden-survey

Bartok9 commented May 30, 2026

Why this complements NousResearch#30179 and NousResearch#35149

NousResearch#30179 introduced the iron-proxy sandbox-egress layer (swap opaque proxy tokens for real upstream creds at the network boundary, so a prompt-injected agent never sees real keys). NousResearch#35149 added hermes egress doctor to answer "is the egress proxy itself healthy?".

This PR adds the missing half of the picture: host-perimeter hardening. hermes egress harden answers "is the machine the proxy runs on locked down?" — firewall, SSH config, fail2ban, mesh-VPN, Docker seccomp — and folds the two iron-proxy runtime signals in so an operator sees the whole defense-in-depth stack in one table.

The two layers solve different threats and neither substitutes for the other:

  • A firewall does nothing against a prompt-injected agent that already runs inside your sandbox and POSTs OPENAI_API_KEY to an attacker. iron-proxy stops that.
  • The egress proxy does nothing about an open SSH port with password auth. Perimeter hardening stops that.

Stack: NousResearch#30179 → NousResearch#35149 → this PR. Parallel to and independent of the CA-rotation sibling: no file overlap.

What lands

  • New module agent/proxy_sources/host_hardening.py — stdlib-only, no side effects, every probe graceful (missing binary / non-Linux host → skip, never fail). Reuses get_status() from iron_proxy.py for signals 8 + 9 (the only touch to the existing proxy code is a from import).
  • New CLI subcommand hermes egress harden (one sub-parser entry + one handler cmd_harden in hermes_cli/proxy_cli.py):
    • --baseline {minimal,catalin,paranoid} (default minimal)
    • --json → {"signals":[{name,status,detail,fix}],"baseline":str,"satisfied":bool,"missing":[str]}
    • --all → show passing signals too (default: gaps-only)
    • Always exits 0 — informational, never gates anything.
  • 25 hermetic tests in tests/test_iron_proxy_harden.py (mirrors test_iron_proxy_doctor.py style).
  • New docs page website/docs/user-guide/egress/hardening-baselines.md + a "Host hardening" section in iron-proxy.md and a harden block in cli-commands.md.
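
As a rough illustration of the CLI surface listed above, the following is a minimal sketch rather than the code in hermes_cli/proxy_cli.py. The flag names, the default baseline, the `as_json` / `show_all` destinations, and the JSON keys come from this PR; the program name and the helper names `build_parser` and `to_json_payload` are hypothetical.

```python
import argparse
import json
from typing import Dict, List


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real "hermes egress" parser tree.
    parser = argparse.ArgumentParser(prog="hermes-egress-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    harden = sub.add_parser("harden", help="read-only host hardening survey")
    harden.add_argument(
        "--baseline",
        choices=["minimal", "catalin", "paranoid"],
        default="minimal",
    )
    # dest names mirror the Namespace used in the PR's tests.
    harden.add_argument("--json", dest="as_json", action="store_true")
    harden.add_argument("--all", dest="show_all", action="store_true")
    return parser


def to_json_payload(
    signals: List[Dict[str, object]],
    baseline: str,
    satisfied: bool,
    missing: List[str],
) -> str:
    # Key names follow the --json shape quoted in the PR description.
    return json.dumps(
        {
            "signals": signals,
            "baseline": baseline,
            "satisfied": satisfied,
            "missing": missing,
        }
    )
```

Parsing `["harden"]` with this sketch yields `baseline="minimal"` with both boolean flags off, which matches the defaults described above.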

New surfaces

| Surface | Kind | Notes |
|---|---|---|
| host_hardening.HardeningSignal | dataclass | name / status / fix (Optional) + detail; to_dict() for --json |
| host_hardening.survey_host(*, baseline="minimal") | function | returns list[HardeningSignal] (always probes all 10) |
| host_hardening.baseline_status(signals, baseline) | function | returns (satisfied, missing); pure evaluation |
| host_hardening.BASELINES / BASELINE_NAMES / SIGNAL_NAMES | constants | for CLI choices + tests |
| hermes egress harden | CLI | --baseline / --json / --all, exit 0 |
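
The HardeningSignal record could look roughly like the sketch below. The field names and the positional order (name, status, detail, fix) follow the constructor calls visible in the Bugbot diff; the exact status strings and the use of `dataclasses.asdict` are assumptions, not confirmed details of host_hardening.py.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Assumed status values; the review comments mention "fail", "warn" and "skip".
PASS, FAIL, WARN, SKIP = "pass", "fail", "warn", "skip"


@dataclass
class HardeningSignal:
    name: str                  # e.g. "ufw"
    status: str                # one of PASS / FAIL / WARN / SKIP
    detail: str = ""           # human-readable explanation
    fix: Optional[str] = None  # optional remediation hint

    def to_dict(self) -> dict:
        # Plain dict so the record can be serialized for --json output.
        return asdict(self)


signal = HardeningSignal("ufw", FAIL, "ufw installed but inactive", "sudo ufw enable")
print(signal.to_dict())
```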

The 10 signals

| # | Signal | Probe |
|---|---|---|
| 1 | tailscale | tailscale status --json → BackendState=Running |
| 2 | ufw | ufw status verbose → active + default-deny incoming |
| 3 | firewalld | firewall-cmd --state → running |
| 4 | nftables | nft list ruleset → non-empty |
| 5 | fail2ban | fail2ban-client status → ≥1 jail |
| 6 | ssh-password-auth | ^PasswordAuthentication no in sshd_config |
| 7 | ssh-root-login | ^PermitRootLogin (no\|prohibit-password) |
| 8 | iron-proxy-enabled | reuse get_status() |
| 9 | iron-proxy-running | reuse get_status() (pid alive + listening) |
| 10 | docker-seccomp | docker info SecurityOptions includes seccomp |
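
To make one row of the table concrete, here is a hedged sketch of how the tailscale output might be classified. The `BackendState` key and the "non-JSON → warn" rule come from this PR; the function name `classify_tailscale` and the choice to report any other state as a failure are assumptions.

```python
import json


def classify_tailscale(stdout: str) -> str:
    """Map `tailscale status --json` output to a status string (sketch)."""
    try:
        data = json.loads(stdout)
    except json.JSONDecodeError:
        return "warn"  # non-JSON or truncated output
    if not isinstance(data, dict):
        return "warn"  # valid JSON, but not the expected object
    # Assumption: anything other than Running counts as a failed signal.
    return "pass" if data.get("BackendState") == "Running" else "fail"
```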

Baselines (color the summary line only — never force a fail):

  • minimal — any one firewall (ufw/firewalld/nftables/tailscale) + ssh-password-auth + iron-proxy-enabled
  • catalin — tailscale + ufw + fail2ban + ssh-password-auth + iron-proxy-enabled
  • paranoid — all 10
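
The baseline evaluation, including the "any one firewall" OR-group, can be sketched as follows. This is an illustration under simplifying assumptions: it takes a name-to-status mapping instead of a list of HardeningSignal objects, and `evaluate_baseline` is a hypothetical name, not the PR's `baseline_status`.

```python
from typing import Dict, List, Tuple, Union

# A requirement is either one signal name or a tuple acting as an OR-group.
Requirement = Union[str, Tuple[str, ...]]

SIGNAL_NAMES = [
    "tailscale", "ufw", "firewalld", "nftables", "fail2ban",
    "ssh-password-auth", "ssh-root-login",
    "iron-proxy-enabled", "iron-proxy-running", "docker-seccomp",
]

BASELINES: Dict[str, List[Requirement]] = {
    "minimal": [
        ("ufw", "firewalld", "nftables", "tailscale"),  # any one firewall
        "ssh-password-auth",
        "iron-proxy-enabled",
    ],
    "catalin": [
        "tailscale", "ufw", "fail2ban",
        "ssh-password-auth", "iron-proxy-enabled",
    ],
    "paranoid": list(SIGNAL_NAMES),  # all 10
}


def evaluate_baseline(statuses: Dict[str, str], baseline: str) -> Tuple[bool, List[str]]:
    missing: List[str] = []
    for req in BASELINES[baseline]:
        members = (req,) if isinstance(req, str) else req
        if not any(statuses.get(m) == "pass" for m in members):
            # For an unmet OR-group, report every acceptable member.
            missing.extend(members)
    return (not missing, missing)
```

With only nftables, ssh-password-auth and iron-proxy-enabled passing, the minimal baseline is satisfied while catalin reports tailscale, ufw and fail2ban as missing.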

Failure modes considered

  • Hung firewall/VPN binary — every probe runs with a 4s timeout via a shared _run() helper; a hang → skip, never a wedged survey.
  • Non-Linux host: ufw / firewalld / nftables / fail2ban skip cleanly on macOS; Tailscale / SSH-config / iron-proxy / Docker still run.
  • Missing /etc/ssh/sshd_config — both SSH signals skip (not fail); many dev boxes / minimal containers don't run sshd.
  • get_status() raising — wrapped; the iron-proxy signals degrade to skip, never crash the survey.
  • Non-JSON / partial command output: tailscale non-JSON → warn; nft/docker non-zero exit → warn with an actionable hint.
  • Last-directive-wins — sshd_config parsing scans all matches and takes the last effective directive.
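
The timeout behavior in the first bullet could be implemented along these lines. This is a sketch, not the PR's `_run()` helper: the name `run_probe`, the `None` return for "could not run", and the exact exceptions caught are assumptions. The 4-second default and the (rc, stdout, stderr) tuple shape follow the PR text and its test fixtures.

```python
import subprocess
import sys
from typing import List, Optional, Tuple


def run_probe(cmd: List[str], timeout: float = 4.0) -> Optional[Tuple[int, str, str]]:
    """Run a probe command; return (rc, stdout, stderr), or None if it
    could not be run or did not finish in time (caller maps None to skip)."""
    try:
        proc = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,  # a non-zero exit is data, not an error
        )
    except (OSError, subprocess.TimeoutExpired):
        # OSError covers a missing binary or a permission problem.
        return None
    return proc.returncode, proc.stdout, proc.stderr


# Usage: a trivial command run through the current Python interpreter.
print(run_probe([sys.executable, "-c", "print('ok')"]))
```

Returning `None` instead of raising keeps each probe's skip decision in one place, which is one way to ensure a hung binary never wedges the whole survey.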

Validation

pytest tests/test_iron_proxy*.py -v
| | Before (NousResearch#35149 branch) | After (this PR) |
|---|---|---|
| iron-proxy suite total | 155 passed, 1 skipped | 180 passed, 1 skipped |
| new test_iron_proxy_harden.py | | 25 passed |
| existing iron-proxy tests | 155 passed | 155 still pass (unchanged) |

Real survey on a macOS dev host (hermes egress harden --json) runs clean, exit 0, with Linux-only signals degrading to skip — confirms graceful cross-platform behavior.

Coverage gaps

  • The iron-proxy enabled signal treats a generated CA + proxy.yaml as "enabled" because get_status() doesn't read config.yaml's proxy.enabled. An explicit enabled=True also passes. Documented inline.
  • nft list ruleset without root returns a non-zero exit → warn (we can't distinguish "no ruleset" from "needs root"); the hint says to re-run with sudo.

Ambiguity flags

  1. "iron-proxy enabled" definition: get_status() exposes .enabled (dataclass default False, not populated from config) and .configured (CA + proxy.yaml present). I treated enabled OR configured as PASS so the signal is useful without a full config load. If you want strict config.yaml: proxy.enabled semantics, that's a one-line change.
  2. harden as its own command tree vs. a doctor check — per the brief I kept it a separate command (hermes egress harden), NOT a new doctor check, since it surveys the host rather than the proxy. They're complementary.
  3. Baseline OR-groups — "any firewall" is modeled as an OR-group; when none pass, the missing list reports all acceptable members so the operator sees every valid fix. Open to reporting just a representative one if that's noisy.
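
The first flag above, combined with the "get_status() raising" failure mode, can be sketched like this. `ProxyStatus` is a hypothetical stand-in for whatever iron_proxy.get_status() returns, and reporting "fail" when neither attribute is set is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProxyStatus:
    # Hypothetical stand-in exposing the two attributes named in the PR.
    enabled: bool = False
    configured: bool = False


def iron_proxy_enabled_status(get_status: Callable[[], ProxyStatus]) -> str:
    try:
        status = get_status()
    except Exception:
        return "skip"  # a failing status call degrades to skip, never crashes
    # enabled OR configured counts as PASS, per ambiguity flag 1.
    return "pass" if (status.enabled or status.configured) else "fail"
```

Switching to strict config.yaml semantics would, in this sketch, mean dropping `status.configured` from the final expression.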

Cut from scope (known follow-ups)

Documented in hardening-baselines.md under "Future enhancements":

  • Hetzner-specific cloud-firewall detection
  • Cloudflare edge / CIDR DNS caching (not host-detectable from inside the box)
  • legacy iptables -L baseline
  • nftables-vs-iptables differentiation
  • Docker user-namespace remapping (userns-remap)

Attribution

Opened by Bartok at Daniel's request, in response to Teknium's invitation to @catalinmpit for a security review of the egress proxy (NousResearch#30179). Teknium's prompt was essentially "can we get a security review of how Hermes' egress proxy holds up in a real deployment?" — and Catalin's public deployment described the shape this PR encodes as the catalin baseline:

"I've deployed my Hermes agent on a Hetzner VPS — locked behind Tailscale, UFW default-deny, Cloudflare in front, and fail2ban watching SSH. The only thing exposed is what I explicitly allow." — @catalinmpit

This PR turns that ad-hoc perimeter into a first-class, surveyable baseline that composes with the existing sandbox-egress isolation.


Note

Low Risk
Read-only probes and an informational CLI that always exit 0; no changes to auth, egress enforcement, or proxy startup gating.

Overview
Adds hermes egress harden, a read-only host security survey that complements hermes egress doctor: it probes perimeter controls (Tailscale, UFW/firewalld/nftables, fail2ban, SSH settings, Docker seccomp) and two iron-proxy runtime checks via get_status(), then reports pass/fail/warn/skip with optional remediation hints.

New agent/proxy_sources/host_hardening.py implements ten stdlib-only probes (4s timeouts, missing tools → skip), survey_host(), and baseline evaluation with OR-groups for “any firewall” on minimal, catalin, and paranoid. hermes_cli/proxy_cli.py wires the subcommand with --baseline, --json, and --all (gaps-only by default); the command always exits 0 (informational, does not gate deploys).

25 hermetic tests in tests/test_iron_proxy_harden.py and docs (hardening-baselines.md, CLI reference, iron-proxy cross-link) document signals and baselines.

Reviewed by Cursor Bugbot for commit 1b921b1. Bugbot is set up for automated code reviews on this repo. Configure here.

Add `hermes egress harden`: a read-only host-hardening survey that detects
10 perimeter-security signals (Tailscale, UFW, firewalld, nftables,
fail2ban, SSH password/root-login config, Docker seccomp) plus the two
iron-proxy runtime signals, and shows how they layer with iron-proxy's
sandbox-egress hardening.

- New module agent/proxy_sources/host_hardening.py (stdlib-only, no side
  effects, graceful skip on missing binaries / non-Linux hosts).
- New CLI subcommand `hermes egress harden` with --baseline
  {minimal,catalin,paranoid}, --json, --all. Always exits 0 (informational).
- 25 hermetic tests in tests/test_iron_proxy_harden.py.
- New docs page hardening-baselines.md + links from iron-proxy.md and
  cli-commands.md.

Complementary to `hermes egress doctor` (proxy health) — this answers
'is the host the proxy runs on locked down?'. Inspired by @catalinmpit's
Hetzner+Tailscale+UFW+fail2ban deployment that prompted Teknium's
security-review question.

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Table shows redundant status text instead of detail
    • The harden table now labels the column as Detail and renders each signal's human-readable detail text.
  • ✅ Fixed: Substring match falsely detects stopped firewalld as running
    • The firewalld probe now requires exit code 0 and an exact running state before reporting pass.

Preview (4a0184a6d6)
diff --git a/agent/proxy_sources/host_hardening.py b/agent/proxy_sources/host_hardening.py
--- a/agent/proxy_sources/host_hardening.py
+++ b/agent/proxy_sources/host_hardening.py
@@ -191,7 +191,8 @@
             name, SKIP, "firewalld not installed",
             "install firewalld or use ufw/nftables instead",
         )
-    if "running" in out.strip().lower():
+    state = out.strip().lower()
+    if rc == 0 and state == "running":
         return HardeningSignal(name, PASS, "running")
     return HardeningSignal(
         name, FAIL, f"not running ({out.strip() or 'unknown'})",

diff --git a/hermes_cli/proxy_cli.py b/hermes_cli/proxy_cli.py
--- a/hermes_cli/proxy_cli.py
+++ b/hermes_cli/proxy_cli.py
@@ -830,13 +830,13 @@
                   padding=(0, 1))
     table.add_column("", width=2)
     table.add_column("Signal", style="cyan", no_wrap=True)
-    table.add_column("Status")
+    table.add_column("Detail")
     table.add_column("Action")
     for s in shown:
         table.add_row(
             _glyph.get(s.status, "?"),
             s.name,
-            s.status,
+            s.detail,
             s.fix or "\u2014",
         )
     if shown:

diff --git a/tests/test_iron_proxy_harden.py b/tests/test_iron_proxy_harden.py
--- a/tests/test_iron_proxy_harden.py
+++ b/tests/test_iron_proxy_harden.py
@@ -90,6 +90,13 @@
     assert s.status == hh.PASS
 
 
+def test_firewalld_not_running_fails(monkeypatch, no_iron_proxy):
+    _force_linux(monkeypatch)
+    _patch_run(monkeypatch, {"firewall-cmd": (252, "not running\n", "")})
+    s = _signal(hh.survey_host(), "firewalld")
+    assert s.status == hh.FAIL
+
+
 def test_nftables_pass(monkeypatch, no_iron_proxy):
     _force_linux(monkeypatch)
     _patch_run(monkeypatch, {
@@ -293,6 +300,30 @@
     assert "ssh-password-auth" not in default_out
 
 
+def test_table_shows_signal_detail(monkeypatch, capsys):
+    import argparse
+    from hermes_cli import proxy_cli
+
+    signals = [
+        hh.HardeningSignal(
+            "firewalld",
+            hh.FAIL,
+            "not running (not running)",
+            "systemctl enable --now firewalld",
+        ),
+    ]
+    monkeypatch.setattr(proxy_cli.hh, "survey_host", lambda baseline: signals)
+    monkeypatch.setattr(
+        proxy_cli.hh,
+        "baseline_status",
+        lambda signals, baseline: (False, ["firewalld"]),
+    )
+    proxy_cli.cmd_harden(argparse.Namespace(
+        baseline="minimal", as_json=False, show_all=False))
+    out = capsys.readouterr().out
+    assert "not running (not running)" in out
+
+
 # ---------------------------------------------------------------------------
 # Platform + missing-file graceful degradation
 # ---------------------------------------------------------------------------


Comment thread hermes_cli/proxy_cli.py
table.add_row(
_glyph.get(s.status, "?"),
s.name,
s.status,


Table shows redundant status text instead of detail

Medium Severity

The cmd_harden table renders s.status (bare text like "fail"/"warn"/"skip") in the third column, which is entirely redundant with the glyph already in column 1. The s.detail field — containing the actually useful human explanation like "ufw installed but inactive" or "BackendState=Running" — is never displayed in non-JSON mode. The sibling cmd_doctor handler correctly shows c.detail in the equivalent position.


"install firewalld or use ufw/nftables instead",
)
if "running" in out.strip().lower():
return HardeningSignal(name, PASS, "running")


Substring match falsely detects stopped firewalld as running

High Severity

The check "running" in out.strip().lower() is a substring match that incorrectly evaluates to True when firewall-cmd --state outputs "not running" to stdout (which it does with exit code 252 when the daemon is stopped). This causes the signal to report PASS when firewalld is actually not running, completely defeating the purpose of the firewall probe.
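
The pitfall is easy to reproduce in isolation. The sketch below mirrors the exact-match condition from the Bugbot diff (exit code 0 and a state of exactly "running"); the function names are hypothetical.

```python
def substring_check(out: str) -> bool:
    # Original logic: also true for "not running".
    return "running" in out.strip().lower()


def firewalld_is_running(rc: int, out: str) -> bool:
    # Fixed logic: require a clean exit and an exact state match.
    return rc == 0 and out.strip().lower() == "running"


stopped = (252, "not running\n")  # rc and stdout reported for a stopped daemon
print(substring_check(stopped[1]))     # the buggy check passes
print(firewalld_is_running(*stopped))  # the fixed check does not
```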


Bartok9 commented May 30, 2026

Superseded by upstream PR NousResearch#35187 — retargeted to NousResearch/hermes-agent feat/iron-proxy branch with Bugbot fixes baked in. Bartok9/hermes-agent is just a fork; landing on upstream is what we want.

@Bartok9 Bartok9 closed this May 30, 2026
Bartok9 added a commit that referenced this pull request Jun 6, 2026
…NousResearch#34192) (NousResearch#34382)

NousResearch#34192 reports Hostinger's 'Hermes WebUI' catalog crashes on startup
with:

  /usr/bin/tini: No such file or directory

The image moved from tini to s6-overlay as PID 1 (/init) earlier in
2026. Orchestration templates that still pin /usr/bin/tini as the
entrypoint (like the Hostinger Hermes WebUI catalog) have no
binary to exec and the container crashes immediately.

Hermes has no control over the Hostinger catalog template, but we can
make the image backward-compatible by symlinking /usr/bin/tini -> /init
during the s6-overlay install step. External wrappers that exec
/usr/bin/tini will land on the same s6-overlay reaper they would have
landed on if they'd used the canonical /init entrypoint.

The image's own ENTRYPOINT continues to be /init verbatim; the shim
is purely for legacy external wrappers, not for the image's own
runtime path. Once affected catalogs are updated, the symlink can be
removed.

Other issues NousResearch#34192 raises that are NOT addressed by this PR:

  * Problem #2 (UID 1024 vs 10000 mismatch): already fixed by NousResearch#33148
    (S6_KEEP_ENV=1) and NousResearch#32412 (with-contenv shebangs). The Hostinger
    template likely needs to update its env-var propagation.

  * Problem #3 (incompatible session formats): RFC for pluggable
    SessionDB is tracked in NousResearch#23717.

  * Problem #4 (Telegram polling conflict): an operations problem on
    Hostinger's side, not in this codebase.

This PR is scoped to the one issue that can be fixed inside
Dockerfile: the missing /usr/bin/tini binary.

Tests (3 in test_dockerfile_tini_compat_shim.py):

  - test_tini_compat_symlink_present
    Guard: the symlink line must exist in Dockerfile.
  - test_tini_compat_comment_explains_why
    The NousResearch#34192 anchor comment must be present so future readers know
    why the shim is there (avoid accidental removal).
  - test_entrypoint_still_init_not_tini
    Sanity check: ENTRYPOINT remains /init (s6-overlay). The shim is
    only for external wrappers.

Refs: NousResearch#34192
Partial fix: addresses the immediate tini-binary crash. Catalog-side
fixes still needed by Hostinger for the UID and session-format
problems documented in the issue.

Co-authored-by: Cursor <cursoragent@cursor.com>