feat(egress): host hardening survey (follow-up to #30179) #35187
Bartok9 wants to merge 1 commit into
Read-only survey of host perimeter controls — firewall (UFW / firewalld /
nftables), Tailscale, fail2ban, SSH config, Docker seccomp — alongside two
iron-proxy runtime signals (via the existing `get_status()`). Shows the
whole defense-in-depth stack in one table.
`hermes egress harden` is informational; it always exits 0 and never gates
a deploy. The `--baseline` flag (`minimal` / `catalin` / `paranoid`)
colors the summary line only; the `--all` flag includes passing signals
in the table; `--json` emits a stable schema for SIEM/dashboard ingest.
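A minimal sketch of how the `--json` payload and the baseline summary could fit together. The field names follow the schema listed in the PR description (`signals`, `baseline`, `satisfied`, `missing`); the `Signal` dataclass, the `BASELINES` table, the `demo` baseline and its membership, and the `evaluate` helper are hypothetical names chosen for illustration, not the PR's actual code.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class Signal:
    name: str
    status: str  # "pass" | "fail" | "warn" | "skip"
    detail: str
    fix: str


# HYPOTHETICAL baseline: each inner list is an any-of group, so one
# passing member satisfies the whole group. Not the PR's real definition.
BASELINES: Dict[str, List[List[str]]] = {
    "demo": [["ufw", "firewalld", "nftables"], ["tailscale"], ["fail2ban"]],
}


def evaluate(signals: List[Signal], baseline: str) -> dict:
    """Build the JSON payload; the baseline only affects the summary fields."""
    passing = {s.name for s in signals if s.status == "pass"}
    missing: List[str] = []
    for group in BASELINES[baseline]:
        if not passing.intersection(group):
            # Report every acceptable member so each valid fix is visible.
            missing.extend(group)
    return {
        "signals": [asdict(s) for s in signals],
        "baseline": baseline,
        "satisfied": not missing,
        "missing": missing,
    }


if __name__ == "__main__":
    survey = [
        Signal("ufw", "pass", "active, default-deny incoming", ""),
        Signal("tailscale", "pass", "BackendState=Running", ""),
        Signal("fail2ban", "fail", "no jails configured", "enable a jail"),
    ]
    print(json.dumps(evaluate(survey, "demo"), indent=2))
    raise SystemExit(0)  # informational: an unmet baseline never changes the exit code
```

Keeping the baseline out of the exit code is what lets the command stay informational: a consumer that wants gating can read `satisfied` from the JSON instead.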
The 10 signals: tailscale, ufw, firewalld, nftables, fail2ban,
ssh-password-auth, ssh-root-login, iron-proxy-enabled, iron-proxy-running,
docker-seccomp. Each probe is best-effort with a 4 s timeout — missing
binary / non-Linux host → skip (not fail), so the survey never wedges.
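The best-effort behaviour described above can be sketched as a small subprocess wrapper. This is an assumption about the general shape, not the PR's `_run()` helper: `run_probe` and `PROBE_TIMEOUT_S` are hypothetical names, and returning `None` to mean "skip" is an illustrative convention.

```python
import shutil
import subprocess
from typing import Optional, Sequence, Tuple

PROBE_TIMEOUT_S = 4.0  # per-probe budget stated in the description


def run_probe(
    argv: Sequence[str], timeout: float = PROBE_TIMEOUT_S
) -> Optional[Tuple[int, str]]:
    """Run a read-only command.

    Returns (exit_code, stdout), or None when the probe should be
    reported as "skip" (binary missing, hang, or exec failure).
    """
    if shutil.which(argv[0]) is None:
        return None  # missing binary (e.g. ufw on macOS): skip, not fail
    try:
        proc = subprocess.run(
            list(argv),
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,  # a non-zero exit is data for the caller, not an exception
        )
    except (subprocess.TimeoutExpired, OSError):
        return None  # a hang or exec error must never wedge the survey
    return proc.returncode, proc.stdout
```

A caller would map `None` to a `skip` signal and interpret the exit code and output itself, which keeps the timeout and missing-binary handling in one place.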
Stdlib only, no new dependencies. Reuses `get_status()` from the
iron-proxy core (the only touch outside the new module).
Inspired by @catalinmpit's Hetzner+Tailscale+UFW+Cloudflare+fail2ban
deployment that prompted Teknium's "secure hermes" question on X — the
`catalin` baseline encodes that posture as a first-class target.
Bugbot review fixes baked in (caught on the original fork PR, retargeted
to upstream here as one clean commit):
* firewalld substring match -> exact-equality match
  `"running" in "not running".lower()` was True, falsely reporting PASS
  when firewalld is stopped (high severity). Regression test added:
  test_firewalld_stopped_is_fail_not_pass.
* cmd_harden table column 3 now shows s.detail (e.g. "ufw installed
  but inactive") instead of s.status (which is already rendered as a
  glyph in column 1). Mirrors cmd_doctor. Regression test added:
  test_cmd_harden_table_shows_detail_not_status.
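The firewalld fix in the first bullet can be illustrated with two small functions. Both names are hypothetical, and the lowercasing in the fixed version is an assumption; the source only states that the fix is an exact-equality match on the stripped output.

```python
def firewalld_status_buggy(output: str) -> str:
    # Substring test: "not running" contains "running", so a stopped
    # daemon is wrongly reported as passing.
    return "pass" if "running" in output.strip().lower() else "fail"


def firewalld_status_fixed(output: str) -> str:
    # Exact-equality test on the stripped output.
    return "pass" if output.strip().lower() == "running" else "fail"


if __name__ == "__main__":
    for raw in ("running\n", "not running\n"):
        print(repr(raw), firewalld_status_buggy(raw), firewalld_status_fixed(raw))
```

The two versions disagree only on the `"not running"` input, which is the case the regression test named above targets.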
Validation:
27 passed in 1.23 s (tests/test_iron_proxy_harden.py)
Author: Bartok9 (Daniel Pike), opened in response to Teknium's invitation
to @catalinmpit for a security review of NousResearch#30179. Complements NousResearch#30179
(sandbox-egress isolation) by surveying the perimeter side of the stack.
Why this complements #30179
#30179 introduced the iron-proxy sandbox-egress layer (swap opaque proxy tokens for real upstream creds at the network boundary, so a prompt-injected agent never sees real keys). That stops credential exfil from inside the sandbox.
This PR adds the host-perimeter side of the picture.
`hermes egress harden` is a read-only survey that probes the host's firewall (UFW / firewalld / nftables), Tailscale, fail2ban, SSH config, and Docker seccomp, alongside two iron-proxy runtime signals via `get_status()`, so an operator sees the whole defense-in-depth stack in one table.
The two layers solve different threats; neither substitutes for the other:
… `OPENAI_API_KEY` to an attacker. iron-proxy stops that.
A companion CA-rotation PR (`hermes egress rotate-ca`) is open separately against the same base.
What lands
* `agent/proxy_sources/host_hardening.py`: stdlib-only, no side effects, every probe graceful (missing binary / non-Linux host → `skip`, never `fail`). The only touch to the existing iron-proxy code is a `from .iron_proxy import get_status` for the two runtime signals.
* `hermes egress harden` (one sub-parser entry + one handler in `hermes_cli/proxy_cli.py`):
  * `--baseline {minimal,catalin,paranoid}` (default `minimal`)
  * `--json` → `{"signals":[{name,status,detail,fix}],"baseline":str,"satisfied":bool,"missing":[str]}`
  * `--all` → show passing signals too (default: gaps-only)
* `tests/test_iron_proxy_harden.py` (mirrors `test_iron_proxy_doctor.py` style if/when doctor lands).
* `website/docs/user-guide/egress/hardening-baselines.md` + a "Host hardening" section in `iron-proxy.md` and a `harden` block in `cli-commands.md`.
The 10 signals
| Signal | Probe and pass condition |
|---|---|
| `tailscale` | `tailscale status --json` → `BackendState=Running` |
| `ufw` | `ufw status verbose` → active + default-deny incoming |
| `firewalld` | `firewall-cmd --state` → exact match "running" (see Bugbot fix below) |
| `nftables` | `nft list ruleset` → non-empty |
| `fail2ban` | `fail2ban-client status` → ≥1 jail |
| `ssh-password-auth` | `^PasswordAuthentication no` in sshd_config |
| `ssh-root-login` | |
| `iron-proxy-enabled` | `get_status()` |
| `iron-proxy-running` | `get_status()` (pid alive + listening) |
| `docker-seccomp` | `docker info` SecurityOptions includes `seccomp` |
Baselines
Color the summary line only — never force a fail:
Failure modes considered
* … `_run()` helper; a hang → `skip`, never a wedged survey.
* `ufw` / `firewalld` / `nftables` / `fail2ban` skip cleanly on macOS; Tailscale / SSH-config / iron-proxy / Docker still run.
* … `/etc/ssh/sshd_config`: both SSH signals `skip` (not `fail`); many minimal containers don't run sshd.
* `get_status()` raising: wrapped; the iron-proxy signals degrade to `skip`, never crash the survey.
* `tailscale` non-JSON → `warn`; `nft` / `docker` non-zero exit → `warn` with an actionable hint.
Bugbot review fixes baked in
Two issues Cursor Bugbot caught on the original cross-fork iteration, fixed inline here (no separate cleanup PR needed):
🔴 High severity: `firewalld` substring match false-positives a stopped daemon
`"running" in out.strip().lower()` evaluates `True` for `"not running"`, meaning the probe would report PASS for a stopped firewalld, the worst possible failure mode for a hardening probe. Fix: exact-equality match on the stripped output. Regression test added: `test_firewalld_stopped_is_fail_not_pass`.
🟡 Medium severity: table column 3 showed redundant status text
The third column rendered `s.status` ("fail" / "warn" / "skip"), which is already encoded in the glyph in column 1, while the useful `s.detail` ("ufw installed but inactive", "BackendState=Running") was never shown in non-JSON mode. Fix: column renamed to "Detail" and now renders `s.detail`. Regression test added: `test_cmd_harden_table_shows_detail_not_status`.
Validation
The 1 skip is the existing E2E test gated behind `HERMES_RUN_E2E=1` (unchanged).
* … `feat/iron-proxy`)
* … `tests/test_iron_proxy_harden.py`)
* `hermes egress harden --json` runs clean (exit 0) on a macOS dev host with Linux-only signals degrading to `skip`, which confirms graceful cross-platform behavior.
Coverage gaps
* … `enabled` signal treats a generated CA + `proxy.yaml` as "enabled" because `get_status()` doesn't read `config.yaml`'s `proxy.enabled`. An explicit `enabled=True` also passes. Documented inline.
* `nft list ruleset` without root returns a non-zero exit → `warn` (we can't distinguish "no ruleset" from "needs root"); the hint says to re-run with `sudo`.
Ambiguity flags
* `get_status()` exposes `.enabled` (dataclass default `False`, not populated from config) and `.configured` (CA + proxy.yaml present). I treated `enabled OR configured` as PASS so the signal is useful without a full config load. If you want strict `config.yaml: proxy.enabled` semantics, that's a one-line change.
* `harden` as its own command vs. a doctor check: kept as a separate command tree (`hermes egress harden`). They're complementary: `doctor` checks the proxy is healthy; `harden` checks the host is locked down.
* … `missing` list reports all acceptable members so the operator sees every valid fix. Open to reporting just a representative one if that's noisy.
Cut from scope (known follow-ups)
Documented in `hardening-baselines.md` under "Future enhancements":
* … `iptables -L` baseline
* … `userns-remap`)
Attribution
Opened by Bartok9 (Daniel Pike) at Daniel's request, in response to Teknium's invitation on X to @catalinmpit for a security review of #30179. Teknium's prompt was essentially "can we get a security review of how Hermes' egress proxy holds up in a real deployment?", and Catalin's public deployment described the shape this PR encodes as the `catalin` baseline:
This PR turns that ad-hoc perimeter into a first-class, surveyable baseline that composes with the existing sandbox-egress isolation.