Skip to content

Commit ea5e937

Browse files
committed
fix(egress): address PR review findings — P0/P1/P2/P3 + CI greens
P0 — must-fix - iron_proxy: emit default upstream_deny_cidrs (loopback, IMDS 169.254.0.0/16, RFC1918) when caller passes None. Honours the docs promise that cloud-metadata IPs are refused regardless of allowlist. - iron_proxy: bind 127.0.0.1 (+ docker0 bridge IP on Linux) instead of INADDR_ANY (':9090'). LAN peers with a leaked sandbox token could otherwise spend the operator's API quota against any allowlisted upstream. - ensure_ca_cert: write the CA private key via os.open(..., 0o600) instead of shutil.copy2+os.chmod — closes the TOCTOU window where the key existed under the default umask. - discover_uncovered_providers + proxy.fail_on_uncovered_providers config: refuse to start (when strict) if env vars for non-bearer providers (Anthropic native x-api-key, AWS SigV4, Azure OpenAI, etc.) are present. Surfaces a wizard warning in non-strict mode. P1 — should-fix - start_proxy: build a minimal subprocess env (PATH/HOME/locale + only the env names referenced by mappings) instead of os.environ .copy(). Strips proxy-recursion vars (HTTPS_PROXY etc.). Stops the proxy's /proc/<pid>/environ from leaking every host secret to same-uid local processes. - start_proxy: optional Bitwarden refresh path (refresh_secrets_from_bitwarden=True, bitwarden_config=...). When credential_source=bitwarden, cmd_start wires it in — that's what delivers the rotation guarantee the docs make. - build_proxy_config: wire audit_log into the rendered yaml (log.audit_path). Parameter was accepted but never used. - ensure_audit_log: pre-create the audit log with 0o600 perms so iron-proxy inherits tight permissions instead of relying on umask. - Rename 'hermes proxy ...' → 'hermes egress ...' in user-facing strings (docstring, RuntimeError messages, post-setup banner). - start_proxy: open log file with 0o600 perms and close the parent fd immediately after Popen — fixes the per-restart fd leak. - DockerEnvironment: detect collisions between docker_env and the egress-controlling env vars (HTTPS_PROXY, SSL_CERT_FILE, etc.). When enforce_on_docker=true, fail loud rather than silently inverting the isolation; when false, warn and let docker_env win. - proxy_cli: merge_mappings preserves existing tokens on re-setup; --rotate-tokens flag re-mints all of them. Stops re-running `hermes egress setup` from invalidating tokens baked into already-running sandboxes. - proxy_cli: --from-bitwarden fail-loud on disabled BW config, missing access token, or empty vault. Previously fell through to the env path while still writing credential_source: bitwarden. - docker.py: narrow `except Exception` → `except ImportError`; iron_proxy._read_tunnel_port_from_config: same. Bare excepts were masking real config-load bugs. - start_proxy: write pidfile via os.open with O_NOFOLLOW + 0o600 + st_uid check. Refuses to follow a pre-existing symlink at the pidfile path. - mint_proxy_token docstring: document the 128-bit suffix entropy explicitly (sha256 truncated to 32 hex chars). P2 — follow-up - start_proxy: poll-with-timeout (100ms cadence on _port_listening) instead of an unconditional 5s sleep. Saves several seconds per Docker container create when enforce_on_docker=true. - docker.py: apply enforce_on_docker semantics when CA file vanishes between status.configured check and CA mount. Previously returned empty args silently. - docker.py: refuse to mount when mappings.json is empty/corrupt (was indistinguishable from upstream outage from inside the sandbox). - install_iron_proxy: tarfile.extract(..., filter='data') to silence the PEP 706 deprecation and opt into the 3.14+ default. - _proxy_state_dir: chmod 0o700 unconditionally; add _proxy_state_dir_ro() so read-only callers don't create the dir. - stop_proxy: re-verify pid before SIGKILL via /proc/<pid>/stat starttime AND _pid_alive. Prevents SIGKILL'ing a recycled pid. - _pid_alive: tightened cmdline check — basename match on argv[0] plus an in-process nonce env var ('iron-proxy' in cmdline matched 'tail iron-proxy.log' and editors with the log open). - docker.py: NODE_OPTIONS=--use-openssl-ca so Node.js routes through the OpenSSL CA store SSL_CERT_FILE controls, narrowing the Python/curl-replace vs Node-add asymmetry waefrebeorn flagged. P3 — polish - proxy_cli: dest='egress_command' (was 'proxy_command' which collided lexically with the inbound OAuth subparser). - iron_proxy_version: cache by binary path — get_status is called per Docker container create, version is constant per binary. - Drop unused `import sys` from iron_proxy. - proxy_cli: `is not None` check on --tunnel-port (was treating 0 as falsy and silently substituting the default). - proxy_cli cmd_disable: use get_status().pid instead of reaching into ip._read_pid() (stale pidfile from a crashed run would have fired a spurious "still running" warning). - Tests: replace hardcoded /tmp/ca.* paths with tmp_path-derived fixtures so tests are hermetic across hosts. CI - Windows footguns scanner: os.kill(pid, 0) is now gated behind platform.system() != 'Windows' with a windows-footgun: ok marker; signal.SIGKILL falls back to SIGTERM on Windows via getattr(signal, 'SIGKILL', signal.SIGTERM). - docs MDX compilation: replace bare `<https://…>` URLs with `[text](url)` syntax (MDX-jsx parser rejects the angle-bracket form). Tests - 32 new tests covering default deny CIDRs, bind policy, audit log wiring, subprocess env minimization, CA TOCTOU 0o600, state dir 0o700, empty-mappings refusal, CA-vanished refusal, docker_env collision detection, token preservation/rotate, uncovered provider detection, and the proxy_cli command handlers + argparse wiring. - All 156 tests in test_iron_proxy + test_iron_proxy_cli + test_docker_environment + test_config pass locally. Acknowledged but not addressed in this revision - E2E test for HTTPS CONNECT + TLS-MITM path: existing E2E exercises plain HTTP; full MITM coverage needs separate CI infra (real iron- proxy binary + curl with custom CA). Tracked as follow-up. - Cosign-style supply-chain verification for the binary checksum: upstream iron-proxy doesn't sign releases yet. Accepted pattern (same as Bitwarden integration); tracked as follow-up. - CA rotation CLI (`hermes egress rotate-ca`): scope-cut to a follow-up. Reviewers: @annguyenNous @waefrebeorn @GodsBoy @erhnysr
1 parent b495996 commit ea5e937

8 files changed

Lines changed: 1989 additions & 130 deletions

File tree

agent/proxy_sources/iron_proxy.py

Lines changed: 704 additions & 64 deletions
Large diffs are not rendered by default.

hermes_cli/config.py

Lines changed: 15 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1787,7 +1787,7 @@ def _ensure_hermes_home_managed(home: Path):
17871787
# real API credentials at the egress boundary. Compromising the sandbox
17881788
# leaks tokens that only work from behind the proxy.
17891789
#
1790-
# Configure with `hermes proxy setup`. Disabled by default — the rest of
1790+
# Configure with `hermes egress setup`. Disabled by default — the rest of
17911791
# Hermes works exactly as before with `enabled: false`.
17921792
"proxy": {
17931793
# Master switch. When false, iron-proxy is never started, no docker
@@ -1811,6 +1811,20 @@ def _ensure_hermes_home_managed(home: Path):
18111811
# proxy is enabled but not running. False = fall back to direct
18121812
# outbound with real credentials in the sandbox (the legacy posture).
18131813
"enforce_on_docker": True,
1814+
# When true, `hermes egress start` refuses to start if any provider
1815+
# env var is set that the proxy cannot strip (Anthropic native
1816+
# `x-api-key`, AWS Bedrock SigV4, Azure OpenAI, etc.). These
1817+
# credentials would otherwise leak into the sandbox bypassing the
1818+
# proxy. Defaults to false because the false-positive cost
1819+
# (operator has the env set but doesn't actually use that provider)
1820+
# is higher than the security cost of a warning.
1821+
"fail_on_uncovered_providers": False,
1822+
# SSRF deny list applied to outbound traffic. Omit / leave empty
1823+
# to use the safe default: loopback, link-local (incl. cloud
1824+
# metadata IPs at 169.254.169.254), and RFC1918. Set to an
1825+
# explicit ``[]`` to opt out entirely (only sensible in hermetic
1826+
# tests that need to reach a loopback upstream).
1827+
"upstream_deny_cidrs": None,
18141828
# Extra allowed upstream hosts beyond the bundled defaults (which
18151829
# cover OpenRouter, OpenAI, Anthropic, Google, xAI, Mistral, Groq,
18161830
# Together, DeepSeek, Nous). Wildcards (`*.foo.com`) are supported.

hermes_cli/main.py

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -11093,7 +11093,9 @@ def _dispatch_secrets(args): # noqa: ANN001
1109311093
_proxy_cli.register_cli(egress_parser)
1109411094

1109511095
def _dispatch_egress(args): # noqa: ANN001
11096-
sub = getattr(args, "proxy_command", None)
11096+
# The egress subparser uses dest='egress_command' to stay disjoint
11097+
# from the inbound OAuth ``hermes proxy`` subparser (dest='proxy_command').
11098+
sub = getattr(args, "egress_command", None)
1109711099
if sub is not None and hasattr(args, "func") and args.func is not _dispatch_egress:
1109811100
return args.func(args)
1109911101
egress_parser.print_help()

hermes_cli/proxy_cli.py

Lines changed: 206 additions & 39 deletions
Large diffs are not rendered by default.

0 commit comments

Comments
 (0)