| 1 | +--- |
| 2 | +sidebar_position: 14 |
| 3 | +title: "Egress proxy internals" |
| 4 | +description: "How the iron-proxy egress firewall integrates with Hermes — module layout, lifecycle, security invariants, and extension points" |
| 5 | +--- |
| 6 | + |
| 7 | +# Egress proxy internals |
| 8 | + |
| 9 | +This page covers the architecture of the egress credential-injection firewall (`hermes egress` / iron-proxy) from the perspective of a contributor or plugin author. End-user setup and usage docs live at [Egress proxy](../user-guide/egress/iron-proxy.md). |
| 10 | + |
| 11 | +The threat model and high-level design are summarised on the user page; this page is about *how* it's wired, where the security-relevant code lives, and what invariants you have to preserve if you touch it. |
| 12 | + |
| 13 | +## Module layout |
| 14 | + |
| 15 | +```text |
| 16 | +agent/proxy_sources/iron_proxy.py Core: binary install, CA gen, config build, |
| 17 | + subprocess lifecycle, mappings I/O, PID/nonce |
| 18 | + defense. Pure-function surface where possible. |
| 19 | + |
| 20 | +hermes_cli/proxy_cli.py Wizard + slash command handlers. |
| 21 | + `hermes egress {install,setup,start,stop, |
| 22 | + status,disable,config}`. Wires the |
| 23 | + core module into argparse. |
| 24 | + |
| 25 | +hermes_cli/main.py:_dispatch_egress Top-level subparser dispatcher. |
| 26 | + dest='egress_command' (intentionally |
| 27 | + disjoint from the inbound OAuth |
| 28 | + `hermes proxy` subparser, which uses |
| 29 | + dest='proxy_command'). |
| 30 | + |
| 31 | +hermes_cli/config.py: proxy schema The `proxy:` block in DEFAULT_CONFIG. |
| 32 | + Adding a knob means: add it here, add a |
| 33 | + wizard prompt or `setdefault` in |
| 34 | + proxy_cli.cmd_setup, and document it |
| 35 | + in the user-guide page. |
| 36 | + |
| 37 | +tools/environments/docker.py |
| 38 | + _egress_proxy_args_for_docker() Builds the volume_args / env_overrides / |
| 39 | + host_args triple that the Docker backend |
| 40 | + injects when `proxy.enabled: true`. |
| 41 | + |
| 42 | + DockerEnvironment.__init__ Docker-side merge logic: collision |
| 43 | + detection against critical egress vars, |
| 44 | + NODE_OPTIONS append-merge via the |
| 45 | + _HERMES_EGRESS_NODE_OPTIONS_APPEND |
| 46 | + sentinel, enforce_on_docker precedence. |
| 47 | + |
| 48 | +tests/test_iron_proxy.py Hermetic tests (~70). Binary install |
| 49 | + path, config build, mappings I/O, |
| 50 | + subprocess lifecycle, docker arg builder, |
| 51 | + deny CIDR defaults, bind policy, CA |
| 52 | + TOCTOU, ensure_audit_log behaviour, etc. |
| 53 | + |
| 54 | +tests/test_iron_proxy_cli.py CLI handler unit tests (~20). Argparse |
| 55 | + wiring, fail-loud paths, BWS refresh |
| 56 | + wire-up, dest='egress_command' |
| 57 | + regression guard. |
| 58 | + |
| 59 | +tests/test_iron_proxy_e2e.py Live E2E (gated on HERMES_RUN_E2E=1). |
| 60 | + Real iron-proxy binary, real curl, |
| 61 | + end-to-end token swap verified. |
| 62 | +``` |
| 63 | + |
| 64 | +## Lifecycle |
| 65 | + |
| 66 | +```text |
| 67 | +hermes egress install |
| 68 | + -> agent.proxy_sources.iron_proxy.install_iron_proxy(force=...) |
| 69 | + Downloads pinned tarball + checksums.txt from GitHub Releases. |
| 70 | + SHA-256 verification before extraction. |
| 71 | + tarfile.extract(..., filter="data") on Python 3.12+ (PEP 706); |
| 72 | + falls back to plain extract on older Python with member-name |
| 73 | + sanitisation via _pick_tar_member. |
| 74 | + Stage into ~/.hermes/bin/.iron-proxy_XXXX, chmod 755, os.replace |
| 75 | + to ~/.hermes/bin/iron-proxy (atomic). |
| 76 | + _VERSION_CACHE.pop(target) so a forced reinstall re-probes |
| 77 | + --version on next call. |
| 78 | + |
| 79 | +hermes egress setup [--from-bitwarden | --no-bitwarden] [--rotate-tokens] |
| 80 | + -> proxy_cli.cmd_setup |
| 81 | + Step 1. find_iron_proxy(install_if_missing=False) -> install if absent. |
| 82 | + Step 2. ensure_ca_cert() |
| 83 | + Run openssl genrsa + req via subprocess. |
| 84 | + Write CA key via os.open(O_WRONLY|O_CREAT|O_TRUNC|O_NOFOLLOW, 0o600) |
| 85 | + + os.replace. The key never exists on disk with default-umask permissions. |
| 86 | + Write CA cert with 0o644 (public). |
| 87 | + Step 3. discover_provider_mappings() or pull names from BWS via |
| 88 | + fetch_bitwarden_secrets() when --from-bitwarden. |
| 89 | + merge_mappings(existing=load_mappings(), discovered, |
| 90 | + rotate=args.rotate_tokens) preserves prior |
| 91 | + tokens unless --rotate-tokens is passed. |
| 92 | + discover_uncovered_providers() and surface warnings. |
| 93 | + Step 4. ensure_audit_log(audit_log_path) # raises on OSError |
| 94 | + build_proxy_config(...) with defaults applied at the call site |
| 95 | + (deny CIDRs default, bind policy from _default_http_listen). |
| 96 | + write_proxy_config(cfg) # atomic via .tmp + os.replace, 0o600 |
| 97 | + write_mappings(mappings) # atomic, 0o600 |
| 98 | + Step 5. proxy_cfg["enabled"] = True; credential_source preservation logic |
| 99 | + (do NOT silently downgrade bitwarden -> env on re-run); |
| 100 | + save_config(cfg). |
| 101 | + |
| 102 | +hermes egress start |
| 103 | + -> proxy_cli.cmd_start |
| 104 | + Pre-checks (refuse-start path): |
| 105 | + - proxy.fail_on_uncovered_providers? -> discover_blocked_providers() |
| 106 | + - credential_source=bitwarden? -> pre-validate access_token_env + project_id |
| 107 | + -> iron_proxy.start_proxy( |
| 108 | + refresh_secrets_from_bitwarden=..., |
| 109 | + bitwarden_config=..., |
| 110 | + ) |
| 111 | + existing=_read_pid(); if alive, idempotent return. |
| 112 | + _build_proxy_subprocess_env(...): ALLOWLIST + mapped real_env_names, |
| 113 | + strip HTTPS_PROXY/etc. to avoid recursion, optional BWS refresh |
| 114 | + (raises on missing values unless allow_env_fallback=true). |
| 115 | + Plant nonce: _proxy_nonce = sha256(urandom(16)); env[NONCE_ENV] = ... |
| 116 | + Open log_path via O_NOFOLLOW + 0o600 + st_uid check. |
| 117 | + Popen with stdin=DEVNULL, stdout=log_fd, stderr=STDOUT, |
| 118 | + start_new_session=True (POSIX). |
| 119 | + Close parent's log_fd in finally. |
| 120 | + _write_pidfile_safely(pidfile, proc.pid) |
| 121 | + O_EXCL + O_NOFOLLOW + uid check + persisted nonce sidecar. |
| 122 | + FileExistsError -> discriminate live vs stale, retry once if stale. |
| 123 | + Install SIGINT/SIGTERM handlers (main-thread only). |
| 124 | + Poll loop (do-while shape): |
| 125 | +   while True: |
| 126 | +     if proc.poll() is not None: tail log + unlink pidfile + raise |
| 127 | +     if _port_listening("127.0.0.1", tunnel_port): break |
| 128 | +     if time.time() >= deadline: break (do-while: checked AFTER first probe) |
| 129 | +     time.sleep(0.1) |
| 130 | + If not listening at exit: _kill_and_wait(proc) + unlink pidfile + raise. |
| 131 | + |
| 132 | +hermes egress stop |
| 133 | + -> iron_proxy.stop_proxy |
| 134 | + _read_pid + _pid_alive guard. |
| 135 | + starttime_before = _pid_proc_starttime(pid) # Linux only; None elsewhere |
| 136 | + os.kill(pid, SIGTERM) |
| 137 | + Wait up to 5s for graceful exit. |
| 138 | + After grace: re-check starttime + _pid_alive. |
| 139 | + If recycled (starttime drift OR _pid_alive False), DO NOT SIGKILL. |
| 140 | + Otherwise os.kill(pid, _KILL_SIGNAL). |
| 141 | + _cleanup_state_files: unlink pidfile + nonce sibling. |
| 142 | +``` |
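The do-while poll loop in `hermes egress start` can be sketched roughly as follows. This is a minimal illustration, not the module's code: `port_listening` and `wait_for_listener` are hypothetical names, the child-process liveness check is omitted, and `time.monotonic()` is used here where the outline above uses `time.time()`.

```python
import socket
import time


def port_listening(host: str, port: int, timeout: float = 0.2) -> bool:
    """Return True if a TCP connect to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_listener(host: str, port: int, budget_s: float) -> bool:
    """Do-while poll: probe first, check the deadline only afterwards."""
    deadline = time.monotonic() + budget_s
    while True:
        if port_listening(host, port):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(0.1)
```

Because the deadline is checked after the probe, even a zero-second budget still yields one connection attempt, so a listener that is already up is never reported as missing.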
| 143 | + |
| 144 | +## Security invariants |
| 145 | + |
| 146 | +These are the load-bearing properties. If you touch the module, you must preserve them. Where there's a regression test, it's named. |
| 147 | + |
| 148 | +### Filesystem perms |
| 149 | + |
| 150 | +| Path | Mode | Test | |
| 151 | +|---|---|---| |
| 152 | +| `~/.hermes/proxy/` (dir) | `0o700` | `test_proxy_state_dir_is_0o700` | |
| 153 | +| `ca.key` | `0o600` | `test_ca_key_created_with_0o600` | |
| 154 | +| `ca.crt` | `0o644` | (implicit; chmod call in `ensure_ca_cert`) | |
| 155 | +| `proxy.yaml` | `0o600` | (chmod after atomic rename in `write_proxy_config`) | |
| 156 | +| `mappings.json` | `0o600` | (chmod after atomic rename in `write_mappings`) | |
| 157 | +| `iron-proxy.pid` | `0o600` | (`os.open(..., 0o600)` mode in `_write_pidfile_safely`) | |
| 158 | +| `iron-proxy.nonce` | `0o600` | (`os.open(..., 0o600)` mode in `_write_pidfile_safely`) | |
| 159 | +| `audit.log` | `0o600` | `test_ensure_audit_log_creates_with_0o600` | |
| 160 | +| `iron-proxy.log` | `0o600` | (`os.open(..., 0o600)` + `fchmod`) | |
| 161 | + |
| 162 | +All write paths use `os.open(O_WRONLY | O_CREAT | O_NOFOLLOW, 0o600)` + `os.fstat().st_uid` check. `shutil.copy2` + `os.chmod` is forbidden because it leaks a default-umask window. |
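A minimal sketch of that write pattern, assuming a POSIX host. `write_private_file` and the `.tmp` suffix are hypothetical choices for illustration; the real helpers in `iron_proxy.py` may differ in detail.

```python
import os


def write_private_file(path: str, data: bytes) -> None:
    """Create the file with mode 0o600 from the start, then rename into place."""
    tmp = path + ".tmp"  # hypothetical staging name
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_NOFOLLOW", 0)
    fd = os.open(tmp, flags, 0o600)
    with os.fdopen(fd, "wb") as fh:
        # Refuse to write into a file somebody else pre-created.
        if os.fstat(fh.fileno()).st_uid != os.getuid():
            raise PermissionError(f"{tmp} is not owned by the current user")
        # A pre-existing staging file keeps its old mode, so tighten it explicitly.
        os.fchmod(fh.fileno(), 0o600)
        fh.write(data)
    os.replace(tmp, path)  # atomic on the same filesystem
```

The mode is passed to `os.open` so the file is restricted at creation time, which is the property a copy-then-chmod sequence cannot offer.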
| 163 | + |
| 164 | +### Subprocess env minimisation |
| 165 | + |
| 166 | +`_build_proxy_subprocess_env` MUST NOT use `os.environ.copy()`. The allowlist is `_PROXY_SUBPROCESS_ENV_ALLOWLIST` (PATH, HOME, locale, etc.) plus the env names referenced by `load_mappings()`. Everything else stays on the host. |
| 167 | + |
| 168 | +Regression: `test_subprocess_env_strips_unrelated_secrets`, `test_subprocess_env_strips_proxy_recursion_vars`, `test_subprocess_env_keeps_infrastructure_vars`. |
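The allowlist approach might look like the sketch below. The tuple contents and the function name `build_child_env` are assumptions for illustration only; the authoritative list is `_PROXY_SUBPROCESS_ENV_ALLOWLIST`.

```python
from typing import Dict, Iterable, Mapping

# Illustrative values, not the real allowlist.
ALLOWLIST = ("PATH", "HOME", "LANG", "LC_ALL")
# Proxy variables that would make the daemon route through itself (assumed set).
RECURSION_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY")


def build_child_env(host_env: Mapping[str, str],
                    mapped_names: Iterable[str]) -> Dict[str, str]:
    """Start from nothing and copy in only what the daemon needs."""
    keep = set(ALLOWLIST) | set(mapped_names)
    env = {k: v for k, v in host_env.items() if k in keep}
    # Belt and braces: drop proxy vars even if a mapping named one of them.
    for name in RECURSION_VARS:
        env.pop(name, None)
        env.pop(name.lower(), None)
    return env
```

Building the dict from an empty starting point means a newly added host secret is excluded by default, which is the reason `os.environ.copy()` is ruled out.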
| 169 | + |
| 170 | +### Bind policy |
| 171 | + |
| 172 | +`_default_http_listen` returns loopback + (Linux only) the docker bridge IP. Never `0.0.0.0`, never `:PORT` (INADDR_ANY). |
| 173 | + |
| 174 | +`_detect_docker_bridge_ip` validates via `ipaddress.IPv4Address` and rejects `is_unspecified` / `is_loopback` / `is_multicast` / `is_reserved` / `is_link_local` / `is_global`. A hostile `ip` shim on PATH cannot inject `0.0.0.0`. |
| 175 | + |
| 176 | +Regression: `test_default_bind_is_loopback_not_zero_zero`, `test_detect_docker_bridge_ip_rejects_dangerous` (parametrized over 8 attack inputs). |
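The validation step can be approximated with the standard `ipaddress` module. `validate_bridge_ip` is a hypothetical stand-in for the check inside `_detect_docker_bridge_ip`, shown only to make the rejection rules concrete.

```python
import ipaddress
from typing import Optional


def validate_bridge_ip(candidate: str) -> Optional[str]:
    """Return the normalised address, or None if it is unsafe to bind to."""
    try:
        addr = ipaddress.IPv4Address(candidate.strip())
    except ValueError:  # AddressValueError is a ValueError subclass
        return None
    if (addr.is_unspecified or addr.is_loopback or addr.is_multicast
            or addr.is_reserved or addr.is_link_local or addr.is_global):
        return None
    return str(addr)
```

Under these rules only private, non-special addresses such as a typical bridge IP survive; anything unparsable or publicly routable is discarded.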
| 177 | + |
| 178 | +### Default deny CIDRs |
| 179 | + |
| 180 | +`_DEFAULT_UPSTREAM_DENY_CIDRS` covers loopback (v4 + v6), link-local (incl. IMDS at 169.254.169.254 and the IPv4-mapped-v6 form), RFC1918, IPv6 ULA, CGNAT, and the RFC2544 benchmark range. `build_proxy_config(..., upstream_deny_cidrs=None)` MUST emit the default; only an explicit empty list opts out. |
| 181 | + |
| 182 | +Regression: `test_default_deny_cidrs_present_when_unspecified`, `test_default_deny_includes_ipv4_mapped_v6`. |
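The `None` versus empty-list distinction is easy to get wrong with a truthiness check, so here is a small sketch. The CIDR tuple is an illustrative reading of the categories listed above, not a copy of `_DEFAULT_UPSTREAM_DENY_CIDRS`, and `resolve_deny_cidrs` is a hypothetical helper name.

```python
import ipaddress
from typing import List, Optional, Sequence

# Assumed examples for each category named in the text.
DEFAULT_DENY = (
    "127.0.0.0/8", "::1/128",               # loopback v4 + v6
    "169.254.0.0/16", "fe80::/10",          # link-local (covers IMDS)
    "::ffff:169.254.169.254/128",           # IMDS, IPv4-mapped IPv6 form
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC1918
    "fc00::/7",                             # IPv6 ULA
    "100.64.0.0/10",                        # CGNAT
    "198.18.0.0/15",                        # RFC2544 benchmark range
)


def resolve_deny_cidrs(upstream_deny_cidrs: Optional[Sequence[str]]) -> List[str]:
    """None means 'use the defaults'; an explicit empty list opts out."""
    if upstream_deny_cidrs is None:  # deliberately not `if not ...`
        return list(DEFAULT_DENY)
    return [str(ipaddress.ip_network(c)) for c in upstream_deny_cidrs]
```

Writing `if not upstream_deny_cidrs` here would silently turn an explicit opt-out back into the defaults, or the other way round depending on the branch, which is why the comparison is against `None`.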
| 183 | + |
| 184 | +### Audit log fail-loud |
| 185 | + |
| 186 | +`ensure_audit_log` raises `RuntimeError` on any `OSError`. Swallowing the failure would let the daemon create the file under the default umask, defeating the privacy promise. `cmd_setup` catches the RuntimeError and surfaces a clear error to the operator. |
| 187 | + |
| 188 | +Regression: `test_ensure_audit_log_raises_on_immutable_parent`. |
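As a rough illustration of the fail-loud shape, assuming a POSIX host: `ensure_private_log` below is a hypothetical simplification of `ensure_audit_log` and leaves out any ownership checks the real function may perform.

```python
import os


def ensure_private_log(path: str) -> None:
    """Pre-create the log with 0o600, or raise instead of continuing."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | getattr(os, "O_NOFOLLOW", 0)
    try:
        fd = os.open(path, flags, 0o600)
        try:
            os.fchmod(fd, 0o600)  # tighten a pre-existing file as well
        finally:
            os.close(fd)
    except OSError as exc:
        # Surface the failure; do not let the daemon create the file itself.
        raise RuntimeError(f"cannot prepare audit log {path}: {exc}") from exc
```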
| 189 | + |
| 190 | +### Bitwarden mode fail-loud |
| 191 | + |
| 192 | +When `credential_source: bitwarden` AND `proxy.allow_env_fallback: false` (default): |
| 193 | +- Missing access token env var -> `cmd_start` refuses. |
| 194 | +- Missing `project_id` -> `cmd_start` refuses. |
| 195 | +- `bws secret list` returns no values for one or more mapped providers -> `_build_proxy_subprocess_env` raises. |
| 196 | + |
| 197 | +Falling back to host env in BW mode reintroduces exactly the staleness bug the BW path is meant to defeat. |
| 198 | + |
| 199 | +Regression: `test_cmd_start_refuses_when_bitwarden_token_missing` (CLI layer); strict-mode assertions in `_build_proxy_subprocess_env` (daemon layer). |
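The strict-mode rule for the third bullet could be expressed along these lines. The function name, argument shapes, and error text are all assumptions made for illustration; only the behaviour (raise on missing values unless `allow_env_fallback` is set) comes from the description above.

```python
from typing import Dict, Iterable, Mapping


def resolve_secret_values(mapped_names: Iterable[str],
                          fetched: Mapping[str, str],
                          host_env: Mapping[str, str],
                          allow_env_fallback: bool = False) -> Dict[str, str]:
    """Prefer fetched values; use host env only when explicitly allowed."""
    resolved: Dict[str, str] = {}
    missing = []
    for name in mapped_names:
        if fetched.get(name):
            resolved[name] = fetched[name]
        elif allow_env_fallback and host_env.get(name):
            resolved[name] = host_env[name]
        else:
            missing.append(name)
    if missing:
        raise RuntimeError("no secret value for: " + ", ".join(sorted(missing)))
    return resolved
```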
| 200 | + |
| 201 | +### docker_env collision detection |
| 202 | + |
| 203 | +When `enforce_on_docker: true`, `docker_env` overrides on any of the egress-controlling vars (HTTPS_PROXY, SSL_CERT_FILE, NODE_EXTRA_CA_CERTS, etc.) OR any mapped `real_env_name` (OPENROUTER_API_KEY, etc.) raises `RuntimeError` BEFORE the container starts. |
| 204 | + |
| 205 | +Regression: `test_docker_env_collision_with_proxy_raises_when_enforce`. |
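A simplified sketch of the collision check follows. The `CRITICAL_VARS` set only contains the names spelled out above (the real list is longer), `check_docker_env_collisions` is a hypothetical name, and returning the clashes when enforcement is off is an assumption about the non-enforcing path.

```python
from typing import Iterable, List, Mapping

# Subset named in the text; the actual set has more entries.
CRITICAL_VARS = frozenset({"HTTPS_PROXY", "SSL_CERT_FILE", "NODE_EXTRA_CA_CERTS"})


def check_docker_env_collisions(docker_env: Mapping[str, str],
                                mapped_real_env_names: Iterable[str],
                                enforce: bool) -> List[str]:
    """Raise before container start if user overrides would bypass the proxy."""
    protected = CRITICAL_VARS | set(mapped_real_env_names)
    clashes = sorted(set(docker_env) & protected)
    if clashes and enforce:
        raise RuntimeError("docker_env overrides egress-controlled vars: "
                           + ", ".join(clashes))
    return clashes  # assumed: caller may log these when not enforcing
```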
| 206 | + |
| 207 | +### PID recycling defense |
| 208 | + |
| 209 | +`_pid_alive` MUST consult either the in-process `_proxy_nonce` (same-process case) OR the on-disk `iron-proxy.nonce` (cross-CLI case) before trusting an `argv[0]` basename match. `stop_proxy` MUST re-check `/proc/<pid>/stat` starttime before SIGKILL and suppress the signal on starttime drift. |
| 210 | + |
| 211 | +Regression: `test_stop_proxy_suppresses_sigkill_on_pid_recycle`, `test_pid_proc_starttime_parses_comm_with_parens`, `test_persisted_nonce_roundtrip`. |
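The parenthesis handling that `test_pid_proc_starttime_parses_comm_with_parens` guards can be illustrated with a small parser. `parse_starttime` is a hypothetical helper that operates on the text of `/proc/<pid>/stat`; it relies on the documented layout in which `starttime` is field 22 and `comm` (field 2) may itself contain spaces and parentheses.

```python
from typing import Optional


def parse_starttime(stat_line: str) -> Optional[int]:
    """Extract starttime (field 22) from a /proc/<pid>/stat line."""
    # comm can contain ')' and spaces, so anchor on the LAST closing paren.
    close = stat_line.rfind(")")
    if close == -1:
        return None
    rest = stat_line[close + 1:].split()
    # rest[0] is field 3 (state), so field 22 sits at index 19.
    if len(rest) <= 19:
        return None
    try:
        return int(rest[19])
    except ValueError:
        return None
```

Comparing this value before and after the grace period is what lets `stop_proxy` notice that the PID now belongs to a different process.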
| 212 | + |
| 213 | +### Token preservation on re-setup |
| 214 | + |
| 215 | +`merge_mappings(existing, discovered, rotate=False)` MUST return prior tokens for providers that overlap, so that re-running `hermes egress setup` does not silently invalidate the tokens held by running sandboxes (which would surface as 401s). `--rotate-tokens` is the explicit opt-in. |
| 216 | + |
| 217 | +Regression: `test_merge_mappings_preserves_existing_tokens`, `test_merge_mappings_rotate_mints_fresh_tokens`. |
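To make the preservation rule concrete, here is a toy version that models mappings as a plain `name -> token` dict. The real mapping structure, the token format, and the treatment of providers that are no longer discovered are not specified here, so `merge_tokens` and those choices are assumptions.

```python
import secrets
from typing import Dict, Iterable


def merge_tokens(existing: Dict[str, str],
                 discovered: Iterable[str],
                 rotate: bool = False) -> Dict[str, str]:
    """Keep prior tokens for overlapping providers unless rotation is requested."""
    merged: Dict[str, str] = {}
    for name in discovered:
        if not rotate and name in existing:
            merged[name] = existing[name]  # running sandboxes keep working
        else:
            merged[name] = secrets.token_urlsafe(32)  # assumed token format
    return merged
```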
| 218 | + |
| 219 | +### `credential_source` preservation |
| 220 | + |
| 221 | +`cmd_setup` MUST NOT downgrade `credential_source: bitwarden` to `env` on re-run without an explicit `--no-bitwarden` flag. Running `hermes egress setup` (no flag) preserves whatever was previously configured. |
| 222 | + |
| 223 | +Tested via the `cmd_setup` flow in CLI tests (the bitwarden-preservation path is exercised when `--from-bitwarden` is followed by a plain `setup` re-run). |
| 224 | + |
| 225 | +## Extension points |
| 226 | + |
| 227 | +### Adding a new bearer-token provider |
| 228 | + |
| 229 | +`_BEARER_PROVIDERS` in `iron_proxy.py` maps env var name -> tuple of upstream hosts. Adding an entry makes it discoverable by `discover_provider_mappings()`; the wizard mints a token for it automatically when the env var is present. |
| 230 | + |
| 231 | +```python |
| 232 | +_BEARER_PROVIDERS: Dict[str, Tuple[str, ...]] = { |
| 233 | + ..., |
| 234 | + "MY_PROVIDER_API_KEY": ("api.myprovider.com",), |
| 235 | +} |
| 236 | +``` |
| 237 | + |
| 238 | +Also update `_DEFAULT_ALLOWED_HOSTS` so the proxy allows the upstream by default. Run `test_discover_provider_mappings_*` to confirm. |
| 239 | + |
| 240 | +### Adding a new non-bearer provider |
| 241 | + |
| 242 | +If the provider uses `x-api-key` / SigV4 / OAuth-from-SDK / etc., iron-proxy's `secrets` transform cannot swap it. Add the env var to `_NON_BEARER_PROVIDERS` so the wizard warns about it. If the provider is LLM-specific enough that you want `fail_on_uncovered_providers: true` to actually block it, also add to `_LLM_SPECIFIC_NON_BEARER_PROVIDERS`. |
| 243 | + |
| 244 | +```python |
| 245 | +_NON_BEARER_PROVIDERS: Tuple[str, ...] = ( |
| 246 | + ..., |
| 247 | + "MY_X_API_KEY_PROVIDER", |
| 248 | +) |
| 249 | + |
| 250 | +_LLM_SPECIFIC_NON_BEARER_PROVIDERS: Tuple[str, ...] = ( |
| 251 | + ..., |
| 252 | + "MY_X_API_KEY_PROVIDER", |
| 253 | +) |
| 254 | +``` |
| 255 | + |
| 256 | +### Wiring iron-proxy into a non-Docker backend |
| 257 | + |
| 258 | +`_egress_proxy_args_for_docker` is Docker-specific. Backends that want similar wiring need their own analogue that: |
| 259 | + |
| 260 | +1. Reads `load_config().get("proxy", {})`; returns empty args if `enabled` is false. |
| 261 | +2. Calls `iron_proxy.get_status()`; surfaces `enforce` semantics on `configured` / `pid` / `listening` / `ca_cert_path` failure paths. |
| 262 | +3. Calls `iron_proxy.load_mappings()`; refuses to mount if empty AND `enforce_on_docker: true`. |
| 263 | +4. Sets the seven env vars (HTTPS_PROXY, NO_PROXY, REQUESTS_CA_BUNDLE, SSL_CERT_FILE, CURL_CA_BUNDLE, NODE_EXTRA_CA_CERTS, HERMES_EGRESS_PROXY) and the per-mapping `HERMES_PROXY_TOKEN_<NAME>` vars. |
| 264 | +5. Distributes the CA cert into the sandbox at a path the runtime will trust (typically `/etc/ssl/certs/hermes-egress-ca.crt`). |
| 265 | +6. Implements collision detection against the user's backend-specific env config. |
| 266 | + |
| 267 | +The Docker implementation is ~150 lines; expect a similar amount of code for Modal / Daytona / SSH. |
| 268 | + |
| 269 | +### Subscribing to per-request audit events |
| 270 | + |
| 271 | +iron-proxy writes line-delimited JSON to `~/.hermes/proxy/audit.log`. A plugin / external watcher can tail the file and react to allowlist denials, secret swaps, or upstream errors. The schema is documented at [docs.iron.sh/audit](https://docs.iron.sh/audit). |
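A watcher along these lines could poll the file incrementally. This sketch makes no assumption about the event schema and only relies on the log being newline-delimited JSON; `read_new_events` is a hypothetical helper, and log rotation is not handled.

```python
import json
from typing import Any, List, Tuple


def read_new_events(path: str, offset: int = 0) -> Tuple[List[Any], int]:
    """Parse complete JSON lines after `offset`; return (events, new offset)."""
    events: List[Any] = []
    with open(path, "rb") as fh:
        fh.seek(offset)
        while True:
            line = fh.readline()
            if not line.endswith(b"\n"):
                break  # EOF or a half-written line; pick it up next time
            offset = fh.tell()
            try:
                events.append(json.loads(line))
            except ValueError:
                continue  # skip blank or malformed lines
    return events, offset
```

Calling it in a loop and passing the returned offset back in gives tail-like behaviour without re-reading old entries.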
| 272 | + |
| 273 | +## Testing |
| 274 | + |
| 275 | +```bash |
| 276 | +# Hermetic suite (no network, no real binary) |
| 277 | +scripts/run_tests.sh tests/test_iron_proxy.py tests/test_iron_proxy_cli.py |
| 278 | + |
| 279 | +# Live E2E (real binary, real curl, real CONNECT tunnel) |
| 280 | +HERMES_RUN_E2E=1 scripts/run_tests.sh tests/test_iron_proxy_e2e.py |
| 281 | + |
| 282 | +# Live PTY smoke against `hermes egress` |
| 283 | +HERMES_HOME=/tmp/hermes-egress-test python3 -m hermes_cli.main egress --help |
| 284 | +HERMES_HOME=/tmp/hermes-egress-test python3 -m hermes_cli.main egress setup --help |
| 285 | +``` |
| 286 | + |
| 287 | +The CLI uses argparse, so `--help` is a good first probe for "did my new flag register correctly". |
| 288 | + |
| 289 | +## See also |
| 290 | + |
| 291 | +- User-facing setup + troubleshooting: [Egress proxy](../user-guide/egress/iron-proxy.md) |
| 292 | +- Docker backend internals: [Docker](../user-guide/docker.md) |
| 293 | +- Bitwarden Secrets Manager integration: [`hermes secrets bitwarden`](../user-guide/secrets/bitwarden.md) |
| 294 | +- CLI command reference: [`hermes egress`](../reference/cli-commands.md#hermes-egress) |
| 295 | +- Sandbox-injected environment variables: [Egress proxy (sandbox-injected)](../reference/environment-variables.md#egress-proxy-sandbox-injected) |