fix(security): fail-closed when client_host is empty in WebSocket localhost check by memosr · Pull Request #15544 · NousResearch/hermes-agent

memosr · 2026-04-25T07:42:21Z

What does this PR do?

Four WebSocket endpoints in hermes_cli/web_server.py had a fail-open
localhost check:

# Before (vulnerable)
client_host = ws.client.host if ws.client else ""
if client_host and client_host not in _LOOPBACK_HOSTS:
    await ws.close(code=4403)
    return

The if client_host and ... short-circuits when client_host is an
empty string. This happens when:

WebSocket arrives via a reverse proxy (nginx, Caddy, Cloudflare)
that strips the original client info
Connection metadata is unavailable for any reason

In those cases the localhost check is silently bypassed, allowing
remote attackers to connect to endpoints that should be localhost-only:

/api/pty — terminal access
/api/ws — JSON-RPC sidecar (full TUI gateway access)
/api/pub — publish channel
/api/events — event stream

Fix

Made the check fail-closed: if client_host is empty, reject the
connection rather than allowing it through:

# After (safe)
if not client_host or client_host not in _LOOPBACK_HOSTS:
    await ws.close(code=4403)
    return

Applied to all 4 affected WebSocket endpoints.

Type of Change

🔒 Security fix (HIGH — fail-open WebSocket auth)

Checklist

Read the Contributing Guide
Commit messages follow Conventional Commits
Fail-closed default — safer when client metadata is unavailable
Consistent with the fail-closed pattern used in other security checks
No behavior change for legitimate localhost connections

… is empty When client_host is empty (proxy/unknown), reject the WebSocket connection rather than silently allowing it. This prevents potential SSRF via WebSocket connections when the client host cannot be determined. Refs: PR NousResearch#15544

egilewski · 2026-06-05T05:16:05Z

I recommend rework before merge. The security intent is valid: on current main (899ee8c23dfd029fdfd7b669ac3ac82fd8388f55), a synthetic loopback-mode WebSocket with missing or empty ws.client.host still returns allowed from _ws_client_is_allowed, while the PR replay on its base rejects missing/empty peers in /api/pty, /api/ws, /api/pub, and /api/events.

Checked PR head 196b47104ad51afaec938064ede5c0ffbcfa6bce. Focused validation used synthetic WebSocket-shaped objects only, plus python -m pytest -o addopts='' -p no:cacheprovider tests/hermes_cli/test_dashboard_auth_ws_auth.py -q on current main (45 passed) and python -m compileall -q hermes_cli/web_server.py on both current main and the PR replay. CodeRabbit on the uncommitted replay against current main reported No findings.

The blocker is current-main drift: git apply --check for this patch fails at hermes_cli/web_server.py:2390, and current main now routes these checks through _ws_client_is_allowed / _ws_request_is_allowed. Please rebase and port the fail-closed empty-peer hardening into that helper flow so all four WebSocket endpoints inherit it consistently.

Signed: GPT-5.5-xhigh in Codex

@egilewski

Per @egilewski's audit on this PR (NousResearch#15544), the original fix was correct but the file has refactored since: the four endpoint-local empty-peer checks have been consolidated into _ws_client_is_allowed and _ws_client_reason, but the helpers were left fail-open ('no peer host known means allow' / 'no reason to block'). On a loopback-bound dashboard with auth disabled, an ASGI server behind a misconfigured proxy or a unix-socket transport can deliver ws.client == None or ws.client.host == ''. The helpers were treating that as 'allowed', so the loopback-only peer gate could be bypassed by anything that suppressed the client tuple in transit. All four WebSocket endpoints (/api/pty, /api/ws, /api/pub, /api/events) route through _ws_request_is_allowed -> _ws_client_is_allowed, so the gap applied uniformly. Fix: * _ws_client_is_allowed: return False when client_host is empty instead of True. Only reached on loopback bind with auth disabled (auth_required=True and explicit non-loopback binds short-circuit earlier), so the fail-closed behavior is scoped to the surface that needs it. * _ws_client_reason: return a 'missing_or_empty_peer bound=...' block reason instead of None, so the dispatcher's existing reason-based rejection path picks it up and the close gets logged with a machine-parseable token for diagnosability. Behavior unchanged for: * gated mode (auth_required=True) — early-returns True before the empty-peer check runs. The OAuth ticket is the auth at that point. * explicit non-loopback bind (--host 0.0.0.0/::, or a specific LAN address, always with --insecure) — early-returns True before the empty-peer check runs. DNS-rebinding is still blocked by the Host/Origin guard in _ws_host_origin_is_allowed. * legitimate loopback peers (client_host == '127.0.0.1' / '::1') — not affected by the empty-peer branch. Regression tests added in tests/hermes_cli/test_dashboard_auth_ws_auth.py: * test_empty_client_host_rejected_in_loopback_mode * test_missing_client_object_rejected_in_loopback_mode * test_empty_client_host_reason_is_block Plus two regression guards to ensure the fix does not over-reach: * test_empty_client_host_still_allowed_in_insecure_public_mode * test_empty_client_host_still_allowed_in_gated_mode All three new fail-closed tests fail without this patch (the helpers return True / None for an empty peer) and pass with it. The 45 pre-existing tests in test_dashboard_auth_ws_auth.py continue to pass.

@egilewski

…ded in TUI shell.exec tui_gateway/server.py exposes a shell.exec JSON-RPC method that runs shell commands via subprocess.run(shell=True). It tries to block dangerous commands using tools.approval.detect_dangerous_command, but on ImportError silently falls through to executing the command: try: from tools.approval import detect_dangerous_command is_dangerous, _, desc = detect_dangerous_command(cmd) if is_dangerous: return _err(rid, 4005, f'blocked: {desc}...') except ImportError: pass # <-- silently disables the guard, then runs anyway If tools.approval cannot be imported (missing module, circular import, syntax error in a dependency, partial deploy), the screener is skipped and arbitrary shell commands reach subprocess.run. Fix: * Replace the silent ImportError pass with a fail-closed return: error code 4006 with message 'dangerous-command guard unavailable; refusing to execute'. Consistent with the fail-closed pattern applied to the cron SSRF check in NousResearch#10180 and the WebSocket empty-peer guard in NousResearch#15544. No behavior change when tools.approval is available -- the existing detect_dangerous_command path runs unchanged, dangerous commands still return 4005, and safe commands still flow through to subprocess.run. Regression tests added in tests/test_tui_gateway_server.py: * test_shell_exec_fails_closed_when_guard_unavailable -- injects an ImportError for tools.approval via a patched builtins.__import__ and a tripwire on subprocess.run; asserts the response is error 4006 with 'guard unavailable' in the message and that subprocess.run is never reached. * test_shell_exec_still_runs_when_guard_available -- guards against over-reach: with tools.approval present and the command not flagged dangerous, shell.exec must still execute. Without this the fail-closed change could regress into 'refuses everything'. Verified locally that the fail-closed test fails without this patch (the response carries error code 5003 from the tripwire-raised AssertionError inside subprocess.run, proving the silent-skip path was reached) and passes with the fix in place. The 205 pre-existing tests in test_tui_gateway_server.py are unaffected by this change. Per @egilewski's audit on this PR (NousResearch#15542), the original April patch targeted the same fail-open path; the file has refactored heavily since (tui_gateway/server.py is now ~8k lines vs ~4.6k at the original PR head), so this commit re-applies the fail-closed behavior at the current shell.exec call site and adds the regression coverage that the original PR was missing.

egilewski · 2026-06-08T10:03:52Z

Looks mergeable from this re-review.

Checked current upstream/main c3055d61857751ad82a2bf9e4f5de5d26a8f2a16 against PR head 225ac495637a6c1aad94e52c45b5b6dc9ecb1e10. The baseline still allows empty/missing ws.client in loopback/no-auth mode, while the PR head rejects both with missing_or_empty_peer and keeps loopback, explicit --insecure public bind, gated mode, and Host mismatch behavior intact.

Validation:

git diff --check upstream/main...HEAD: pass.
git merge-tree --write-tree upstream/main HEAD: pass.
/home/mac/hermes-agent/.venv/bin/python -m compileall -q hermes_cli/web_server.py tests/hermes_cli/test_dashboard_auth_ws_auth.py: pass.
/home/mac/hermes-agent/.venv/bin/python -m pytest -o addopts="" -p no:cacheprovider tests/hermes_cli/test_dashboard_auth_ws_auth.py -q: 50 passed.
GitHub currently reports mergeable=MERGEABLE / mergeStateStatus=CLEAN, with required checks successful or skipped.

Residual note: CodeRabbit could not complete because the service reported its review limit was reached.

Signed: GPT-5.5-xhigh in Codex

alt-glitch added type/security Security vulnerability or hardening P0 Critical — data loss, security, crash loop comp/cli CLI entry point, hermes_cli/, setup wizard labels Apr 25, 2026

memosr force-pushed the fix/tui-gateway-ws-localhost-check branch from 196b471 to 225ac49 Compare June 5, 2026 13:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(security): fail-closed when client_host is empty in WebSocket localhost check#15544

fix(security): fail-closed when client_host is empty in WebSocket localhost check#15544
memosr wants to merge 1 commit into
NousResearch:mainfrom
memosr:fix/tui-gateway-ws-localhost-check

memosr commented Apr 25, 2026

Uh oh!

egilewski commented Jun 5, 2026

Uh oh!

egilewski commented Jun 8, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

memosr commented Apr 25, 2026

What does this PR do?

Fix

Type of Change

Checklist

Uh oh!

egilewski commented Jun 5, 2026

Uh oh!

egilewski commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

egilewski commented Jun 8, 2026 •

edited

Loading