
Feature/trusted proxy loopback#2181

Open
BingqingLyu wants to merge 4 commits into main from fork-pr-63379-feature-trusted-proxy-loopback
BingqingLyu commented Apr 28, 2026

Summary

  • Problem: With gateway.auth.mode=trusted-proxy and gateway.bind=lan, internal subsystems (browser tool, sub-agents, CLI) connect via loopback (ws://127.0.0.1:18789) and are hard-rejected with reason trusted_proxy_loopback_source before trustedProxies is consulted, so listing 127.0.0.1 in trustedProxies has no effect.
  • Why it matters: Browser tool snapshots, sessions_spawn, Discord exec approvals, and all internal GatewayClient connections are completely broken in Kubernetes/Docker deployments using trusted-proxy auth. This blocks the most common containerized deployment pattern (reverse proxy sidecar + trusted-proxy mode).
  • What changed: Added trustedProxy.allowLoopback to bypass the hard loopback rejection and trustedProxy.loopbackUser to auto-inject an operator identity for headerless loopback connections. requiredHeaders checks are now skipped for internal loopback connections that use loopbackUser. Real proxy headers are always preferred when present.
  • What did NOT change (scope boundary): No changes to the GatewayClient URL resolution, browser tool code, or any other internal subsystem. This fix is entirely in the auth layer — internal clients still connect via loopback, but the auth layer now knows how to handle them.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • Root cause: authorizeTrustedProxy() in src/gateway/auth.ts unconditionally rejects all loopback connections with trusted_proxy_loopback_source before consulting the trustedProxies list. This was introduced as a security hardening measure to prevent sidecars from forging identity headers on loopback, but it also blocks the gateway's own internal subsystems (browser tool, sub-agents, CLI) which legitimately connect from loopback without proxy headers.
  • Missing detection / guardrail: No distinction between external callers on loopback (potentially forging headers) and internal backend callers (the gateway's own subsystems). No config knob to opt in to loopback trust for same-pod deployments.
  • Contributing context: Kubernetes pod networking places all containers (gateway, oauth2-proxy, chromium, terminal) in the same network namespace. Internal connections are always loopback. The bind: lan + trustedProxies: ["${POD_IP}/32"] pattern works for the external proxy but breaks internal self-connections.
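
The loopback gate described above hinges on classifying the remote address. A minimal sketch of such a check follows; `isLoopbackAddress` is a hypothetical helper written for illustration and is not taken from src/gateway/auth.ts. It assumes the address arrives as a plain string and covers 127.0.0.0/8, ::1, and the IPv4-mapped form only.

```typescript
// Hypothetical helper (an assumption, not the gateway's code):
// returns true when the remote address is a loopback address.
function isLoopbackAddress(remote: string): boolean {
  const addr = remote.trim().toLowerCase();

  // IPv6 loopback in its compressed form.
  if (addr === "::1") return true;

  // Strip the IPv4-mapped IPv6 prefix, e.g. "::ffff:127.0.0.1".
  const mappedPrefix = "::ffff:";
  const v4 = addr.startsWith(mappedPrefix) ? addr.slice(mappedPrefix.length) : addr;

  // Require exactly four decimal octets in the range 0..255.
  const parts = v4.split(".");
  if (parts.length !== 4) return false;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return false;

  // The whole 127.0.0.0/8 block is loopback.
  return octets[0] === 127;
}
```

This sketch deliberately does not handle expanded IPv6 spellings such as 0:0:0:0:0:0:0:1; a production check would normalize the address first.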

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/gateway/auth.test.ts
  • Scenario the test should lock in: 6 new test cases:
    1. Allows loopback when allowLoopback: true (with proxy headers)
    2. Injects loopbackUser when userHeader is missing on loopback
    3. Rejects loopback without userHeader when loopbackUser is not set (fail-closed)
    4. Prefers real userHeader over loopbackUser on loopback
    5. Rejects same-host proxy request with missing required header
    6. Allows same-host proxy request when allowLoopback: true
  • Why this is the smallest reliable guardrail: All three code paths (loopback gate, required headers skip, user injection) are exercised with both positive and negative cases in the existing auth unit test suite.
  • Existing test that already covers this (if any): Existing trusted_proxy_loopback_source test retained — confirms default behavior unchanged.

User-visible / Behavior Changes

  • New optional config fields under gateway.auth.trustedProxy:
    • allowLoopback: boolean (default: false) — allow loopback addresses in trusted-proxy mode
    • loopbackUser: string (optional) — identity to assign to headerless loopback connections
  • No change to default behavior. Both fields are opt-in. Existing deployments are unaffected.
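
As an illustration of the opt-in defaults listed above, the two fields could be modeled as below. This is a sketch under assumptions: the interface and `resolveLoopbackOptions` are hypothetical names, and treating a blank loopbackUser as unset is a choice made for this example, not documented behavior.

```typescript
// Hypothetical shape for the two new fields (names of the type and the
// resolver are assumptions made for this example).
interface TrustedProxyLoopbackOptions {
  /** Allow loopback addresses in trusted-proxy mode. Default: false. */
  allowLoopback?: boolean;
  /** Identity assigned to headerless loopback connections. Default: unset. */
  loopbackUser?: string;
}

interface ResolvedLoopbackOptions {
  allowLoopback: boolean;
  loopbackUser: string | undefined;
}

// Applies the safe defaults: nothing is enabled unless explicitly set.
function resolveLoopbackOptions(
  raw: TrustedProxyLoopbackOptions = {},
): ResolvedLoopbackOptions {
  const trimmed = raw.loopbackUser?.trim();
  return {
    // Only an explicit `true` opts in.
    allowLoopback: raw.allowLoopback === true,
    // Assumption: a blank string is treated the same as "not set".
    loopbackUser: trimmed ? trimmed : undefined,
  };
}
```

With an empty config the resolver returns allowLoopback: false and loopbackUser: undefined, which matches the "existing deployments are unaffected" claim.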

Diagram (if applicable)

Before:
[browser tool] -> ws://127.0.0.1:18789 -> trusted_proxy_loopback_source REJECT (hard, unconditional)

After (allowLoopback: true, loopbackUser: "agent@internal"):
[browser tool] -> ws://127.0.0.1:18789 -> loopback check PASS -> requiredHeaders SKIP -> loopbackUser injected -> authorized as agent@internal

After (allowLoopback: true, with proxy headers):
[oauth2-proxy] -> ws://127.0.0.1:18789 -> loopback check PASS -> requiredHeaders checked -> userHeader read -> authorized as real-user@example.com
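
The three flows in the diagram can be sketched as a single decision function. This is a simplified illustration, not the code in src/gateway/auth.ts: the type names and `authorizeSketch` are hypothetical, the trustedProxies lookup for non-loopback sources is omitted, the order of the checks is an assumption, and the reason strings other than trusted_proxy_loopback_source and trusted_proxy_user_missing are placeholders.

```typescript
// Hypothetical, simplified config shape for this sketch only.
interface TrustedProxySketchConfig {
  userHeader: string;
  requiredHeaders?: string[];
  allowLoopback?: boolean;
  loopbackUser?: string;
  allowUsers?: string[];
}

type AuthSketchResult =
  | { ok: true; user: string }
  | { ok: false; reason: string };

// Header names are assumed to be lower-cased keys.
function authorizeSketch(
  cfg: TrustedProxySketchConfig,
  isLoopback: boolean,
  headers: Record<string, string | undefined>,
): AuthSketchResult {
  // 1. Loopback gate: rejected unless the operator opted in.
  if (isLoopback && !cfg.allowLoopback) {
    return { ok: false, reason: "trusted_proxy_loopback_source" };
  }

  const headerUser = headers[cfg.userHeader.toLowerCase()]?.trim();

  // 2. An "internal" connection: loopback, no user header, loopbackUser set.
  const useLoopbackUser =
    isLoopback && !headerUser && Boolean(cfg.loopbackUser);

  // 3. requiredHeaders are skipped only for internal connections.
  if (!useLoopbackUser) {
    for (const name of cfg.requiredHeaders ?? []) {
      if (!headers[name.toLowerCase()]) {
        // Placeholder reason string, not taken from the source.
        return { ok: false, reason: "missing_required_header" };
      }
    }
  }

  // 4. A real proxy header always wins over loopbackUser.
  const user = headerUser || (useLoopbackUser ? cfg.loopbackUser : undefined);
  if (!user) return { ok: false, reason: "trusted_proxy_user_missing" };

  // 5. allowUsers gating runs after user resolution.
  const allow = cfg.allowUsers ?? [];
  if (allow.length > 0 && !allow.includes(user)) {
    // Placeholder reason string, not taken from the source.
    return { ok: false, reason: "user_not_allowed" };
  }
  return { ok: true, user };
}
```

Because the allowUsers check runs last, an injected loopbackUser that is missing from allowUsers is still rejected in this sketch.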

Security Impact (required)

  • New permissions/capabilities? Yes: allowLoopback relaxes the loopback rejection for trusted-proxy mode
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • Risk + mitigation: allowLoopback is opt-in (default false), so existing deployments are unaffected. When enabled, loopback connections are still subject to all other trusted-proxy checks (user header validation, allowUsers gating). loopbackUser only fires when the user header is absent AND the connection is from loopback AND allowLoopback is true — it cannot be triggered from non-loopback sources. Real proxy headers always take precedence over loopbackUser.

Repro + Verification

Environment

  • OS: Linux (EKS us-east-2, Azure AKS)
  • Runtime/container: ghcr.io/openclaw/openclaw:2026.4.2 in K8s StatefulSet
  • Model/provider: Any (kilocode/anthropic/claude-opus-4.6)
  • Integration/channel: Discord, Control UI via oauth2-proxy sidecar
  • Relevant config (redacted):
{
  gateway: {
    bind: "lan",
    trustedProxies: ["${POD_IP}/32"],
    auth: {
      mode: "trusted-proxy",
      trustedProxy: {
        userHeader: "X-Forwarded-Email",
        requiredHeaders: ["x-forwarded-proto", "x-forwarded-host"],
        allowLoopback: true,
        loopbackUser: "agent@internal",
        allowUsers: ["mitch.rosmarin@codiac.io", "agent@internal"],
      },
    },
  },
}

Steps

  1. Deploy OpenClaw to K8s with auth.mode=trusted-proxy, bind=lan, oauth2-proxy sidecar
  2. Ask the agent to browse a URL (triggers browser tool snapshot)
  3. Check logs for trusted_proxy_loopback_source rejection

Expected

  • Browser tool snapshot succeeds; agent returns page content

Actual (before fix)

  • [ws] unauthorized conn=... remote=127.0.0.1 reason=trusted_proxy_loopback_source
  • [tools] browser failed: gateway closed (1008): unauthorized

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
# Before:
[ws] unauthorized conn=a66bb9d7 remote=127.0.0.1 client=agent backend v2026.4.2 reason=trusted_proxy_loopback_source
gateway connect failed: GatewayClientRequestError: unauthorized
[tools] browser failed: gateway closed (1008): unauthorized
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Bind: lan

# After (unit tests):
✓ allows loopback when allowLoopback is true
✓ injects loopbackUser when userHeader is missing on loopback
✓ rejects loopback without userHeader when loopbackUser is not set
✓ prefers real userHeader over loopbackUser on loopback
✓ rejects same-host proxy request with missing required header
✓ allows same-host proxy request when allowLoopback is true

Human Verification (required)

  • Verified scenarios: All 6 new test cases pass. Existing trusted-proxy tests unchanged and passing. pnpm vitest run src/gateway/auth.test.ts green.
  • Edge cases checked: loopbackUser without allowLoopback (rejected), allowLoopback without loopbackUser (requires real headers), real headers preferred over loopbackUser, IPv6 loopback (::1).
  • What you did NOT verify: Full E2E in K8s deployment (requires building a custom image with this patch). Browser tool snapshot E2E. Discord exec approvals flow.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes — both new fields are optional with safe defaults (allowLoopback: false, loopbackUser: undefined). No behavior change without explicit opt-in.
  • Config/env changes? Yes — two new optional fields under gateway.auth.trustedProxy.
  • Migration needed? No — existing configs work unchanged.

Risks and Mitigations

  • Risk: Operator enables allowLoopback without loopbackUser and expects headerless internal connections to work — they'll get trusted_proxy_user_missing.
    • Mitigation: JSDoc on loopbackUser explains the dependency. Could add a startup warning when allowLoopback=true and loopbackUser is unset.
  • Risk: loopbackUser identity could be used to bypass allowUsers restrictions if the operator forgets to add it to the list.
    • Mitigation: allowUsers check runs after user resolution — if loopbackUser is not in allowUsers, the connection is still rejected.
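
The startup warning suggested above could look roughly like the following. It is a sketch only: `collectLoopbackWarnings` is a hypothetical function, the warning texts are invented for illustration, and nothing here is part of the PR's diff.

```typescript
// Hypothetical startup check (not part of this PR): returns human-readable
// warnings for loopback settings that are likely misconfigured.
function collectLoopbackWarnings(cfg: {
  allowLoopback?: boolean;
  loopbackUser?: string;
  allowUsers?: string[];
}): string[] {
  const warnings: string[] = [];
  const { allowLoopback, loopbackUser } = cfg;
  const allow = cfg.allowUsers ?? [];

  // Headerless internal connections would fail with trusted_proxy_user_missing.
  if (allowLoopback && !loopbackUser) {
    warnings.push(
      "allowLoopback is true but loopbackUser is unset; headerless loopback connections will be rejected",
    );
  }

  // loopbackUser has no effect while the loopback gate is closed.
  if (loopbackUser && !allowLoopback) {
    warnings.push(
      "loopbackUser is set but allowLoopback is not true; loopback connections are still rejected",
    );
  }

  // The allowUsers check runs after user resolution, so the injected
  // identity must be listed for internal connections to pass.
  if (loopbackUser && allow.length > 0 && !allow.includes(loopbackUser)) {
    warnings.push("loopbackUser is not listed in allowUsers; internal connections will be rejected");
  }

  return warnings;
}
```

Returning a list rather than throwing keeps the check advisory, which fits the "could add a startup warning" wording; whether it should instead fail startup is a maintainer decision.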

penggaolai and others added 4 commits April 7, 2026 14:00
Allows loopback addresses (127.0.0.1, ::1) in trusted-proxy mode when
explicitly enabled via config. Useful for same-pod or same-host deployments
where agent and gateway share the same network namespace.

- Add allowLoopback?: boolean to GatewayTrustedProxyConfig type
- Add Zod validation for allowLoopback field
- Modify authorizeTrustedProxy to check allowLoopback before rejecting
- Add unit tests for allowLoopback behavior
* main: (522 commits)
  fix(browser): re-check interaction-driven navigations (openclaw#63226)
  test: reuse verbose directive reply imports
  test: reuse exec directive reply imports
  fix(browser): harden browser control override loading (openclaw#62663)
  Matrix: report startup failures as errors
  auth: persist explicit profile upserts directly
  test(doctor): mock memory-core runtime seam
  auth: avoid external cli sync on profile upsert
  feat: parallelize character eval runs
  fix: load QA live provider overrides
  build: stage nostr runtime dependencies
  fix(dotenv): block workspace runtime env vars (openclaw#62660)
  build: narrow plugin SDK declaration build
  test: harden Parallels macOS smoke fallback
  fix(memory): accept embedded dreaming heartbeat tokens
  test: harden provider mock isolation
  docs(config): tighten wording in reference
  test: reuse followup runner imports
  test: reuse image generate tool imports
  Align remote node exec event system messages with untrusted handling (openclaw#62659)
  ...
…nternal connections

When auth.mode=trusted-proxy, internal subsystems (browser tool, sub-agents,
CLI) connect via loopback without proxy headers, causing
trusted_proxy_loopback_source and trusted_proxy_user_missing rejections.

Add trustedProxy.allowLoopback to bypass the hard loopback rejection,
trustedProxy.loopbackUser to auto-inject an operator identity for headerless
loopback connections, and skip requiredHeaders checks for internal loopback
connections using loopbackUser. Real proxy headers are always preferred.

Closes openclaw#43300, closes openclaw#26007
Refs openclaw#4944, openclaw#16299, openclaw#50628
@clawsweeper clawsweeper Bot mentioned this pull request Apr 28, 2026
Development

Successfully merging this pull request may close these issues.

[Bug]: Feature request: trustedProxy.loopbackUser for CLI/sub-agent access without proxy

3 participants