Skip to content

fix(net): skip DNS pinning before trusted env proxy dispatch#59007

Merged
steipete merged 1 commit intoopenclaw:mainfrom
cluster2600:fix/trusted-env-proxy-dns-order
Apr 8, 2026
Merged

fix(net): skip DNS pinning before trusted env proxy dispatch#59007
steipete merged 1 commit intoopenclaw:mainfrom
cluster2600:fix/trusted-env-proxy-dns-order

Conversation

@cluster2600
Copy link
Copy Markdown
Contributor

Summary

Fixes fetchWithSsrFGuard() so trusted env-proxy mode can actually reach EnvHttpProxyAgent before any local DNS lookup is attempted.

This fixes the remaining proxy-only sandbox failure tracked in #59005. It is the same production gap described in #58034, but rebased cleanly on current main and with the fallback test tightened so host ALL_PROXY / all_proxy values cannot accidentally satisfy the proxy branch.

What changed

  • move DNS pinning and pinned-dispatcher creation into the non-proxy branch
  • keep trusted env-proxy dispatch as an early branch
  • clear all six proxy env keys in the tests before setting expectations
  • add an explicit regression test for trusted mode with no proxy env vars present

Why

#50650 fixed the strict-mode env-proxy path, but both main and v2026.3.31 still resolve DNS before entering the trusted env-proxy branch:

const pinned = await resolvePinnedHostnameWithPolicy(...);
const canUseTrustedEnvProxy =
  mode === GUARDED_FETCH_MODE.TRUSTED_ENV_PROXY && hasProxyEnvConfigured();
if (canUseTrustedEnvProxy) {
  dispatcher = new EnvHttpProxyAgent();
}

In proxy-only sandboxes, that ordering means the request fails on local DNS and never reaches the trusted proxy.

Testing

  • pnpm exec vitest run --config vitest.unit.config.ts src/infra/net/fetch-guard.ssrf.test.ts
  • pnpm check

Notes

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 1, 2026

Greptile Summary

This PR fixes a race condition in fetchWithSsrFGuard() where DNS resolution was attempted before the trusted env-proxy dispatch branch was evaluated, causing failures in proxy-only sandboxes. The fix is minimal and correct: DNS pinning (resolvePinnedHostnameWithPolicy) and assertExplicitProxySupportsPinnedDns are moved inside the else branch so they are only invoked when the EnvHttpProxyAgent path is not taken. The test improvements are solid — clearing all six proxy env keys before each proxy-mode test and adding an explicit regression test for the fallback-to-DNS-pinning path when no proxy env var is present.

  • Core fix: DNS lookup and dispatcher creation are now gated behind the else branch, so TRUSTED_ENV_PROXY mode with a configured proxy bypasses local DNS entirely and routes immediately to EnvHttpProxyAgent
  • assertExplicitProxySupportsPinnedDns correctly moved: The assertion about explicit-proxy + HTTP only makes sense in the DNS-pinned path, so placing it in the else branch is semantically appropriate
  • Test hardening: clearProxyEnv() sets all six proxy env variables (HTTP_PROXY, HTTPS_PROXY, ALL_PROXY, and lowercase variants) to empty strings before setting expectations, preventing ALL_PROXY/all_proxy from satisfying the proxy branch in the strict-mode test
  • New regression test: Covers the TRUSTED_ENV_PROXY + no proxy env configured path, verifying the DNS-pinned dispatcher is used and lookupFn is called exactly once

Confidence Score: 5/5

Safe to merge — the fix is minimal, targeted, and backed by regression tests covering both the proxy and fallback paths.

No P0 or P1 issues found. The logic change is straightforward and matches the described intent: moving DNS resolution into the non-proxy branch avoids failing on local DNS in proxy-only sandboxes. The existing SSRF guarantees are preserved in STRICT mode and in the TRUSTED_ENV_PROXY fallback path. The test additions adequately cover both the new behavior and the regression case.

No files require special attention.

Reviews (1): Last reviewed commit: "fix(net): skip DNS pinning before truste..." | Re-trigger Greptile

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 03fd2bc1b2

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 160 to 162
const canUseTrustedEnvProxy =
mode === GUARDED_FETCH_MODE.TRUSTED_ENV_PROXY && hasProxyEnvConfigured();
if (canUseTrustedEnvProxy) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Gate trusted-proxy fast path on effective proxy vars

This branch now bypasses resolvePinnedHostnameWithPolicy whenever hasProxyEnvConfigured() is true, but that predicate is broader than EnvHttpProxyAgent behavior (for example ALL_PROXY/all_proxy can satisfy it even though the agent ignores them). In that environment, trusted mode skips DNS pinning and hostname/private-network policy checks and still dispatches with EnvHttpProxyAgent without an actually configured HTTP(S) proxy, which is a security regression from the previous flow that enforced SSRF policy before deciding the dispatcher. Use an EnvHttpProxyAgent-matching proxy check (for example hasEnvHttpProxyConfigured("https")) before taking this fast path.

Useful? React with 👍 / 👎.

@cosmicnet
Copy link
Copy Markdown
Contributor

Thanks for looking at this - we're clearly hitting the same root cause.

A few things this PR doesn't cover that #58034 does:

  1. hasProxyEnvConfigured() is too broad - it matches ALL_PROXY/all_proxy, which EnvHttpProxyAgent ignores. This means the proxy path activates (skipping DNS pinning) when no effective proxy exists. Codex flagged this as P1 on this PR too. fix(net): harden trusted env proxy fetch guard and add explicit web_fetch opt-in #58034 uses hasEnvHttpProxyConfigured(protocol) which matches EnvHttpProxyAgent semantics and is protocol-aware.

  2. No hostname SSRF checks in the proxy path - when the proxy branch is taken, resolvePinnedHostnameWithPolicy is skipped entirely, which also skips hostname allowlist enforcement, blocked-hostname checks, and private IP literal blocking. fix(net): harden trusted env proxy fetch guard and add explicit web_fetch opt-in #58034 adds assertHostnameAllowedByPolicy() to enforce the pre-DNS policy checks even when DNS resolution is delegated to the proxy.

  3. No NO_PROXY handling - if HTTPS_PROXY is set but the target host is excluded by NO_PROXY/no_proxy, EnvHttpProxyAgent connects directly (no proxy). This PR still skips DNS pinning in that case, so a direct connection runs without post-resolution private-IP checks. fix(net): harden trusted env proxy fetch guard and add explicit web_fetch opt-in #58034 adds isHostExcludedByNoProxy() to fall back to the DNS-pinned path when the proxy would be bypassed.

  4. No web-fetch.ts change - without useEnvProxy: true in runWebFetch(), the agent's web_fetch tool never opts into TRUSTED_ENV_PROXY mode, so this fix wouldn't activate for the primary use case (agent web fetches in sandboxed environments).

These gaps are security-relevant (Codex flagged #1 as P1 here too), so they'd need addressing before this could merge safely. #58034 already covers all four.

@mjamiv
Copy link
Copy Markdown
Contributor

mjamiv commented Apr 6, 2026

Production confirmation from a proxy-only OpenShell deployment (4 sandboxes, OpenClaw 2026.4.5)

We've been hitting this exact DNS-before-proxy ordering bug since deploying OpenClaw behind an HTTP CONNECT proxy in OpenShell sandboxes where DNS is unavailable (the proxy handles resolution via 10.200.0.1:3128).

Symptoms we see without a fix:

  • All web_search and web_fetch calls fail with getaddrinfo EAI_AGAIN before EnvHttpProxyAgent is ever instantiated
  • LLM provider requests (OpenAI, Anthropic) fail identically — the SSRF guard tries DNS resolution, fails, and never reaches the proxy code path

Our workaround: We maintain a Python patch script (openclaw-fetch-guard-patch.py) that does the same reorder this PR implements — moving the hasProxyEnvConfigured() check before resolvePinnedHostnameWithPolicy(). We re-apply it after every OpenClaw update. The pattern had to be updated for 2026.4.5 since the code structure changed from the earlier version.

Production evidence:

  • 4 agents running 24/7 behind CONNECT proxy with this reorder applied
  • web_search (Brave), web_fetch (Firecrawl), and all LLM provider calls work correctly
  • No SSRF bypass concerns since the proxy itself is the security boundary (sandbox has no direct internet access)

This is the correct upstream fix. Would love to see it merged so we can stop maintaining the patch. Happy to test any iterations.

Related issues from our deployment:

@steipete steipete force-pushed the fix/trusted-env-proxy-dns-order branch from 03fd2bc to 2a915fb Compare April 8, 2026 04:41
@steipete steipete merged commit d7c3210 into openclaw:main Apr 8, 2026
3 checks passed
@steipete
Copy link
Copy Markdown
Contributor

steipete commented Apr 8, 2026

Landed via temp rebase onto main.

  • Gate: pnpm test src/infra/net/fetch-guard.ssrf.test.ts; pnpm check
  • Land commit: 2a915fb22d
  • Merge commit: d7c3210cd6f5fdfdc1beff4c9541673e814354d5

Thanks @cluster2600!

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2a915fb22d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +323 to 327
const pinned = await resolvePinnedHostnameWithPolicy(parsedUrl.hostname, {
lookupFn: params.lookupFn,
policy: params.policy,
});
dispatcher = createPinnedDispatcher(pinned, params.dispatcherPolicy, params.policy);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep hostname policy checks when pinDns is disabled

Moving resolvePinnedHostnameWithPolicy into the final else branch causes the pinDns === false path to skip SSRF validation entirely, not just DNS pinning. In that mode, requests no longer run hostname allowlist or private/special-use IP checks before dispatch, so callers that disable pinning can now reach blocked targets (for example localhost/internal hosts) that were rejected before this commit. Preserve the policy-resolution check (or an equivalent validation step) even when pinDns is false.

Useful? React with 👍 / 👎.

@cluster2600 cluster2600 deleted the fix/trusted-env-proxy-dns-order branch April 8, 2026 06:32
graham33 pushed a commit to graham33/nixos-dgx-spark that referenced this pull request Apr 12, 2026
Patches fetchWithWebToolsNetworkGuard in the openclaw container image to
set useEnvProxy when HTTPS_PROXY is in the environment. Without this,
web_fetch does DNS pinning before checking for a proxy and fails with
EAI_AGAIN in OpenShell proxy-only sandboxes.

Upstream fix: openclaw/openclaw#59007 (merged but not yet released as
of 2026.4.10). Remove this patch once we upgrade to a release with it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: trusted env-proxy web_search and web_fetch still fail with EAI_AGAIN in v2026.3.31

4 participants