fix(media-understanding): auto-upgrade provider HTTP helper to trusted env proxy mode#64974
fix(media-understanding): auto-upgrade provider HTTP helper to trusted env proxy mode#64974mjamiv wants to merge 5 commits intoopenclaw:mainfrom
Conversation
…d env proxy mode The provider HTTP helper `fetchWithTimeoutGuarded` (and its downstream wrappers `postJsonRequest` and `postTranscriptionRequest`) is used by every image, music, video, and audio generation provider to call their remote APIs. It forwards no `mode` to `fetchWithSsrFGuard`, so the guard falls through to STRICT and forces a Node-level `dns.lookup()` pre- resolution of the target hostname before dialing the configured proxy. In HTTP-proxy-only environments (containers, restricted sandboxes, corporate networks with DNS-over-proxy, Clash TUN fake-IP) that local DNS lookup fails with EAI_AGAIN / ENOTFOUND before the request ever reaches the proxy. LLM chat completions for the same providers work fine because they inherit undici's global dispatcher (typically EnvHttpProxyAgent set by a bootstrap preload), so only media / provider HTTP paths break. Image generation reports EAI_AGAIN api.minimax.io while chat to the same host streams normally; audio transcription reports ENOTFOUND api.openai.com under the same conditions; music / video generation fail with the same DNS error class. Fix: auto-upgrade fetchWithTimeoutGuarded to TRUSTED_ENV_PROXY mode when hasProxyEnvConfigured() is true and the caller did not pass an explicit mode. Callers that need strict pinned-DNS can still opt in by passing mode: "strict" or by calling fetchWithSsrFGuard directly. Security rationale: when an HTTP CONNECT proxy is in the egress path, the proxy itself performs hostname resolution and can rewrite traffic to any destination, so client-side DNS pinning does not meaningfully constrain the target IP anyway. Moving provider HTTP calls to trusted env proxy mode in those environments restores functionality without weakening the protection DNS pinning actually provides. SSRF posture in direct-egress setups (no HTTP_PROXY env) is unchanged. This also fixes the memory embeddings failure described in openclaw#52162, which shares the same code path. Scope: only fetchWithTimeoutGuarded in src/media-understanding/shared.ts, which is used exclusively by provider HTTP helpers. Lower-level callers of fetchWithSsrFGuard are unchanged. No default behavior change when no HTTP proxy is configured in the environment. Tests: four new vitest cases cover (1) no-op path when no proxy env, (2) auto-upgrade on postJsonRequest, (3) caller mode: "strict" override respected, (4) transcription-request path auto-upgrades. All 15 tests in shared.test.ts and all 39 tests in fetch-guard.ssrf.test.ts pass (54/54 combined). Refs: openclaw#52162
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 198fe89182
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const resolvedMode = | ||
| options?.mode ?? (hasProxyEnvConfigured() ? GUARDED_FETCH_MODE.TRUSTED_ENV_PROXY : undefined); |
There was a problem hiding this comment.
Preserve dispatcherPolicy when auto-selecting trusted proxy mode
When hasProxyEnvConfigured() is true, this line unconditionally sets mode: "trusted_env_proxy" unless the caller passes an explicit mode. In the postJsonRequest/postTranscriptionRequest path, callers can supply dispatcherPolicy (from resolveProviderHttpRequestConfig), but fetchWithSsrFGuard prioritizes trusted-env mode and builds an EnvHttpProxyAgent, which bypasses explicit-proxy/direct policy details (for example custom proxy URL, proxyTls, or target connect options). That means provider-specific proxy/TLS transport settings are silently ignored whenever proxy env vars are present, which can break enterprise proxy/mTLS configurations.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 8c2c861. The decision now lives in a documented shouldAutoUpgradeToTrustedEnvProxy helper: when a caller passes an explicit dispatcherPolicy, the helper returns false and we do not override with TRUSTED_ENV_PROXY mode. Enterprise proxy/mTLS configs (custom proxy URL, proxyTls, connect options) now flow through unchanged. Added a test case ("does not auto-upgrade when caller passes explicit dispatcherPolicy") that asserts both mode is unset and the original dispatcherPolicy is forwarded intact.
Greptile SummaryThis PR fixes Confidence Score: 5/5Safe to merge — minimal, well-gated change with no behavior difference in direct-egress environments and comprehensive test coverage. All findings are P2 or lower. The core logic is correct: the No files require special attention. Reviews (2): Last reviewed commit: "fix(media-understanding): thread mode ov..." | Re-trigger Greptile |
…t / postTranscriptionRequest Greptile review on openclaw#64974 flagged that while fetchWithTimeoutGuarded now accepts an optional `mode` to opt back into STRICT pinned-DNS inside a proxy environment, that escape hatch is not surfaced on the convenience wrappers that 99% of provider code actually calls. A future caller that genuinely needs strict pinned-DNS would have to bypass postJsonRequest / postTranscriptionRequest entirely and drop down to fetchWithSsrFGuard, which is user-hostile and defeats the point of the helpers. Fix: add `mode?: GuardedFetchMode` to both wrapper signatures and thread it through `resolveGuardedPostRequestOptions` (which is already typed to derive from `fetchWithTimeoutGuarded`'s options via `GuardedPostRequestOptions`, so the option already exists on the target type — this commit just surfaces it). Also updates the short-circuit in `resolveGuardedPostRequestOptions` so that an explicit `mode` is no longer silently dropped when every other option is at its default. Two new vitest cases cover the override path for both wrappers. All 17 tests in `shared.test.ts` pass (13 existing + 2 auto-upgrade paths from the original commit + 2 new override paths in this commit). Refs: openclaw#52162, review feedback on openclaw#64974
|
@greptile-apps fair point — addressed in 63fd142. Added Two new vitest cases cover the override path through both wrappers. 17/17 |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 63fd14233a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // See openclaw#52162 for the reported failure mode on memory embeddings, | ||
| // which shares this code path with image/music/video/audio generation. | ||
| const resolvedMode = | ||
| options?.mode ?? (hasProxyEnvConfigured() ? GUARDED_FETCH_MODE.TRUSTED_ENV_PROXY : undefined); |
There was a problem hiding this comment.
Gate trusted proxy auto-upgrade on HTTP proxy vars only
resolvedMode now switches to "trusted_env_proxy" whenever hasProxyEnvConfigured() is true, but that helper treats ALL_PROXY/all_proxy as configured. In this repo, EnvHttpProxyAgent-style resolution explicitly ignores ALL_PROXY, so a process with only ALL_PROXY set will still be auto-upgraded here, skip strict SSRF hostname checks, and then make a direct request instead of using a real HTTP proxy. This is a behavioral and security regression introduced by the new auto-upgrade path; the mode switch should be gated by HTTP(S)-proxy detection (e.g. hasEnvHttpProxyConfigured) rather than any proxy env key.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 8c2c861. Swapped from hasProxyEnvConfigured() to hasEnvHttpProxyConfigured(protocol), which only checks HTTP_PROXY/HTTPS_PROXY (and lowercase variants) — matching undici EnvHttpProxyAgent's own resolution that ignores ALL_PROXY. The protocol is now derived from the target URL at the decision site so https and http callers key on the right env vars. Added a test case ("does not auto-upgrade when only ALL_PROXY is configured") asserting no mode is set when the HTTP(S) gate returns false. ALL_PROXY-only environments now fall through to strict SSRF mode instead of being silently upgraded past the hostname-check path.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d94ebf51be
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // See openclaw#52162 for the reported failure mode on memory embeddings, | ||
| // which shares this code path with image/music/video/audio generation. | ||
| const resolvedMode = | ||
| options?.mode ?? (hasProxyEnvConfigured() ? GUARDED_FETCH_MODE.TRUSTED_ENV_PROXY : undefined); |
There was a problem hiding this comment.
Avoid trusted-proxy mode when NO_PROXY can bypass egress
The new resolvedMode default unconditionally switches provider requests into trusted_env_proxy whenever any proxy env var exists, but fetchWithSsrFGuard skips resolvePinnedHostnameWithPolicy in that mode (src/infra/net/fetch-guard.ts), and EnvHttpProxyAgent can still make direct connections for NO_PROXY matches (noted in src/infra/net/proxy-fetch.ts). In environments with HTTP_PROXY plus NO_PROXY entries (including common localhost exclusions or NO_PROXY=*), these requests are auto-upgraded here and can bypass SSRF/private-network checks that strict mode previously enforced.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 8c2c861. Added matchesNoProxy() to src/infra/net/proxy-env.ts that mirrors undici EnvHttpProxyAgent semantics: wildcard *, exact hostname, leading-dot / subdomain-suffix, optional :port, IPv6 in bracket form, case-insensitive, comma-separated list. Undici does not export its own matcher, so this is a self-contained reimplementation kept in sync with undici/lib/dispatcher/env-http-proxy-agent.js. The shouldAutoUpgradeToTrustedEnvProxy helper calls it with the target URL and returns false on a match — so HTTP_PROXY + NO_PROXY=localhost (or NO_PROXY=*) environments now keep strict SSRF + pinned-DNS for the exempt targets instead of bypassing them. 15 new matchesNoProxy test cases in proxy-env.test.ts plus a new shared.test.ts case ("does not auto-upgrade when target URL matches NO_PROXY").
…rPolicy, HTTP(S) proxy, NO_PROXY Addresses 3 P1 review comments on openclaw#64974 about SSRF-bypass edge cases in the initial auto-upgrade path. 1. Respect caller's explicit dispatcherPolicy. When a caller passes one (custom proxy URL, proxyTls, connect options), auto-upgrade would build an EnvHttpProxyAgent that silently drops those overrides and break enterprise proxy/mTLS configs. Now skipped when dispatcherPolicy is set. 2. Gate on HTTP(S) proxy detection only. The original gate used hasProxyEnvConfigured() which counts ALL_PROXY, but EnvHttpProxyAgent explicitly ignores ALL_PROXY. Auto-upgrading on ALL_PROXY-only envs skipped SSRF hostname checks then made direct requests. Swapped to hasEnvHttpProxyConfigured() keyed on the target URL protocol. 3. Skip auto-upgrade when target matches NO_PROXY. EnvHttpProxyAgent makes direct connections for NO_PROXY matches, but trusted-proxy mode skips resolvePinnedHostnameWithPolicy — so those direct calls bypassed SSRF protection. Added matchesNoProxy() in proxy-env.ts mirroring undici EnvHttpProxyAgent semantics (wildcard, exact, leading-dot, suffix, port, IPv6 literal, case-insensitive). Undici does not export its matcher so this is a self-contained reimplementation. Decision lives in a documented helper shouldAutoUpgradeToTrustedEnvProxy so the gating rationale stays next to the call site. Test count 19 -> 44 (15 new matchesNoProxy cases in proxy-env.test.ts and 3 new gating cases in shared.test.ts for ALL_PROXY-only, explicit dispatcherPolicy, and NO_PROXY match). All 44 pass locally. Refs openclaw#52162, openclaw#64974 review threads 3068434040 / 3068446328 / 3068982854.
d94ebf5 to
8c2c861
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8c2c86155c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ? "80" | ||
| : ""; | ||
|
|
||
| for (const rawEntry of raw.split(",")) { |
There was a problem hiding this comment.
Split NO_PROXY entries on whitespace and commas
Fresh evidence: the new matchesNoProxy gate now uses raw.split(","), but EnvHttpProxyAgent tokenizes NO_PROXY with both commas and whitespace. With values like NO_PROXY="localhost *.corp.internal", this parser misses *.corp.internal, shouldAutoUpgradeToTrustedEnvProxy enables trusted_env_proxy, and requests to those hosts can go direct via NO_PROXY while fetchWithSsrFGuard skips pinned-DNS checks in trusted mode.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 4756cdc. matchesNoProxy now splits on /[,\s]/ to mirror undici EnvHttpProxyAgent#parseNoProxy exactly — both commas and whitespace (space, tab, newline) are valid delimiters, and empty entries from runs of delimiters are filtered by the existing if (!entry) continue guard. Values like NO_PROXY="localhost *.corp.internal" and NO_PROXY="a, b\tc" now tokenize correctly.
New test coverage in proxy-env.test.ts:
- "parses whitespace-separated list (undici tokenizes on [,\s])"
- "parses mixed comma-and-whitespace list"
- "tab and newline act as delimiters"
| continue; | ||
| } | ||
|
|
||
| const normalizedEntry = entryHost.startsWith(".") ? entryHost.slice(1) : entryHost; |
There was a problem hiding this comment.
Normalize
*. NO_PROXY host patterns before matching
Fresh evidence: the new matcher only removes a leading . (entryHost.startsWith(".")), so valid EnvHttpProxyAgent patterns like *.example.com never match. In environments that rely on wildcard NO_PROXY exclusions, the media helper will still auto-upgrade to trusted_env_proxy; those wildcard-matched hosts may bypass the proxy directly, and trusted mode then bypasses the strict hostname pinning path.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in 4756cdc. Entry normalization is now entryHost.replace(/^\*?\./, ""), mirroring undici EnvHttpProxyAgent#parseNoProxy line-for-line. Both .example.com and *.example.com normalize to example.com and flow through the existing exact + subdomain-suffix match logic. This matches undici's own quirk that a *.example.com entry also bypasses the bare domain example.com (since the normalization drops the wildcard-plus-dot prefix before comparison) — documented inline in the jsdoc so future maintainers don't accidentally tighten it to subdomain-only and drift away from upstream.
New test coverage in proxy-env.test.ts:
- "matches subdomain via *. wildcard normalization" —
*.example.com→foo.example.com - "wildcard *.example.com matches bare example.com (undici normalizes to base domain)"
- "*. wildcard respects port" —
*.corp.internal:8443 - "*. wildcard does not match unrelated suffix" — negative case against
*.example.comvsapi.example.org
Two follow-ups on the trusted-env-proxy auto-upgrade gate:
1. Split NO_PROXY on whitespace and commas, not commas alone. Undici's
EnvHttpProxyAgent tokenizes on `/[,\s]/`, so values like
`NO_PROXY="localhost *.corp.internal"` previously missed the second
entry and enabled trusted_env_proxy when the request was actually
destined for a bypass host.
2. Normalize leading `*.` wildcard patterns. `entryHost.replace(/^\*?\./, "")`
matches undici's own normalization so `*.example.com` correctly gates
both `foo.example.com` and (per undici's own quirk) `example.com`
itself; the previous `startsWith(".")` strip ignored `*.` entirely.
Both cases could let requests auto-upgrade into trusted_env_proxy and
skip the strict SSRF hostname pin when NO_PROXY would in fact route
them direct.
Added 8 coverage cases to `proxy-env.test.ts`:
- whitespace-separated, mixed comma+whitespace, tab+newline delimiters
- `*.example.com` subdomain match, bare-domain match, port match,
unrelated-suffix negative
Mirrors upstream undici@main
(lib/dispatcher/env-http-proxy-agent.js `#parseNoProxy` + `#shouldProxy`).
Refs: openclaw#64974 (review comments 3072896840 + 3072896847)
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4756cdcc47
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| (shouldAutoUpgradeToTrustedEnvProxy({ | ||
| url, | ||
| dispatcherPolicy: options?.dispatcherPolicy, | ||
| }) | ||
| ? GUARDED_FETCH_MODE.TRUSTED_ENV_PROXY |
There was a problem hiding this comment.
Re-evaluate NO_PROXY for redirect targets
Auto-upgrading to trusted_env_proxy is decided once from the initial URL, but fetchWithSsrFGuard applies that mode across the entire redirect chain. In environments with HTTP(S)_PROXY plus NO_PROXY entries (for example localhost/internal hosts), an initial public URL can be upgraded here and then redirect to a NO_PROXY host; EnvHttpProxyAgent will connect directly for that hop while trusted mode skips resolvePinnedHostnameWithPolicy, so SSRF/private-network checks are bypassed on the redirected target. This is a regression from strict default behavior and should be gated per-hop (or disable trusted mode when a redirect target matches NO_PROXY).
Useful? React with 👍 / 👎.
Summary
Auto-upgrades the provider HTTP helper
fetchWithTimeoutGuardedinsrc/media-understanding/shared.tstoGUARDED_FETCH_MODE.TRUSTED_ENV_PROXYwhenhasProxyEnvConfigured()returns true and the caller did not pass an explicit mode. Also adds an optionalmodeparameter so callers that need strict pinned-DNS can still opt back in.Refs #52162.
Why
Image, music, video, and audio generation providers all route through
postJsonRequest/postTranscriptionRequestinshared.ts, which callfetchWithTimeoutGuarded, which callsfetchWithSsrFGuardwithout amode. That falls through toSTRICT, which forces a Node-leveldns.lookup()pre-resolution of the target hostname before dialing the configured proxy.In HTTP-proxy-only environments — containers, restricted sandboxes, corporate networks with DNS-over-proxy, Clash TUN fake-IP — that local DNS lookup fails with
EAI_AGAINorENOTFOUNDbefore the request ever reaches the proxy. The observed symptoms:EAI_AGAIN api.minimax.io/ENOTFOUND api.openai.comfromimage-generation-provider-*.jsENOTFOUND api.openai.comfrom thepostTranscriptionRequestpathENOTFOUND api.ohmygpt.comviawithRemoteHttpResponse/ Gemini embedding pathLLM chat completions to the same hosts work fine in these environments because they inherit
undici's global dispatcher (typically set toEnvHttpProxyAgentby a bootstrap preload). Only the media / provider HTTP paths are affected.Change scope
fetchWithTimeoutGuardedinsrc/media-understanding/shared.tsis touched.fetchWithSsrFGuardandsrc/infra/net/fetch-guard.tsare unchanged — lower-level callers keep their existing STRICT default.hasProxyEnvConfigured()).mode: "strict"or by callingfetchWithSsrFGuarddirectly.Security rationale
When an HTTP CONNECT proxy sits in the egress path, the proxy itself performs hostname resolution and can rewrite traffic to any destination, so client-side DNS pinning does not meaningfully constrain the target IP anyway. Moving provider HTTP calls to
TRUSTED_ENV_PROXYmode in those environments restores functionality without weakening the protection DNS pinning actually provides — SSRF posture in direct-egress setups (noHTTP_PROXYenv) is unchanged becausehasProxyEnvConfigured()returns false and the code path is identical to today's.Tests
Four new vitest cases in
src/media-understanding/shared.test.tsmock bothfetchWithSsrFGuardandhasProxyEnvConfigured:modeis not set on the guard call whenhasProxyEnvConfigured()returnsfalsepostJsonRequest— assertsmode: "trusted_env_proxy"is set whenhasProxyEnvConfigured()returnstruemode: "strict"override respected — asserts the caller-supplied mode wins over the auto-upgradepostTranscriptionRequestalso gets the upgrade (verifying the shared helper covers all call sites, not just JSON)Test plan
pnpm exec vitest run src/media-understanding/shared.test.ts— 15/15 pass (11 existing + 4 new)pnpm exec vitest run src/infra/net/fetch-guard.ssrf.test.ts— 39/39 pass (no adjacent regression)pnpm checkhad unrelated type errors insrc/gateway/server-channels.test.tsandsrc/plugin-sdk/channel-runtime-context.tsalready present on currentmainhead at462d8e3b fix(cycles): narrow channel runtime surface)2026.4.11-beta.1,minimax/image-01generates successfully throughapi.minimax.io/v1/image_generationover an HTTP CONNECT proxy, returning a real image. Without the patch, the same call fails withEAI_AGAINbefore hitting the proxy.Happy to rebase / split / narrow further if the maintainers would prefer a different shape. The core one-line behavior change is the
resolvedModeternary; the rest is documentation and tests.