Config: plumb opt-in SSRF policy for web fetch, citation redirects, and remote media#55485
Config: plumb opt-in SSRF policy for web fetch, citation redirects, and remote media#55485code-sayuri wants to merge 1 commit intoopenclaw:mainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
https://github.com/openclaw/openclaw/blob/d31af642bbb0da6c9a487ad9f8b5066085af7ee9/src/infra/outbound/message-action-params.ts#L144-L148
Propagate remote-media SSRF policy in sandbox mode
hydrateAttachmentPayload now passes messages.remoteMedia.ssrfPolicy, but buildAttachmentMediaLoadOptions drops that policy when policy.mode === "sandbox", so sandboxed actions fetching http(s) media URLs still run with default SSRF checks. Since sandbox normalization explicitly allows remote URLs (resolveSandboxedMediaSource returns them unchanged), operators who set messages.remoteMedia.ssrfPolicy (for example allowRfc2544BenchmarkRange) get inconsistent behavior between host and sandbox paths and can still hit avoidable SSRF blocks.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| /** If true, permit RFC 2544 benchmarking addresses that are otherwise blocked as special-use IPs. */ | ||
| allowRfc2544BenchmarkRange?: boolean; |
There was a problem hiding this comment.
Either honor or reject browser allowRfc2544BenchmarkRange
This commit exposes browser.ssrfPolicy.allowRfc2544BenchmarkRange in config/types/docs, but the browser runtime resolver (extensions/browser/src/browser/config.ts, resolveBrowserSsrFPolicy) never reads or forwards that field, so accepted config is silently ignored. That creates a misleading no-op knob for browser networking: users can set the value successfully but browser SSRF behavior does not change.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in e3329c8. resolveBrowserSsrFPolicy() now forwards allowRfc2544BenchmarkRange, and browser config tests cover both the normal configured case and strict mode with benchmark-range allowance preserved.
Greptile SummaryThis PR converts a previous dist-level SSRF hotfix into a source-level, config-driven, opt-in implementation, threading a shared Key changes:
No broad hardcoded allowlists, no domain exceptions, and no unrelated scope creep — the PR stays exactly within its stated goals. Confidence Score: 5/5Safe to merge; all changes are strictly opt-in and default behavior is unchanged. All three SSRF policy paths are correctly implemented and tested. The spread order in No files require special attention.
|
| Filename | Overview |
|---|---|
| src/agents/tools/web-guarded-fetch.ts | Exposes policy on WebToolEndpointFetchOptions; withTrustedWebToolsEndpoint merges caller policy under the hardcoded trusted policy (safe spread order), withStrictWebToolsEndpoint passes policy through unchanged. |
| src/agents/tools/web-fetch.ts | Adds resolveFetchSsrfPolicy helper and threads ssrfPolicy from config into fetchWithWebToolsNetworkGuard (strict mode). Pattern is consistent with existing resolver helpers in the file. |
| src/agents/tools/web-search-citation-redirect.ts | Adds optional ssrfPolicy param to resolveCitationRedirectUrl and forwards it as policy to withStrictWebToolsEndpoint. Clean, minimal change. |
| src/config/legacy-web-search.ts | Adds "citationRedirect" to GENERIC_WEB_SEARCH_KEYS so the nested object is preserved during normalization instead of being silently dropped. |
| src/config/zod-schema.core.ts | Introduces shared SsrFPolicySchema (strict + optional) reused across browser, fetch, search, and messages schemas, replacing an inline definition and adding allowRfc2544BenchmarkRange. |
| src/config/types.browser.ts | Renames BrowserSsrFPolicyConfig to SsrFPolicyConfig (keeps alias), adds allowRfc2544BenchmarkRange so the config type matches the runtime SsrFPolicy type exactly. |
| src/infra/outbound/message-action-params.ts | Plumbs cfg.messages.remoteMedia.ssrfPolicy into buildAttachmentMediaLoadOptions; correctly scoped to the non-sandbox branch (sandbox path makes no outbound network requests). |
| extensions/google/src/gemini-web-search-provider.ts | Threads citationRedirectSsrFPolicy from searchConfig.citationRedirect.ssrfPolicy into runGeminiSearch and forwards it to resolveCitationRedirectUrl. |
Comments Outside Diff (1)
-
src/agents/tools/web-fetch.ts, line 163-169 (link)Runtime cast from
SsrFPolicyConfigtoSsrFPolicySsrFPolicyConfig(config layer,types.browser.ts) andSsrFPolicy(infra layer,ssrf.ts) are currently structurally identical, so theas SsrFPolicycast is safe. However, the two types now have to be kept in sync manually — a field added to one must be added to the other, or the cast will silently drop it at runtime.This is a known trade-off of the config/infra layer separation, but worth calling out. If the circular-dependency constraint allows it, importing
SsrFPolicydirectly intotypes.browser.ts(or re-exporting from a shared barrel) would eliminate the need for this cast entirely.Prompt To Fix With AI
This is a comment left during a code review. Path: src/agents/tools/web-fetch.ts Line: 163-169 Comment: **Runtime cast from `SsrFPolicyConfig` to `SsrFPolicy`** `SsrFPolicyConfig` (config layer, `types.browser.ts`) and `SsrFPolicy` (infra layer, `ssrf.ts`) are currently structurally identical, so the `as SsrFPolicy` cast is safe. However, the two types now have to be kept in sync manually — a field added to one must be added to the other, or the cast will silently drop it at runtime. This is a known trade-off of the config/infra layer separation, but worth calling out. If the circular-dependency constraint allows it, importing `SsrFPolicy` directly into `types.browser.ts` (or re-exporting from a shared barrel) would eliminate the need for this cast entirely. How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/tools/web-guarded-fetch.ts
Line: 71-74
Comment:
**`withTrustedWebToolsEndpoint` silently discards caller-supplied restrictions**
The merge spreads `params.policy` first and then `WEB_TOOLS_TRUSTED_NETWORK_SSRF_POLICY` on top, so any property explicitly set in the hardcoded policy (`dangerouslyAllowPrivateNetwork: true`, `allowRfc2544BenchmarkRange: true`) will **always win**, even if a caller passes `{ dangerouslyAllowPrivateNetwork: false }`.
For the trusted endpoint this is intentional — callers cannot accidentally weaken it — but the current type signature (`WebToolEndpointFetchOptions` now allows `policy`) gives no indication of this precedence rule. Consider adding a JSDoc note on the function to make the "hardcoded policy always overrides" contract explicit, so future callers understand that `policy` is additive (hostname allowlists survive) but cannot override the trust knobs.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/agents/tools/web-fetch.ts
Line: 163-169
Comment:
**Runtime cast from `SsrFPolicyConfig` to `SsrFPolicy`**
`SsrFPolicyConfig` (config layer, `types.browser.ts`) and `SsrFPolicy` (infra layer, `ssrf.ts`) are currently structurally identical, so the `as SsrFPolicy` cast is safe. However, the two types now have to be kept in sync manually — a field added to one must be added to the other, or the cast will silently drop it at runtime.
This is a known trade-off of the config/infra layer separation, but worth calling out. If the circular-dependency constraint allows it, importing `SsrFPolicy` directly into `types.browser.ts` (or re-exporting from a shared barrel) would eliminate the need for this cast entirely.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "Config: plumb opt-in SSRF policy" | Re-trigger Greptile
| policy: { | ||
| ...(params.policy ?? {}), | ||
| ...WEB_TOOLS_TRUSTED_NETWORK_SSRF_POLICY, | ||
| }, |
There was a problem hiding this comment.
withTrustedWebToolsEndpoint silently discards caller-supplied restrictions
The merge spreads params.policy first and then WEB_TOOLS_TRUSTED_NETWORK_SSRF_POLICY on top, so any property explicitly set in the hardcoded policy (dangerouslyAllowPrivateNetwork: true, allowRfc2544BenchmarkRange: true) will always win, even if a caller passes { dangerouslyAllowPrivateNetwork: false }.
For the trusted endpoint this is intentional — callers cannot accidentally weaken it — but the current type signature (WebToolEndpointFetchOptions now allows policy) gives no indication of this precedence rule. Consider adding a JSDoc note on the function to make the "hardcoded policy always overrides" contract explicit, so future callers understand that policy is additive (hostname allowlists survive) but cannot override the trust knobs.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/tools/web-guarded-fetch.ts
Line: 71-74
Comment:
**`withTrustedWebToolsEndpoint` silently discards caller-supplied restrictions**
The merge spreads `params.policy` first and then `WEB_TOOLS_TRUSTED_NETWORK_SSRF_POLICY` on top, so any property explicitly set in the hardcoded policy (`dangerouslyAllowPrivateNetwork: true`, `allowRfc2544BenchmarkRange: true`) will **always win**, even if a caller passes `{ dangerouslyAllowPrivateNetwork: false }`.
For the trusted endpoint this is intentional — callers cannot accidentally weaken it — but the current type signature (`WebToolEndpointFetchOptions` now allows `policy`) gives no indication of this precedence rule. Consider adding a JSDoc note on the function to make the "hardcoded policy always overrides" contract explicit, so future callers understand that `policy` is additive (hostname allowlists survive) but cannot override the trust knobs.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed in e3329c8. The sandbox branch now preserves messages.remoteMedia.ssrfPolicy, so remote http(s) media keeps the configured SSRF policy in both host and sandbox paths.
4f263f4 to
e3329c8
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e3329c854f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| @@ -539,6 +549,7 @@ async function runWebFetch(params: WebFetchRuntimeParams): Promise<Record<string | |||
| url: params.url, | |||
| maxRedirects: params.maxRedirects, | |||
| timeoutSeconds: params.timeoutSeconds, | |||
| policy: params.ssrfPolicy, | |||
There was a problem hiding this comment.
Include SSRF policy in web_fetch cache key
runWebFetch now applies params.ssrfPolicy to guarded fetches, but the cache identity is still only url/extractMode/maxChars, so cache hits occur before SSRF checks. In a long-lived process where one tool instance uses a permissive tools.web.fetch.ssrfPolicy (or config is reloaded) and another uses strict settings, strict calls can receive permissively fetched cached content for the TTL window. Please include a stable SSRF-policy fingerprint in the cache key (or skip cache when policies differ).
Useful? React with 👍 / 👎.
| @@ -221,6 +225,7 @@ function createGeminiToolDefinition( | |||
| apiKey, | |||
| model, | |||
| timeoutSeconds: resolveSearchTimeoutSeconds(searchConfig), | |||
| citationRedirectSsrFPolicy: searchConfig?.citationRedirect?.ssrfPolicy, | |||
There was a problem hiding this comment.
Add citation redirect policy to Gemini cache identity
This change introduces citationRedirectSsrFPolicy for redirect resolution, but the Gemini cache key still omits that policy and only keys on provider/query/count/model. If a permissive tools.web.search.citationRedirect.ssrfPolicy run populates cache, later stricter runs can reuse those cached citations without reapplying strict redirect checks. Add a normalized citation-redirect SSRF policy component to the cache key.
Useful? React with 👍 / 👎.
e3329c8 to
d84a762
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d84a762a57
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| .strict() | ||
| .optional(), | ||
| suppressToolErrors: z.boolean().optional(), | ||
| remoteMedia: z | ||
| .object({ |
There was a problem hiding this comment.
Wire remoteMedia SSRF policy through all send paths
This commit introduces messages.remoteMedia.ssrfPolicy as a global outbound-media setting, but only a subset of senders actually forward it; other production paths still fetch media without that policy (for example extensions/slack/src/send.ts and extensions/telegram/src/send.ts call loadWebMedia without an SSRF override). In deployments that set messages.remoteMedia.ssrfPolicy to allow RFC2544/private ranges for legitimate hosts, those channels will still fail because the config is silently ignored, creating channel-dependent behavior that is hard to diagnose.
Useful? React with 👍 / 👎.
b379e1d to
9233351
Compare
|
Codex review: found issues before merge. Summary Reproducibility: yes. The reproduction path is code-based: permissive SSRF policy can populate web_fetch or Gemini search caches, then stricter calls can read policy-agnostic cached data before the guard reruns; remoteMedia inconsistency is visible from send paths that omit the policy option. Next step before merge Security Review findings
Review detailsBest possible solution: Rebase onto current main, keep the cache-safe web_fetch implementation, and land the citationRedirect and remoteMedia policy surfaces only with policy-aware cache identity plus complete sender coverage. Do we have a high-confidence way to reproduce the issue? Yes. The reproduction path is code-based: permissive SSRF policy can populate web_fetch or Gemini search caches, then stricter calls can read policy-agnostic cached data before the guard reruns; remoteMedia inconsistency is visible from send paths that omit the policy option. Is this the best way to solve the issue? No. The direction is reasonable, but this PR is not the best current form until it preserves current main's cache isolation, adds citation redirect policy to Gemini cache identity, and completes remote-media routing. Full review comments:
Overall correctness: patch is incorrect Security concerns:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against a4f2bf273a21. |
Summary
This PR converts the previously proven dist-level SSRF benchmark-range hotfix into a source-level, config-driven, opt-in implementation.
It adds shared
ssrfPolicyconfig plumbing for three scoped paths:tools.web.fetch.ssrfPolicytools.web.search.citationRedirect.ssrfPolicymessages.remoteMedia.ssrfPolicyIt also fixes legacy web search normalization so
tools.web.search.citationRedirectis preserved instead of being dropped during normalization.Why
Some public endpoints can resolve into RFC 2544 benchmarking / special-use address space and were being blocked as false positives by SSRF handling.
The goal here is not to broaden defaults or add domain-specific exceptions, but to expose existing shared SSRF policy controls in the right runtime/config surfaces so operators can opt in per subsystem.
What changed
tools.web.fetch.ssrfPolicytools.web.search.citationRedirect.ssrfPolicymessages.remoteMedia.ssrfPolicyweb_fetchguarded fetch pathcitationRedirectduring legacy web-search normalizationNon-goals
Validation
Scoped validation performed:
src/agents/tools/web-guarded-fetch.test.tssrc/agents/tools/web-search.redirect.test.tssrc/config/legacy-web-search.test.tssrc/media/load-options.test.tssrc/plugin-sdk/outbound-media.test.tspnpm config:schema:check