Skip to content

Config: plumb opt-in SSRF policy for web fetch, citation redirects, and remote media#55485

Open
code-sayuri wants to merge 1 commit intoopenclaw:mainfrom
code-sayuri:fix/config-driven-ssrf-policy
Open

Config: plumb opt-in SSRF policy for web fetch, citation redirects, and remote media#55485
code-sayuri wants to merge 1 commit intoopenclaw:mainfrom
code-sayuri:fix/config-driven-ssrf-policy

Conversation

@code-sayuri
Copy link
Copy Markdown

Summary

This PR converts the previously proven dist-level SSRF benchmark-range hotfix into a source-level, config-driven, opt-in implementation.

It adds shared ssrfPolicy config plumbing for three scoped paths:

  • tools.web.fetch.ssrfPolicy
  • tools.web.search.citationRedirect.ssrfPolicy
  • messages.remoteMedia.ssrfPolicy

It also fixes legacy web search normalization so tools.web.search.citationRedirect is preserved instead of being dropped during normalization.

Why

Some public endpoints can resolve into RFC 2544 benchmarking / special-use address space and were being blocked as false positives by SSRF handling.

The goal here is not to broaden defaults or add domain-specific exceptions, but to expose existing shared SSRF policy controls in the right runtime/config surfaces so operators can opt in per subsystem.

What changed

  • Added shared SSRF policy schema/type reuse for:
    • tools.web.fetch.ssrfPolicy
    • tools.web.search.citationRedirect.ssrfPolicy
    • messages.remoteMedia.ssrfPolicy
  • Threaded fetch SSRF policy into the web_fetch guarded fetch path
  • Threaded citation redirect SSRF policy into Gemini citation redirect resolution
  • Threaded remote media SSRF policy into outbound remote media loading paths
  • Preserved citationRedirect during legacy web-search normalization
  • Added focused tests for:
    • guarded fetch SSRF policy forwarding
    • citation redirect SSRF policy forwarding
    • legacy normalization retention
    • outbound media option passthrough

Non-goals

  • No broad hardcoded allowlists
  • No domain-specific exceptions
  • No Discord reconnect/lifecycle changes
  • No unrelated refactors

Validation

Scoped validation performed:

  • src/agents/tools/web-guarded-fetch.test.ts
  • src/agents/tools/web-search.redirect.test.ts
  • src/config/legacy-web-search.test.ts
  • src/media/load-options.test.ts
  • src/plugin-sdk/outbound-media.test.ts
  • pnpm config:schema:check

@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation channel: mattermost Channel integration: mattermost channel: msteams Channel integration: msteams channel: signal Channel integration: signal channel: zalouser Channel integration: zalouser agents Agent runtime and tooling size: L labels Mar 27, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

https://github.com/openclaw/openclaw/blob/d31af642bbb0da6c9a487ad9f8b5066085af7ee9/src/infra/outbound/message-action-params.ts#L144-L148
P2 Badge Propagate remote-media SSRF policy in sandbox mode

hydrateAttachmentPayload now passes messages.remoteMedia.ssrfPolicy, but buildAttachmentMediaLoadOptions drops that policy when policy.mode === "sandbox", so sandboxed actions fetching http(s) media URLs still run with default SSRF checks. Since sandbox normalization explicitly allows remote URLs (resolveSandboxedMediaSource returns them unchanged), operators who set messages.remoteMedia.ssrfPolicy (for example allowRfc2544BenchmarkRange) get inconsistent behavior between host and sandbox paths and can still hit avoidable SSRF blocks.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +24 to +25
/** If true, permit RFC 2544 benchmarking addresses that are otherwise blocked as special-use IPs. */
allowRfc2544BenchmarkRange?: boolean;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Either honor or reject browser allowRfc2544BenchmarkRange

This commit exposes browser.ssrfPolicy.allowRfc2544BenchmarkRange in config/types/docs, but the browser runtime resolver (extensions/browser/src/browser/config.ts, resolveBrowserSsrFPolicy) never reads or forwards that field, so accepted config is silently ignored. That creates a misleading no-op knob for browser networking: users can set the value successfully but browser SSRF behavior does not change.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e3329c8. resolveBrowserSsrFPolicy() now forwards allowRfc2544BenchmarkRange, and browser config tests cover both the normal configured case and strict mode with benchmark-range allowance preserved.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 27, 2026

Greptile Summary

This PR converts a previous dist-level SSRF hotfix into a source-level, config-driven, opt-in implementation, threading a shared ssrfPolicy config surface through three independent paths: tools.web.fetch.ssrfPolicy, tools.web.search.citationRedirect.ssrfPolicy, and messages.remoteMedia.ssrfPolicy. It also patches the legacy web-search normalizer to preserve the citationRedirect nested object.

Key changes:

  • Introduces a shared SsrFPolicySchema (Zod) and SsrFPolicyConfig TypeScript type reused across browser, tools, and messages layers, adding allowRfc2544BenchmarkRange to all three.
  • Removes the "policy" exclusion from WebToolEndpointFetchOptions so callers can supply a policy to both withStrictWebToolsEndpoint (pass-through) and withTrustedWebToolsEndpoint (merge with hardcoded trusted policy winning on overlapping keys — safe spread order).
  • Plumbs messages.remoteMedia.ssrfPolicy consistently through all five channel senders (Mattermost, MSTeams, Signal, Zalouser, core action hydration).
  • Adds "citationRedirect" to GENERIC_WEB_SEARCH_KEYS — without this, the nested record was silently dropped by the normalizer because only scalar values and known-key objects were preserved.
  • Coverage is solid: new focused tests for guarded-fetch policy forwarding, citation-redirect policy forwarding, legacy normalizer retention, and outbound-media passthrough.

No broad hardcoded allowlists, no domain exceptions, and no unrelated scope creep — the PR stays exactly within its stated goals.

Confidence Score: 5/5

Safe to merge; all changes are strictly opt-in and default behavior is unchanged.

All three SSRF policy paths are correctly implemented and tested. The spread order in withTrustedWebToolsEndpoint ensures the hardcoded trusted policy cannot be weakened by callers. The SsrFPolicy / SsrFPolicyConfig cast is safe because the types are structurally identical. The legacy normalization fix is correct and well-tested. No regressions are expected since the feature is opt-in (undefined by default). The two P2 style comments are documentation/maintenance suggestions and do not block merge.

No files require special attention.

Important Files Changed

Filename Overview
src/agents/tools/web-guarded-fetch.ts Exposes policy on WebToolEndpointFetchOptions; withTrustedWebToolsEndpoint merges caller policy under the hardcoded trusted policy (safe spread order), withStrictWebToolsEndpoint passes policy through unchanged.
src/agents/tools/web-fetch.ts Adds resolveFetchSsrfPolicy helper and threads ssrfPolicy from config into fetchWithWebToolsNetworkGuard (strict mode). Pattern is consistent with existing resolver helpers in the file.
src/agents/tools/web-search-citation-redirect.ts Adds optional ssrfPolicy param to resolveCitationRedirectUrl and forwards it as policy to withStrictWebToolsEndpoint. Clean, minimal change.
src/config/legacy-web-search.ts Adds "citationRedirect" to GENERIC_WEB_SEARCH_KEYS so the nested object is preserved during normalization instead of being silently dropped.
src/config/zod-schema.core.ts Introduces shared SsrFPolicySchema (strict + optional) reused across browser, fetch, search, and messages schemas, replacing an inline definition and adding allowRfc2544BenchmarkRange.
src/config/types.browser.ts Renames BrowserSsrFPolicyConfig to SsrFPolicyConfig (keeps alias), adds allowRfc2544BenchmarkRange so the config type matches the runtime SsrFPolicy type exactly.
src/infra/outbound/message-action-params.ts Plumbs cfg.messages.remoteMedia.ssrfPolicy into buildAttachmentMediaLoadOptions; correctly scoped to the non-sandbox branch (sandbox path makes no outbound network requests).
extensions/google/src/gemini-web-search-provider.ts Threads citationRedirectSsrFPolicy from searchConfig.citationRedirect.ssrfPolicy into runGeminiSearch and forwards it to resolveCitationRedirectUrl.

Comments Outside Diff (1)

  1. src/agents/tools/web-fetch.ts, line 163-169 (link)

    P2 Runtime cast from SsrFPolicyConfig to SsrFPolicy

    SsrFPolicyConfig (config layer, types.browser.ts) and SsrFPolicy (infra layer, ssrf.ts) are currently structurally identical, so the as SsrFPolicy cast is safe. However, the two types now have to be kept in sync manually — a field added to one must be added to the other, or the cast will silently drop it at runtime.

    This is a known trade-off of the config/infra layer separation, but worth calling out. If the circular-dependency constraint allows it, importing SsrFPolicy directly into types.browser.ts (or re-exporting from a shared barrel) would eliminate the need for this cast entirely.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/agents/tools/web-fetch.ts
    Line: 163-169
    
    Comment:
    **Runtime cast from `SsrFPolicyConfig` to `SsrFPolicy`**
    
    `SsrFPolicyConfig` (config layer, `types.browser.ts`) and `SsrFPolicy` (infra layer, `ssrf.ts`) are currently structurally identical, so the `as SsrFPolicy` cast is safe. However, the two types now have to be kept in sync manually — a field added to one must be added to the other, or the cast will silently drop it at runtime.
    
    This is a known trade-off of the config/infra layer separation, but worth calling out. If the circular-dependency constraint allows it, importing `SsrFPolicy` directly into `types.browser.ts` (or re-exporting from a shared barrel) would eliminate the need for this cast entirely.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/tools/web-guarded-fetch.ts
Line: 71-74

Comment:
**`withTrustedWebToolsEndpoint` silently discards caller-supplied restrictions**

The merge spreads `params.policy` first and then `WEB_TOOLS_TRUSTED_NETWORK_SSRF_POLICY` on top, so any property explicitly set in the hardcoded policy (`dangerouslyAllowPrivateNetwork: true`, `allowRfc2544BenchmarkRange: true`) will **always win**, even if a caller passes `{ dangerouslyAllowPrivateNetwork: false }`.

For the trusted endpoint this is intentional — callers cannot accidentally weaken it — but the current type signature (`WebToolEndpointFetchOptions` now allows `policy`) gives no indication of this precedence rule. Consider adding a JSDoc note on the function to make the "hardcoded policy always overrides" contract explicit, so future callers understand that `policy` is additive (hostname allowlists survive) but cannot override the trust knobs.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/tools/web-fetch.ts
Line: 163-169

Comment:
**Runtime cast from `SsrFPolicyConfig` to `SsrFPolicy`**

`SsrFPolicyConfig` (config layer, `types.browser.ts`) and `SsrFPolicy` (infra layer, `ssrf.ts`) are currently structurally identical, so the `as SsrFPolicy` cast is safe. However, the two types now have to be kept in sync manually — a field added to one must be added to the other, or the cast will silently drop it at runtime.

This is a known trade-off of the config/infra layer separation, but worth calling out. If the circular-dependency constraint allows it, importing `SsrFPolicy` directly into `types.browser.ts` (or re-exporting from a shared barrel) would eliminate the need for this cast entirely.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "Config: plumb opt-in SSRF policy" | Re-trigger Greptile

Comment on lines +71 to +74
policy: {
...(params.policy ?? {}),
...WEB_TOOLS_TRUSTED_NETWORK_SSRF_POLICY,
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 withTrustedWebToolsEndpoint silently discards caller-supplied restrictions

The merge spreads params.policy first and then WEB_TOOLS_TRUSTED_NETWORK_SSRF_POLICY on top, so any property explicitly set in the hardcoded policy (dangerouslyAllowPrivateNetwork: true, allowRfc2544BenchmarkRange: true) will always win, even if a caller passes { dangerouslyAllowPrivateNetwork: false }.

For the trusted endpoint this is intentional — callers cannot accidentally weaken it — but the current type signature (WebToolEndpointFetchOptions now allows policy) gives no indication of this precedence rule. Consider adding a JSDoc note on the function to make the "hardcoded policy always overrides" contract explicit, so future callers understand that policy is additive (hostname allowlists survive) but cannot override the trust knobs.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/tools/web-guarded-fetch.ts
Line: 71-74

Comment:
**`withTrustedWebToolsEndpoint` silently discards caller-supplied restrictions**

The merge spreads `params.policy` first and then `WEB_TOOLS_TRUSTED_NETWORK_SSRF_POLICY` on top, so any property explicitly set in the hardcoded policy (`dangerouslyAllowPrivateNetwork: true`, `allowRfc2544BenchmarkRange: true`) will **always win**, even if a caller passes `{ dangerouslyAllowPrivateNetwork: false }`.

For the trusted endpoint this is intentional — callers cannot accidentally weaken it — but the current type signature (`WebToolEndpointFetchOptions` now allows `policy`) gives no indication of this precedence rule. Consider adding a JSDoc note on the function to make the "hardcoded policy always overrides" contract explicit, so future callers understand that `policy` is additive (hostname allowlists survive) but cannot override the trust knobs.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e3329c8. The sandbox branch now preserves messages.remoteMedia.ssrfPolicy, so remote http(s) media keeps the configured SSRF policy in both host and sandbox paths.

@code-sayuri code-sayuri force-pushed the fix/config-driven-ssrf-policy branch 2 times, most recently from 4f263f4 to e3329c8 Compare March 27, 2026 02:44
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e3329c854f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@@ -539,6 +549,7 @@ async function runWebFetch(params: WebFetchRuntimeParams): Promise<Record<string
url: params.url,
maxRedirects: params.maxRedirects,
timeoutSeconds: params.timeoutSeconds,
policy: params.ssrfPolicy,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Include SSRF policy in web_fetch cache key

runWebFetch now applies params.ssrfPolicy to guarded fetches, but the cache identity is still only url/extractMode/maxChars, so cache hits occur before SSRF checks. In a long-lived process where one tool instance uses a permissive tools.web.fetch.ssrfPolicy (or config is reloaded) and another uses strict settings, strict calls can receive permissively fetched cached content for the TTL window. Please include a stable SSRF-policy fingerprint in the cache key (or skip cache when policies differ).

Useful? React with 👍 / 👎.

@@ -221,6 +225,7 @@ function createGeminiToolDefinition(
apiKey,
model,
timeoutSeconds: resolveSearchTimeoutSeconds(searchConfig),
citationRedirectSsrFPolicy: searchConfig?.citationRedirect?.ssrfPolicy,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Add citation redirect policy to Gemini cache identity

This change introduces citationRedirectSsrFPolicy for redirect resolution, but the Gemini cache key still omits that policy and only keys on provider/query/count/model. If a permissive tools.web.search.citationRedirect.ssrfPolicy run populates cache, later stricter runs can reuse those cached citations without reapplying strict redirect checks. Add a normalized citation-redirect SSRF policy component to the cache key.

Useful? React with 👍 / 👎.

@code-sayuri code-sayuri force-pushed the fix/config-driven-ssrf-policy branch from e3329c8 to d84a762 Compare March 27, 2026 03:04
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d84a762a57

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 188 to +192
.strict()
.optional(),
suppressToolErrors: z.boolean().optional(),
remoteMedia: z
.object({
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Wire remoteMedia SSRF policy through all send paths

This commit introduces messages.remoteMedia.ssrfPolicy as a global outbound-media setting, but only a subset of senders actually forward it; other production paths still fetch media without that policy (for example extensions/slack/src/send.ts and extensions/telegram/src/send.ts call loadWebMedia without an SSRF override). In deployments that set messages.remoteMedia.ssrfPolicy to allow RFC2544/private ranges for legitimate hosts, those channels will still fail because the config is silently ignored, creating channel-dependent behavior that is hard to diagnose.

Useful? React with 👍 / 👎.

@code-sayuri code-sayuri force-pushed the fix/config-driven-ssrf-policy branch 2 times, most recently from b379e1d to 9233351 Compare March 27, 2026 03:55
@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Apr 30, 2026

Codex review: found issues before merge.

Summary
The PR adds opt-in SSRF policy config/schema/runtime plumbing for web_fetch, Gemini citation redirects, remote outbound media, and legacy web-search normalization.

Reproducibility: yes. The reproduction path is code-based: permissive SSRF policy can populate web_fetch or Gemini search caches, then stricter calls can read policy-agnostic cached data before the guard reruns; remoteMedia inconsistency is visible from send paths that omit the policy option.

Next step before merge
Security-sensitive SSRF config semantics and stale current-main overlap need maintainer review before a repair lane should rewrite the PR.

Security
Needs attention: The diff changes SSRF controls and still has concrete cache-isolation and partial-enforcement concerns.

Review findings

  • [P1] Include SSRF policy in the web_fetch cache key — src/agents/tools/web-fetch.ts:549
  • [P2] Add citation redirect policy to Gemini cache identity — extensions/google/src/gemini-web-search-provider.ts:225
  • [P2] Wire remoteMedia policy through every media sender — src/config/zod-schema.session.ts:188-193
Review details

Best possible solution:

Rebase onto current main, keep the cache-safe web_fetch implementation, and land the citationRedirect and remoteMedia policy surfaces only with policy-aware cache identity plus complete sender coverage.

Do we have a high-confidence way to reproduce the issue?

Yes. The reproduction path is code-based: permissive SSRF policy can populate web_fetch or Gemini search caches, then stricter calls can read policy-agnostic cached data before the guard reruns; remoteMedia inconsistency is visible from send paths that omit the policy option.

Is this the best way to solve the issue?

No. The direction is reasonable, but this PR is not the best current form until it preserves current main's cache isolation, adds citation redirect policy to Gemini cache identity, and completes remote-media routing.

Full review comments:

  • [P1] Include SSRF policy in the web_fetch cache key — src/agents/tools/web-fetch.ts:549
    The branch forwards params.ssrfPolicy into guarded fetches, but cache hits are still keyed without that policy. A permissive run can populate the cache and a later stricter run can return cached content before the SSRF guard runs, so the cache key needs a stable policy component or policy-sensitive bypass.
    Confidence: 0.93
  • [P2] Add citation redirect policy to Gemini cache identity — extensions/google/src/gemini-web-search-provider.ts:225
    The branch threads citationRedirectSsrFPolicy into redirect resolution, but Gemini search caching still ignores that policy. Resolved citations from a permissive run can be reused by stricter later calls without reapplying redirect SSRF checks.
    Confidence: 0.88
  • [P2] Wire remoteMedia policy through every media sender — src/config/zod-schema.session.ts:188-193
    The new messages.remoteMedia.ssrfPolicy setting is global, but not every outbound media sender is wired to pass it into media loading. Channels that omit the option will still enforce default SSRF behavior, creating channel-dependent config behavior.
    Confidence: 0.86

Overall correctness: patch is incorrect
Overall confidence: 0.89

Security concerns:

  • [high] Policy-agnostic caches can bypass stricter SSRF settings — src/agents/tools/web-fetch.ts:549
    The branch applies permissive SSRF policy during web_fetch and Gemini citation redirect resolution without including that policy in the relevant cache identities, allowing later stricter calls to receive previously cached results.
    Confidence: 0.92
  • [medium] Global remote media SSRF config is partially enforced — src/config/zod-schema.session.ts:188
    Accepting messages.remoteMedia.ssrfPolicy while leaving some channel media paths unwired creates a misleading security control whose behavior depends on the channel implementation.
    Confidence: 0.86

What I checked:

Likely related people:

  • steipete: Recent GitHub commit history shows repeated work on Gemini web search, config schemas, outbound media, and channel send paths affected by this PR. (role: recent maintainer and adjacent owner; confidence: high; commits: 66336bf7c846, 1143f7384233, 43ef8a5a8657; files: src/agents/tools/web-fetch.ts, extensions/google/src/gemini-web-search-provider.runtime.ts, src/config/zod-schema.agent-runtime.ts)
  • vincentkoc: Recent history includes web_fetch cache/provider maintenance and config work near the current implementation. (role: recent maintainer; confidence: medium; commits: dade5f913342, fbf9132b32da, 69b66dd5483e; files: src/agents/tools/web-fetch.ts, src/config/zod-schema.session.ts)
  • jeffrey701: Authored the current main SSRF fake-IP opt-in work that added policy-sensitive web_fetch behavior and tests. (role: introduced adjacent behavior; confidence: medium; commits: 9b6670d5c924; files: src/agents/tools/web-fetch.ts, src/config/zod-schema.agent-runtime.ts)

Remaining risk / open question:

  • The PR is stale relative to current main's narrower web_fetch SSRF implementation, so a rebase may need conflict resolution and security-policy review.
  • The proposed remoteMedia config is global, but the branch does not visibly prove every outbound media sender honors it consistently.

Codex review notes: model gpt-5.5, reasoning high; reviewed against a4f2bf273a21.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling channel: mattermost Channel integration: mattermost channel: msteams Channel integration: msteams channel: signal Channel integration: signal channel: zalouser Channel integration: zalouser docs Improvements or additions to documentation size: L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant