🔨 chore(google): Support External URL file input with SSRF validation to optimize transmission#11507

Closed
sxjeru wants to merge 11 commits into
lobehub:mainfrom
sxjeru:115

Conversation

@sxjeru

@sxjeru sxjeru commented Jan 15, 2026

Contributor

💻 Change Type

  • ✨ feat
  • 🐛 fix
  • ♻️ refactor
  • 💄 style
  • 👷 build
  • ⚡️ perf
  • ✅ test
  • 📝 docs
  • 🔨 chore

🔗 Related Issue

🔀 Description of Change

https://ai.google.dev/gemini-api/docs/file-input-methods#external-urls

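Adds support for the External URL file input method, which is a good fit for LobeChat deployments that currently use S3.
Images and PDF files sent to Gemini no longer need to be converted to base64 for transmission, which reduces server egress traffic.

In testing so far only Gemini 3 works, although the documentation says Gemini 2.5 is also supported. A model-name restriction has been added for now and can be revisited later.

Also raises the video limit to 100 MB.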
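To put the egress saving in rough numbers: base64 encodes every 3 bytes as 4 characters, so an inline payload is about a third larger than the file itself, whereas an External URL part carries only the URL string. A minimal sketch of that arithmetic follows; the function name is illustrative and not taken from the PR.

```typescript
// Number of characters in the padded base64 encoding of `byteLength` bytes.
// Every group of up to 3 input bytes becomes exactly 4 output characters.
function base64EncodedLength(byteLength: number): number {
  return Math.ceil(byteLength / 3) * 4;
}

const fileBytes = 10 * 1024 * 1024; // a 10 MiB image
const inlineChars = base64EncodedLength(fileBytes);

// Ratio of inline payload size to original file size (about 1.33).
console.log((inlineChars / fileBytes).toFixed(2));
```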

🧪 How to Test

  • Tested locally
  • Added/updated tests
  • No tests needed

📸 Screenshots / Videos

Before After
... ...

📝 Additional Information

Summary by Sourcery

Support Google Gemini External URL file input with SSRF-safe validation and graceful fallback to inline data.

Enhancements:

  • Add SSRF-safe validation utilities for External URL file inputs, including MIME type and size checks aligned with Google Gemini limits.
  • Update Google Gemini context builder to prefer External URLs for public image and video URLs, falling back to base64 inline data when validation fails or URLs are non-public.

@vercel

vercel Bot commented Jan 15, 2026


@sxjeru is attempting to deploy a commit to the LobeHub OSS Team on Vercel.

A member of the Team first needs to authorize it.

@dosubot dosubot Bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jan 15, 2026
@sourcery-ai

sourcery-ai Bot commented Jan 15, 2026

Contributor

Reviewer's Guide

Adds support for using Google Gemini’s External URL file input with SSRF-safe validation, preferring direct file URLs over base64 inline data while enforcing Google’s content-type and size limits and providing fallbacks.

Sequence diagram for building Google part from image URL with external URL validation

sequenceDiagram
  actor User
  participant GoogleContextBuilderModule as GoogleContextBuilder
  participant UriParserModule as UriParser
  participant SsrfSafeFetchModule as SsrfSafeFetch
  participant GoogleApi as GoogleAPI

  User->>GoogleContextBuilder: send message with image URL
  GoogleContextBuilder->>GoogleContextBuilder: detect type url in image_url
  GoogleContextBuilder->>GoogleContextBuilder: url = content.image_url.url

  GoogleContextBuilder->>UriParser: isPublicExternalUrl(url)
  alt isPublicExternalUrl returns true
    GoogleContextBuilder->>UriParser: validateExternalUrl(url)
    UriParser->>SsrfSafeFetch: ssrfSafeFetch(url, HEAD options)
    SsrfSafeFetch-->>UriParser: HTTP response with headers
    UriParser-->>GoogleContextBuilder: ExternalUrlValidation

    alt validation.isValid is true
      GoogleContextBuilder->>GoogleAPI: send part with fileData.fileUri = url and mimeType = validation.contentType
    else validation.isValid is false
      GoogleContextBuilder->>UriParser: imageUrlToBase64(url)
      UriParser-->>GoogleContextBuilder: base64, mimeType
      GoogleContextBuilder->>GoogleAPI: send part with inlineData.data = base64 and inlineData.mimeType = mimeType
    end
  else isPublicExternalUrl returns false
    GoogleContextBuilder->>UriParser: imageUrlToBase64(url)
    UriParser-->>GoogleContextBuilder: base64, mimeType
    GoogleContextBuilder->>GoogleAPI: send part with inlineData.data = base64 and inlineData.mimeType = mimeType
  end
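The branching in the sequence diagram can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the `Deps` bundle and the `buildImagePart` name are hypothetical (the helpers are injected so the flow runs without network access), `reason` is treated as optional, and the `thoughtSignature` field that appears in the review suggestion is omitted for brevity.

```typescript
// Part shapes mirror the fields named in the diagram above.
type GooglePart =
  | { fileData: { fileUri: string; mimeType: string } }
  | { inlineData: { data: string; mimeType: string } };

interface ExternalUrlValidation {
  contentLength: number;
  contentType: string;
  isValid: boolean;
  reason?: string; // assumption: only set when validation fails
}

// Hypothetical dependency bundle standing in for the real helper modules.
interface Deps {
  isPublicExternalUrl(url: string): boolean;
  validateExternalUrl(url: string): Promise<ExternalUrlValidation>;
  imageUrlToBase64(url: string): Promise<{ base64: string; mimeType: string }>;
}

async function buildImagePart(url: string, deps: Deps): Promise<GooglePart> {
  // Prefer the External URL path when the URL is public and passes validation.
  if (deps.isPublicExternalUrl(url)) {
    const validation = await deps.validateExternalUrl(url);
    if (validation.isValid) {
      return { fileData: { fileUri: url, mimeType: validation.contentType } };
    }
  }
  // Non-public URL or failed validation: fall back to base64 inline data.
  const { base64, mimeType } = await deps.imageUrlToBase64(url);
  return { inlineData: { data: base64, mimeType } };
}
```

Both fallback branches of the diagram collapse into the single base64 path at the end, which keeps the inline conversion in one place.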

Class diagram for Google external URL validation and context building

classDiagram
  class UriParserModule {
    +parseDataUri(dataUri: string): UriParserResult
    +isPublicExternalUrl(url: string): boolean
    +validateExternalUrl(url: string): Promise<ExternalUrlValidation>
    +MAX_EXTERNAL_URL_SIZE: number
    +MAX_INLINE_DATA_SIZE: number
    +MAX_INLINE_PDF_SIZE: number
  }

  class ExternalUrlValidation {
    +contentLength: number
    +contentType: string
    +isValid: boolean
    +reason: string
  }

  class GoogleContextBuilderModule {
    +buildGooglePart(content: UserMessageContentPart, runtime: any, tools: any, provider: any): Promise<GooglePart>
  }

  class SsrfSafeFetchModule {
    +ssrfSafeFetch(url: string, options: SsrfSafeFetchOptions): Promise<Response>
  }

  UriParserModule ..> ExternalUrlValidation
  GoogleContextBuilderModule ..> UriParserModule
  UriParserModule ..> SsrfSafeFetchModule
  GoogleContextBuilderModule ..> SsrfSafeFetchModule
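As described in this guide and in the review feedback, `isPublicExternalUrl` checks only the URL scheme. A minimal sketch of a helper with that behavior, assuming the standard `URL` parser is used (the actual implementation in the PR may differ):

```typescript
// Returns true only for syntactically valid http/https URLs.
// Note: this does not look at the host, so a URL such as
// http://localhost/ passes this check.
function isPublicExternalUrl(url: string): boolean {
  try {
    const { protocol } = new URL(url);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    // new URL() throws on input that is not an absolute URL.
    return false;
  }
}

console.log(isPublicExternalUrl('https://example.com/cat.png'));
console.log(isPublicExternalUrl('data:image/png;base64,AAAA'));
```

Data URIs and relative paths are rejected here and therefore go through the inline base64 path.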

File-Level Changes

Change: Introduce shared utilities to validate external URLs against Google Gemini’s supported MIME types and file size limits using SSRF-safe HTTP requests.
Details:
  • Import ssrfSafeFetch and define Google-supported MIME type whitelist for External URL usage.
  • Define and export maximum size constants for inline data and PDFs and use a separate limit for external URLs.
  • Add ExternalUrlValidation interface and an isPublicExternalUrl helper that restricts URLs to http/https schemes.
  • Implement validateExternalUrl to issue a HEAD request via ssrfSafeFetch, read Content-Type/Content-Length headers, enforce type/size constraints, and return structured validation results including failure reasons.
Files: packages/model-runtime/src/utils/uriParser.ts

Change: Update Google Gemini context builder to prefer External URL (fileData.fileUri) for eligible image and video URLs, with SSRF-safe validation and base64 fallback.
Details:
  • Augment buildGooglePart with a TODO note about urlContext’s 34MB limit and potential future use of External URLs for larger files.
  • For image_url parts, first check if the URL is a public http/https URL, then validate it via validateExternalUrl and, if valid, build a fileData/fileUri part; otherwise fall back to SSRF-protected base64 conversion.
  • For video_url parts, apply the same public-URL and validation path (even though Google docs don’t yet list video support) and fall back to base64 conversion using imageUrlToBase64 for non-public or invalid URLs.
  • Adjust variable naming in image/video handlers to distinguish raw URL from base64-encoded data and associated MIME type.
Files: packages/model-runtime/src/core/contextBuilders/google.ts
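The HEAD-based validation described above might look roughly like the sketch below. Several things here are assumptions: the fetcher is injected as a stand-in for `ssrfSafeFetch` (whose exact signature is not shown in this PR page), the MIME set is a small illustrative subset rather than the real `GOOGLE_EXTERNAL_URL_SUPPORTED_TYPES`, and the 100 MB limit is written as 100 * 1024 * 1024 bytes. The header parsing follows the excerpt quoted later in the review.

```typescript
interface ExternalUrlValidation {
  contentLength: number;
  contentType: string;
  isValid: boolean;
  reason?: string;
}

// Minimal response shape; a real fetch Response satisfies it.
interface HeadResponse {
  ok: boolean;
  status: number;
  headers: { get(name: string): string | null };
}
type HeadFetcher = (url: string) => Promise<HeadResponse>;

// Assumption: illustrative subset, not the PR's actual whitelist.
const ASSUMED_SUPPORTED_TYPES = new Set([
  'image/png', 'image/jpeg', 'image/webp', 'application/pdf', 'text/plain',
]);
// Assumption: 100 MB expressed in bytes.
const ASSUMED_MAX_EXTERNAL_URL_SIZE = 100 * 1024 * 1024;

async function validateExternalUrl(
  url: string,
  headFetch: HeadFetcher,
): Promise<ExternalUrlValidation> {
  let res: HeadResponse;
  try {
    res = await headFetch(url);
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { contentLength: 0, contentType: '', isValid: false, reason: `Request failed: ${message}` };
  }
  if (!res.ok) {
    return { contentLength: 0, contentType: '', isValid: false, reason: `HTTP ${res.status}` };
  }

  // As in the quoted excerpt, a missing Content-Length is parsed as 0 here.
  const contentLength = Number.parseInt(res.headers.get('content-length') || '0', 10);
  // Drop parameters such as "; charset=utf-8" before comparing.
  const contentType = (res.headers.get('content-type') || '').split(';')[0].trim().toLowerCase();

  if (!ASSUMED_SUPPORTED_TYPES.has(contentType)) {
    return { contentLength, contentType, isValid: false, reason: `Unsupported content type: ${contentType}` };
  }
  if (contentLength > ASSUMED_MAX_EXTERNAL_URL_SIZE) {
    return { contentLength, contentType, isValid: false, reason: `File too large: ${contentLength} bytes` };
  }
  return { contentLength, contentType, isValid: true };
}
```

Returning a structured result instead of throwing lets the caller fall back to inline data without a try/catch around every call.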


@gru-agent

gru-agent Bot commented Jan 15, 2026

Contributor

TestGru Assignment

Summary

Link CommitId Status Reason
Detail 5d8b1ee55c8dacd880a8a06ee7ec5a8e8ed1e4e3 ✅ Finished

History Assignment

Files

File Pull Request
packages/model-runtime/src/utils/uriParser.ts ❌ Failed (I failed to setup the environment.)
packages/model-runtime/src/core/contextBuilders/google.ts ❌ Failed (I failed to setup the environment.)

Tip

You can @gru-agent and leave your feedback. TestGru will make adjustments based on your input

@dosubot dosubot Bot added the Model Provider Model provider related label Jan 15, 2026

@sourcery-ai sourcery-ai Bot left a comment

Contributor


Hey - I've found 2 issues, and left some high level feedback:

  • The isPublicExternalUrl helper currently only checks the protocol; consider explicitly rejecting localhost, private-network, or non-DNS hosts so the name better matches its behavior and avoids relying solely on ssrfSafeFetch.
  • Using HEAD in validateExternalUrl may fail on servers that don’t implement it correctly; you might want to fall back to a GET with method: 'GET', signal and redirect: 'manual' or similar if the initial HEAD request returns 405/501.
  • The Google-specific constraints (GOOGLE_EXTERNAL_URL_SUPPORTED_TYPES, MAX_* limits) are now embedded in uriParser.ts; consider moving them to a separate Gemini/Google-specific config module to keep URI parsing utilities more generic.
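The first point above could be approached with a static host check along these lines. This is a sketch only: the function names are hypothetical, the range list is not exhaustive (for example, IPv4-mapped IPv6 addresses are not handled), and a string check like this cannot catch a public hostname that resolves to a private address, so it would complement `ssrfSafeFetch` rather than replace it.

```typescript
// Heuristic check on a URL hostname: true for localhost names and for
// IP literals in loopback, link-local, or private ranges.
function isLikelyPrivateHost(hostname: string): boolean {
  // URL.hostname keeps the brackets around IPv6 literals; strip them.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, '');
  if (host === 'localhost' || host.endsWith('.localhost')) return true;

  const v4 = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (v4) {
    const a = Number(v4[1]);
    const b = Number(v4[2]);
    return (
      a === 0 || a === 10 || a === 127 || // 0/8, 10/8, 127/8
      (a === 169 && b === 254) ||         // 169.254/16 link-local
      (a === 172 && b >= 16 && b <= 31) || // 172.16/12
      (a === 192 && b === 168)            // 192.168/16
    );
  }

  if (host.includes(':')) {
    // IPv6 literal: unspecified, loopback, fc00::/7, fe80::/10.
    return host === '::' || host === '::1' || /^f[cd]/.test(host) || /^fe[89ab]/.test(host);
  }
  return false;
}

// Scheme check plus the host heuristic above.
function isPublicHttpUrl(url: string): boolean {
  try {
    const parsed = new URL(url);
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') return false;
    return !isLikelyPrivateHost(parsed.hostname);
  } catch {
    return false;
  }
}
```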
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `isPublicExternalUrl` helper currently only checks the protocol; consider explicitly rejecting localhost, private-network, or non-DNS hosts so the name better matches its behavior and avoids relying solely on `ssrfSafeFetch`.
- Using `HEAD` in `validateExternalUrl` may fail on servers that don’t implement it correctly; you might want to fall back to a `GET` with `method: 'GET', signal` and `redirect: 'manual'` or similar if the initial `HEAD` request returns 405/501.
- The Google-specific constraints (`GOOGLE_EXTERNAL_URL_SUPPORTED_TYPES`, `MAX_*` limits) are now embedded in `uriParser.ts`; consider moving them to a separate Gemini/Google-specific config module to keep URI parsing utilities more generic.
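The HEAD-to-GET fallback suggested above can be sketched as follows. The `Requester` type and `probeHeaders` name are hypothetical; the requester is injected in place of `ssrfSafeFetch`, whose option names are not shown here. The GET is aborted once the response headers are available so the body is not downloaded.

```typescript
interface ProbeResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Hypothetical request function; a thin wrapper around fetch would fit.
type Requester = (
  url: string,
  init: { method: 'HEAD' | 'GET'; signal?: AbortSignal },
) => Promise<ProbeResponse>;

async function probeHeaders(url: string, request: Requester): Promise<ProbeResponse> {
  const head = await request(url, { method: 'HEAD' });
  // 405 Method Not Allowed / 501 Not Implemented: the server rejects HEAD.
  if (head.status !== 405 && head.status !== 501) return head;

  const controller = new AbortController();
  try {
    // With fetch-like APIs the promise resolves once headers arrive.
    return await request(url, { method: 'GET', signal: controller.signal });
  } finally {
    // Cancel the transfer so the response body is not fetched.
    controller.abort();
  }
}
```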

## Individual Comments

### Comment 1
<location> `packages/model-runtime/src/utils/uriParser.ts:121-130` </location>
<code_context>
+      };
+    }
+
+    const contentLength = Number.parseInt(res.headers.get('content-length') || '0', 10);
+    const contentType = (res.headers.get('content-type') || '').split(';')[0].trim().toLowerCase();
+
+    // Check MIME type support
+    if (!GOOGLE_EXTERNAL_URL_SUPPORTED_TYPES.has(contentType)) {
+      return {
+        contentLength,
+        contentType,
+        isValid: false,
+        reason: `Unsupported content type: ${contentType}`,
+      };
+    }
+
+    // Check file size - External URLs support 100MB for all file types
+    // (Unlike inline data where PDFs are limited to 50MB)
+    if (contentLength > MAX_EXTERNAL_URL_SIZE) {
+      return {
+        contentLength,
</code_context>

<issue_to_address>
**issue (bug_risk):** Treat missing or non-numeric Content-Length more explicitly to avoid silently bypassing size checks.

Because `content-length` defaults to `0` when missing or non-numeric, the size limit check is effectively bypassed and very large files may be accepted. Consider either rejecting URLs with missing/invalid `content-length`, or returning a distinct `reason` indicating the size couldn’t be determined instead of treating it as `0`.
</issue_to_address>
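One way to act on this comment is to parse the header strictly and report an explicit reason when the size is unknown. A minimal sketch, with a hypothetical helper name and result type:

```typescript
type SizeCheck =
  | { ok: true; contentLength: number }
  | { ok: false; reason: string };

// Rejects a missing or malformed Content-Length instead of treating it as 0.
function checkContentLength(header: string | null, maxBytes: number): SizeCheck {
  const value = header === null ? '' : header.trim();
  // Only plain decimal digits are accepted; this rules out '', 'abc', '-1', '1e9'.
  if (!/^\d+$/.test(value)) {
    return { ok: false, reason: 'Content-Length missing or invalid; size could not be determined' };
  }
  const contentLength = Number(value);
  if (!Number.isSafeInteger(contentLength)) {
    return { ok: false, reason: 'Content-Length is too large to represent' };
  }
  if (contentLength > maxBytes) {
    return { ok: false, reason: `File too large: ${contentLength} bytes exceeds ${maxBytes}` };
  }
  return { ok: true, contentLength };
}
```

Whether an unknown size should be rejected outright or only flagged is a policy choice; this sketch rejects it so the size limit cannot be bypassed silently.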

### Comment 2
<location> `packages/model-runtime/src/core/contextBuilders/google.ts:60-74` </location>
<code_context>
+
+        // Try to use External URL feature for public URLs to avoid re-uploading
+        // This allows Google to fetch the file directly, reducing transfer costs
+        if (isPublicExternalUrl(url)) {
+          const validation = await validateExternalUrl(url);
+          if (validation.isValid) {
+            return {
+              fileData: {
</code_context>

<issue_to_address>
**suggestion (performance):** Calling `validateExternalUrl` for videos currently always fails due to the supported MIME set, adding extra latency.

For `video_url`, `validateExternalUrl` checks `GOOGLE_EXTERNAL_URL_SUPPORTED_TYPES`, which only includes text/image/pdf, so all video URLs will always fail validation and fall back to base64 after an extra HEAD request. To avoid this unnecessary round-trip, consider either gating the validation behind a flag/config for videos, skipping validation for videos until Google supports them, or updating the supported types to include `video/*` once that support is available.

```suggestion
        // Try to use External URL feature for public URLs to avoid re-uploading.
        // This allows Google to fetch the file directly, reducing transfer costs.
        // NOTE: We only validate external URLs for images; videos are not yet supported
        // by GOOGLE_EXTERNAL_URL_SUPPORTED_TYPES and would always fail validation,
        // causing an unnecessary HEAD request before falling back to base64.
        if (content.type === 'image_url' && isPublicExternalUrl(url)) {
          const validation = await validateExternalUrl(url);
          if (validation.isValid) {
            return {
              fileData: {
                fileUri: url,
                mimeType: validation.contentType,
              },
              thoughtSignature: GEMINI_MAGIC_THOUGHT_SIGNATURE,
            };
          }
          // If validation fails, fall back to base64 conversion
        }
```
</issue_to_address>


Comment thread packages/model-runtime/src/utils/uriParser.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5d8b1ee55c


Comment thread packages/model-runtime/src/core/contextBuilders/google.ts Outdated
@codecov

codecov Bot commented Jan 15, 2026


Codecov Report

❌ Patch coverage is 41.72185% with 88 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.78%. Comparing base (26ef2ff) to head (1579942).
⚠️ Report is 5 commits behind head on next.

Additional details and impacted files
@@            Coverage Diff             @@
##             next   #11507      +/-   ##
==========================================
- Coverage   75.88%   75.78%   -0.10%     
==========================================
  Files        1153     1153              
  Lines       90106    90298     +192     
  Branches    10140    10583     +443     
==========================================
+ Hits        68376    68436      +60     
- Misses      21640    21771     +131     
- Partials       90       91       +1     
Flag Coverage Δ
app 69.45% <66.66%> (-0.06%) ⬇️
database 93.55% <ø> (ø)
packages/agent-runtime 89.08% <ø> (ø)
packages/context-engine 82.52% <ø> (ø)
packages/conversation-flow 92.41% <ø> (ø)
packages/file-loaders 88.66% <ø> (ø)
packages/memory-user-memory 70.07% <ø> (ø)
packages/model-bank 100.00% <ø> (ø)
packages/model-runtime 86.08% <41.21%> (-0.51%) ⬇️
packages/prompts 77.43% <ø> (ø)
packages/python-interpreter 92.90% <ø> (ø)
packages/ssrf-safe-fetch 0.00% <ø> (ø)
packages/utils 93.25% <ø> (ø)
packages/web-crawler 95.62% <ø> (ø)


Components Coverage Δ
Store 67.92% <ø> (-0.18%) ⬇️
Services 50.93% <66.66%> (+0.01%) ⬆️
Server 73.61% <ø> (ø)
Libs 40.75% <ø> (+0.09%) ⬆️
Utils 93.51% <ø> (ø)

@sxjeru

sxjeru commented Jan 15, 2026

Contributor Author

After replacing inlineData with fileUri, the request completes normally.

image

This comment was translated by Claude.


Copilot AI review requested due to automatic review settings January 29, 2026 09:13

Copilot AI left a comment

Contributor


Pull request overview

This pull request adds support for Google Gemini's External URL file input feature with SSRF-safe validation, increases the video file size limit from 20MB to 100MB, and introduces support for codexMaxReasoningEffort parameter for GPT-5.1 Codex Max and GPT-5.2 Codex models.

Changes:

  • Added SSRF-safe validation utilities for External URL file inputs with MIME type and size checks aligned with Google Gemini API limits
  • Updated Google Gemini context builder to prefer External URLs for public image/video URLs on Gemini 3+, with graceful fallback to base64 inline data
  • Increased video file size limit from 20MB to 100MB across validation logic, localization, and tests
  • Added codexMaxReasoningEffort parameter support for GPT-5.1 Codex Max and GPT-5.2 Codex models with UI controls and configuration

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
packages/model-runtime/src/utils/uriParser.ts Adds SSRF-safe external URL validation with MIME type and size limit checks for Google Gemini
packages/model-runtime/src/core/contextBuilders/google.ts Implements External URL feature for images/videos with model version detection and fallback to inline data
packages/model-runtime/src/core/contextBuilders/google.test.ts Adds comprehensive test coverage for external URL validation and fallback scenarios
packages/model-runtime/src/providers/google/index.ts Passes model parameter to context builders to enable model-specific URL handling
packages/utils/src/client/videoValidation.ts Updates video size limit constant from 20MB to 100MB
packages/utils/src/client/videoValidation.test.ts Updates test values to reflect new 100MB limit
packages/types/src/agent/chatConfig.ts Adds codexMaxReasoningEffort type definitions and schema
src/services/chat/mecha/modelParamsResolver.ts Implements codexMaxReasoningEffort parameter resolution
src/features/ChatInput/ActionBar/Model/CodexMaxReasoningEffortSlider.tsx Adds slider component for codexMaxReasoningEffort control
src/features/ChatInput/ActionBar/Model/ControlsForm.tsx Integrates CodexMaxReasoningEffortSlider into control form
src/app/[variants]/(main)/settings/provider/features/ModelList/CreateNewModelModal/ExtendParamsSelect.tsx Adds codexMaxReasoningEffort to extended parameters options
packages/model-runtime/src/const/models.ts Adds gpt-5.1-codex-max and gpt-5.2-codex to responses API models list
packages/model-bank/src/types/aiModel.ts Adds codexMaxReasoningEffort to ExtendParamsType
packages/model-bank/src/aiModels/openai.ts Adds GPT-5.1 Codex Max and GPT-5.2 Codex model configurations
locales/*/chat.json Updates video size limit error messages from 20MB to 100MB
locales/*/modelProvider.json Adds translations for codexMaxReasoningEffort hint
src/locales/default/chat.ts Updates video size limit message from 20MB to 100MB
src/locales/default/modelProvider.ts Adds hint text for codexMaxReasoningEffort option


Comment thread packages/model-runtime/src/core/contextBuilders/google.test.ts
Comment thread packages/model-runtime/src/utils/uriParser.ts
Comment thread packages/model-runtime/src/utils/uriParser.ts Outdated

Copilot AI left a comment

Contributor


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@lobehubbot

Member

Closing: this PR has merge conflicts with main and is outdated. A new test PR may be created for this module.

@lobehubbot lobehubbot closed this Feb 9, 2026
@gru-agent

gru-agent Bot commented Feb 9, 2026

Contributor

⏳ Processing in progress


Labels

Model Provider Model provider related size:L This PR changes 100-499 lines, ignoring generated files.


3 participants