feat: enable enterprise web search for gemini models #1526
yuzisun merged 12 commits into envoyproxy:main
Codecov Report
❌ Your patch status has failed because the patch coverage (73.33%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1526      +/-   ##
==========================================
- Coverage   83.26%   83.24%   -0.03%
==========================================
  Files         137      137
  Lines       12351    12364      +13
==========================================
+ Hits        10284    10292       +8
- Misses       1440     1445       +5
  Partials      627      627
/retest
#1554)
**Description**
Some new features were introduced in Gemini 3:
**1 thinking_level:** https://ai.google.dev/gemini-api/docs/gemini-3?thinking=low#thinking_level
This is similar to OpenAI's reasoning_effort, so the two are unified.
**2 media_resolution:** https://ai.google.dev/gemini-api/docs/gemini-3?thinking=low#media_resolution
This is similar to OpenAI's detail, so the two are unified. The difference is that OpenAI does not provide a global media_resolution config, so it is added as a GCP-specific field that still uses the name detail for consistency.
**Some related PRs:**
thinking_budget is in #1461
thinking_level and thinking_budget are both supported, but they cannot be used together.
Other features under review:
**1 web search:** #1526
**2 parse the thought summary:** #1521
---------
Signed-off-by: yxia216 <yxia216@bloomberg.net>
Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net>
Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>
Co-authored-by: Alexa Griffith <agriffith50@bloomberg.net>
Co-authored-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>
Co-authored-by: Ignasi Barrera <ignasi@tetrate.io>
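The constraint that thinking_level and thinking_budget cannot be used together can be illustrated with a minimal sketch. The `thinkingConfig` type, its field names, and `validateThinking` are hypothetical and exist only for this example; they are not taken from the gateway's code, and the real check may live elsewhere or behave differently.

```go
package main

import (
	"errors"
	"fmt"
)

// thinkingConfig is a hypothetical holder for the two settings named in the
// commit message. It is an illustration, not the gateway's actual schema.
type thinkingConfig struct {
	Level  string // e.g. "low"; empty means unset
	Budget *int   // nil means unset
}

// errThinkingConflict is returned when both settings are present.
var errThinkingConflict = errors.New("thinking_level and thinking_budget cannot be set together")

// validateThinking rejects a config that sets both knobs at once.
func validateThinking(c thinkingConfig) error {
	if c.Level != "" && c.Budget != nil {
		return errThinkingConflict
	}
	return nil
}

func main() {
	budget := 1024
	fmt.Println(validateThinking(thinkingConfig{Level: "low"}))
	fmt.Println(validateThinking(thinkingConfig{Level: "low", Budget: &budget}))
}
```

A pointer is used for the budget so that an explicit value of 0 can still be told apart from an unset field.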
This needs a documentation update: https://aigateway.envoyproxy.io/docs/capabilities/llm-integrations/vendor-specific-fields (could you also backfill the docs for any other recently added fields that lack them?)
internal/apischema/openai/openai.go
Outdated
// EnterpriseWebSearch controls whether to use Web Grounding for Enterprise
// https://docs.cloud.google.com/vertex-ai/generative-ai/docs/grounding/web-grounding-enterprise
EnterpriseWebSearch bool `json:"enterprise_search,omitzero"`
There is a list of supported tools; I think we should define a list here:
https://github.com/googleapis/go-genai/blob/6a8184fcaf8bf15f0c566616a7b356560309be9b/types.go#L3934
@yuzisun do you mean this list: https://github.com/googleapis/go-genai/blob/6a8184fcaf8bf15f0c566616a7b356560309be9b/types.go#L1406 ? Do we have plans to support them?
See how LiteLLM supports the web search tool. I think we can extend the OpenAI tool type?
https://docs.litellm.ai/docs/providers/vertex#enterprise-web-search
// Every constant carries the ToolType type so it can be assigned to Tool.Type.
const (
	ToolTypeFunction            ToolType = "function"
	ToolTypeImageGeneration     ToolType = "image_generation"
	ToolTypeEnterpriseWebSearch ToolType = "enterprise_search"
)

type Tool struct {
	Type     ToolType            `json:"type"`
	Function *FunctionDefinition `json:"function,omitempty"`
}
I am not against it, but this is another example of the question of whether we need to keep these "vendor-specific" or make the interface more "unified". I will follow this approach if others have no concerns about it.
I updated the PR to follow your code rather than LiteLLM's. Users pass tools=[{"type": "enterprise_search"}]. I feel this is more natural.
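As a rough illustration of the request shape discussed in this thread, the sketch below decodes a chat request body and checks whether its tools list contains an entry of type enterprise_search. The `Tool` struct is a trimmed-down stand-in for the one in the review comment, `hasEnterpriseSearch` is a hypothetical helper, and the model name is a placeholder; none of this is taken from the merged code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolType mirrors the string-typed discriminator from the review comment.
type ToolType string

const (
	ToolTypeFunction            ToolType = "function"
	ToolTypeEnterpriseWebSearch ToolType = "enterprise_search"
)

// Tool is a simplified stand-in: the function definition is kept as raw JSON
// so the example does not need the full FunctionDefinition type.
type Tool struct {
	Type     ToolType        `json:"type"`
	Function json.RawMessage `json:"function,omitempty"`
}

// hasEnterpriseSearch (hypothetical helper) reports whether the request body
// lists a tool whose type is "enterprise_search".
func hasEnterpriseSearch(body []byte) (bool, error) {
	var req struct {
		Tools []Tool `json:"tools"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return false, err
	}
	for _, t := range req.Tools {
		if t.Type == ToolTypeEnterpriseWebSearch {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// The model name is a placeholder chosen for the example.
	body := []byte(`{"model":"gemini-2.5-flash","tools":[{"type":"enterprise_search"}]}`)
	ok, err := hasEnterpriseSearch(body)
	fmt.Println(ok, err)
}
```

Because the tool carries no extra payload, a bare `{"type": "enterprise_search"}` entry is enough on the client side, which is the ergonomics argument made in the comment above.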
Signed-off-by: yxia216 <yxia216@bloomberg.net>
@mathetake Thanks a lot for your comment! @yuzisun started PR #1590, so I left a comment about the recent changes under that PR.
Let's add the docs here to keep this PR self-contained, and rebase once that PR is merged.
We will open a separate PR for the docs; in general, we are missing all the docs for tool calls.
// https://cloud.google.com/vertex-ai/generative-ai/docs/reference/rest/v1/GenerateContentResponse#SafetyRating
SafetyRatings []*genai.SafetyRating `json:"safety_ratings,omitempty"`

// GroundingMetadata specifies sources used to ground generated content.
Can you add a reference link?
Thanks a lot! Added a link.
Signed-off-by: yxia216 <yxia216@bloomberg.net>
**Description**
This is to enable feature `Web Grounding for Enterprise`: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/grounding/web-grounding-enterprise
**Related Issues/PRs (if applicable)**
envoyproxy#1417
---------
Signed-off-by: yxia216 <yxia216@bloomberg.net>
Co-authored-by: Dan Sun <dsun20@bloomberg.net>
Signed-off-by: Erica Hughberg <erica.sundberg.90@gmail.com>
…ni models (#1641)
**Description**
Adds support for Google Search grounding [1] as a tool type for Gemini models, complementing the existing enterprise web search support added in #1526. Enterprise search and Google Search are two different tools in Vertex. See [2] for more details.
[1]: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/grounding/grounding-with-google-search
[2]: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/grounding/web-grounding-enterprise#overview
The implementation translates the google_search tool type to Gemini's GoogleSearch with support for all configuration options:
- exclude_domains: Filter out specific domains from search results (Vertex AI only)
- blocking_confidence: Set phishing block threshold (Vertex AI only)
- time_range_filter: Restrict results to a time window (Gemini API only)
Response grounding metadata is already handled by the existing GroundingMetadata field from #1526.
**Related Issues/PRs (if applicable)**
Related PR: #1526
**Special notes for reviewers (if applicable)**
The config structs are prefixed with GCP (e.g., GCPGoogleSearchConfig, GCPTimeRangeFilter) to follow the existing pattern for vendor-specific extensions (GCPVertexAIVendorFields, GCPVertexAIGenerationConfig).
---------
Signed-off-by: jamesbuddrige <jxbuddrige@hotmail.co.uk>
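The per-backend support of the three google_search options can be sketched as a small check. Only the option names and their backend restrictions come from the commit message. The struct shapes, the `Backend` enum, the `start_time`/`end_time` field names, and `checkBackendSupport` are assumptions made for this example, and the commit message does not say whether the gateway rejects or silently drops an unsupported option.

```go
package main

import "fmt"

// Backend is a hypothetical enum for the two Gemini serving surfaces.
type Backend int

const (
	BackendVertexAI Backend = iota
	BackendGeminiAPI
)

// TimeRange is an assumed shape for time_range_filter.
type TimeRange struct {
	StartTime string `json:"start_time,omitempty"`
	EndTime   string `json:"end_time,omitempty"`
}

// GoogleSearchConfig is an illustrative stand-in, not the real
// GCPGoogleSearchConfig struct from the PR.
type GoogleSearchConfig struct {
	ExcludeDomains     []string   `json:"exclude_domains,omitempty"`
	BlockingConfidence string     `json:"blocking_confidence,omitempty"`
	TimeRangeFilter    *TimeRange `json:"time_range_filter,omitempty"`
}

// checkBackendSupport returns an error naming the first option that the
// chosen backend does not support, following the list in the commit message.
func checkBackendSupport(cfg GoogleSearchConfig, b Backend) error {
	switch b {
	case BackendVertexAI:
		if cfg.TimeRangeFilter != nil {
			return fmt.Errorf("time_range_filter is supported on the Gemini API only")
		}
	case BackendGeminiAPI:
		if len(cfg.ExcludeDomains) > 0 {
			return fmt.Errorf("exclude_domains is supported on Vertex AI only")
		}
		if cfg.BlockingConfidence != "" {
			return fmt.Errorf("blocking_confidence is supported on Vertex AI only")
		}
	}
	return nil
}

func main() {
	cfg := GoogleSearchConfig{ExcludeDomains: []string{"example.com"}}
	fmt.Println(checkBackendSupport(cfg, BackendVertexAI))
	fmt.Println(checkBackendSupport(cfg, BackendGeminiAPI))
}
```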
Description
This enables the Web Grounding for Enterprise feature: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/grounding/web-grounding-enterprise
Related Issues/PRs (if applicable)
#1417