feat: provide unified thinking config among anthropic and gemini models by hustxiayang · Pull Request #1461 · envoyproxy/ai-gateway

hustxiayang · 2025-10-29T13:37:30Z

Description
The thinking configs of Anthropic and Gemini models are similar: for example, `thinking` and `thinking_config`, and `budget_tokens` and `thinkingBudget`. Because the gateway can provide a unified interface across providers, users do not need to set up a different thinking config for each model.
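
As a rough illustration of the mapping described above, the following Go sketch translates one provider-neutral setting into the two provider shapes. `UnifiedThinking`, `toAnthropic`, and `toGemini` are hypothetical names chosen for this example and are not the schema or functions this PR adds; only the field names `budget_tokens` and `thinkingBudget` come from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// UnifiedThinking is a hypothetical provider-neutral shape used only for
// illustration; it is not the schema this PR adds.
type UnifiedThinking struct {
	Enabled      bool
	BudgetTokens int
}

// AnthropicThinking follows the `thinking` / `budget_tokens` naming.
type AnthropicThinking struct {
	Type         string `json:"type"`
	BudgetTokens int    `json:"budget_tokens,omitempty"`
}

// GeminiThinkingConfig follows the `thinkingBudget` naming.
type GeminiThinkingConfig struct {
	ThinkingBudget int `json:"thinkingBudget"`
}

// toAnthropic maps the unified shape onto Anthropic's field names.
func toAnthropic(u UnifiedThinking) AnthropicThinking {
	if !u.Enabled {
		return AnthropicThinking{Type: "disabled"}
	}
	return AnthropicThinking{Type: "enabled", BudgetTokens: u.BudgetTokens}
}

// toGemini maps the unified shape onto Gemini's field names. It returns nil
// when thinking is not requested so the caller can omit the field entirely.
func toGemini(u UnifiedThinking) *GeminiThinkingConfig {
	if !u.Enabled {
		return nil
	}
	return &GeminiThinkingConfig{ThinkingBudget: u.BudgetTokens}
}

func main() {
	u := UnifiedThinking{Enabled: true, BudgetTokens: 1024}
	a, _ := json.Marshal(toAnthropic(u))
	g, _ := json.Marshal(toGemini(u))
	fmt.Println(string(a)) // {"type":"enabled","budget_tokens":1024}
	fmt.Println(string(g)) // {"thinkingBudget":1024}
}
```

The point of the sketch is that a client sets one budget and the translator picks the provider-specific field name.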

Related Issues/PRs (if applicable)

Related to #1463

codecov-commenter · 2025-10-29T14:25:12Z

Codecov Report

❌ Patch coverage is 50.72464% with 34 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.03%. Comparing base (9c7af75) to head (44b9efd).

Files with missing lines	Patch %	Lines
internal/apischema/openai/openai.go	0.00%	25 Missing ⚠️
internal/translator/openai_gcpvertexai.go	64.28%	3 Missing and 2 partials ⚠️
internal/translator/openai_awsbedrock.go	87.50%	1 Missing and 1 partial ⚠️
internal/translator/openai_gcpanthropic.go	85.71%	1 Missing and 1 partial ⚠️

❌ Your patch status has failed because the patch coverage (50.72%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1461      +/-   ##
==========================================
- Coverage   84.22%   84.03%   -0.20%     
==========================================
  Files         141      141              
  Lines       12978    13039      +61     
==========================================
+ Hits        10931    10957      +26     
- Misses       1428     1460      +32     
- Partials      619      622       +3


hustxiayang · 2025-10-29T14:36:52Z

/retest

docs/proposals/004-vendor-specific-fields/proposal.md

mathetake · 2025-10-29T15:51:36Z

So if I understand correctly, this goes well beyond @sukumargaonkar's original proposal: https://github.com/envoyproxy/ai-gateway/blob/main/docs/proposals/004-vendor-specific-fields/proposal.md

This is trying to define an "AI Gateway Schema", and the scope and impact are huge. Could you write up a design doc and gather more comments from the community here?

At the very least, I would request very detailed documents and future planning for this direction. This kind of change also requires user-facing documentation on the website, not just code. sg?

mathetake · 2025-10-29T15:53:07Z

Please note that a breaking change here is far more impactful than one in the control plane API. Once this is exposed, you cannot assume that clients will migrate properly, so we have to be very careful.

hustxiayang · 2025-10-29T15:55:01Z

@mathetake Sure, thanks a lot for the comments! One key motivation is that we do not want to break existing code internally. I will prepare a doc as well. Thanks!

yuzisun · 2025-10-29T16:09:37Z

@mathetake It is not against @sukumargaonkar's original proposal for vendor-specific fields. However, for thinking we found that different providers have very similar definitions, so it is worth unifying them at the gateway level. @hustxiayang will provide a detailed writeup.

yuzisun · 2025-10-29T16:12:26Z

agreed, I think we can still keep the GCP thinking vendor field but recommend the new unified thinking API shape.

mathetake · 2025-10-29T16:14:26Z

ah sorry, I should clarify that my comment about breaking changes is about this new "unified API" portion, not about the vendor-specific fields already in place. That's why I am saying we should not take this addition lightly, without very careful consideration.

mathetake · 2025-10-29T16:15:23Z

basically, an API added in this PR can never be reverted without a very long migration period (think about the deprecation windows of OpenAI, Anthropic, or any other public API)

yuzisun · 2025-10-29T16:41:33Z

Adding an agenda item for tomorrow's meeting to discuss the deprecation policy.

nacx · 2025-10-29T16:57:54Z

Premature abstractions can come at a significant maintenance cost, so we need proper, informed decisions before pushing for them.

thinking_config among different providers are similar, for example: thinking and thinking_config; budget_tokens and thinkingBudget. Our users do not need to set up different thinking configs for different models as we can provide a unified interface among different providers.

This PR description is quite vague and doesn't give a good understanding of what will actually be covered or of the real user benefits. Please edit the description and add a detailed breakdown of the providers that are supported today and what their APIs look like. Let's include all the currently supported providers, so we can get a good understanding of what this change would really cover (not all of them support the same features through the OpenAI-compatible APIs; for example, Cohere does not support the reasoning_effort field).

That would help a lot in understanding what this really covers and what it's optimizing for.

mathetake · 2025-10-30T03:23:49Z

yeah so i won't be able to make it to the meeting tomorrow (sorry 🙏) but as long as the deprecation policy is agreed upon in a sane way as well as the API design is reviewed by multiple people given the prior art research (not limited to litellm), then all good to me.

of course the user-facing documentation is a must, but we can do that after the agreement

hustxiayang · 2025-11-10T22:24:46Z

@mathetake @yuzisun @nacx It's been a while since the discussion. Are there any plans or concerns about merging this PR?

#1554)

**Description**

Some new features were introduced in gemini3:

**1 thinking_level:** https://ai.google.dev/gemini-api/docs/gemini-3?thinking=low#thinking_level
This is similar to reasoning_effort of openai, thus, unified them.

**2 media_resolution** https://ai.google.dev/gemini-api/docs/gemini-3?thinking=low#media_resolution
This is similar to detail in openai, thus, unified them. The difference is that openai does not provide a global config of media_resolution. Thus, added it as gcp specific, but still use detail to make the name consistent.

**Some related PRs:**
thinking_budget is in #1461
thinking_level and thinking_budget are both supported, but can not use them together.

Other features under review:
**1 web search:** #1526
**2 parse the thought summary:** #1521

---------

Signed-off-by: yxia216 <yxia216@bloomberg.net>
Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net>
Signed-off-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>
Co-authored-by: Alexa Griffith <agriffith50@bloomberg.net>
Co-authored-by: Sukumar Gaonkar <sgaonkar4@bloomberg.net>
Co-authored-by: Ignasi Barrera <ignasi@tetrate.io>

Signed-off-by: yxia216 <yxia216@bloomberg.net>

mathetake · 2025-11-29T21:13:19Z

@yuzisun @hustxiayang this needs the end user documentation change https://aigateway.envoyproxy.io/docs/capabilities/llm-integrations/vendor-specific-fields

yuzisun · 2025-11-29T22:01:57Z

Updated #1590

…lude thought is true (#1521)

**Description**

When `includeThought` is true, Gemini also generates a summary of its thinking process. We need to parse this summary out separately; otherwise, the thought process would be returned to users mixed into the output.

Depends/base or replace #1461

---------

Signed-off-by: yxia216 <yxia216@bloomberg.net>
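
To make the parsing step concrete, here is a small Go sketch that separates thought-summary text from answer text. It assumes each response part carries a boolean flag marking it as a thought; `Part` and `splitThoughts` are simplified, hypothetical stand-ins and not the types used in the gateway's translator.

```go
package main

import (
	"fmt"
	"strings"
)

// Part is a simplified stand-in for a Gemini response part; Thought marks
// text that belongs to the thought summary (an assumption of this sketch).
type Part struct {
	Text    string
	Thought bool
}

// splitThoughts separates thought-summary text from the answer text so the
// two can be returned to the client in different fields.
func splitThoughts(parts []Part) (thoughts, answer string) {
	var t, a strings.Builder
	for _, p := range parts {
		if p.Thought {
			t.WriteString(p.Text)
		} else {
			a.WriteString(p.Text)
		}
	}
	return t.String(), a.String()
}

func main() {
	parts := []Part{
		{Text: "Compare both options first. ", Thought: true},
		{Text: "Option B is cheaper."},
	}
	thoughts, answer := splitThoughts(parts)
	fmt.Printf("thoughts=%q\nanswer=%q\n", thoughts, answer)
}
```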

hustxiayang requested a review from a team as a code owner October 29, 2025 13:37

hustxiayang marked this pull request as draft October 29, 2025 13:37

hustxiayang force-pushed the unified_thinking_config branch from 3968526 to 0ebf149 Compare October 29, 2025 14:04

hustxiayang changed the title ~~Unified thinking config among different providers~~ feat: unified thinking config among different providers Oct 29, 2025

hustxiayang force-pushed the unified_thinking_config branch from 0ebf149 to cf3362d Compare October 29, 2025 14:22

hustxiayang changed the title ~~feat: unified thinking config among different providers~~ feat: provide unified thinking config among different providers Oct 29, 2025

hustxiayang marked this pull request as ready for review October 29, 2025 15:33

gavrissh reviewed Oct 29, 2025

docs/proposals/004-vendor-specific-fields/proposal.md

hustxiayang changed the title ~~feat: provide unified thinking config among different providers~~ feat: provide unified thinking config among anthropic and gemini models Oct 29, 2025

hustxiayang mentioned this pull request Nov 9, 2025

feat: parse out thought summary from gemini models' response when include thought is true #1521

Merged

dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Nov 10, 2025

hustxiayang mentioned this pull request Nov 20, 2025

feat: add new features(thinking_level and media_resolution) of gemini3 #1554

Merged

hustxiayang added 4 commits November 25, 2025 00:59

init

d960f7e

Signed-off-by: yxia216 <yxia216@bloomberg.net>

init

cc80400

Signed-off-by: yxia216 <yxia216@bloomberg.net>

fix-0

0af000e

Signed-off-by: yxia216 <yxia216@bloomberg.net>

rebase

6308918

Signed-off-by: yxia216 <yxia216@bloomberg.net>

update-doc

fb92a6d

Signed-off-by: yxia216 <yxia216@bloomberg.net>

hustxiayang force-pushed the unified_thinking_config branch from ab84e26 to fb92a6d Compare November 25, 2025 14:54

hustxiayang and others added 2 commits November 25, 2025 10:32

doc

1be7ab8

Signed-off-by: yxia216 <yxia216@bloomberg.net>

Merge branch 'main' into unified_thinking_config

44b9efd

yuzisun approved these changes Nov 29, 2025

yuzisun merged commit 96a2fdd into envoyproxy:main Nov 29, 2025
30 checks passed

6 participants
