Add passthrough support for Gemini generateContent and streamGenerateContent APIs#19425
Pull request overview
This PR adds passthrough support for Gemini's generateContent and streamGenerateContent APIs as part of a larger effort to provide native API support for multiple LLM providers. The implementation follows the patterns established in the stack for OpenAI and Anthropic passthrough endpoints.
Key Changes
- Added two new Gemini passthrough endpoints that accept raw Gemini API format
- Implemented LiteLLM provider as a fallback for unsupported providers
- Extended base provider class with passthrough methods for multiple providers
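The passthrough pattern the PR describes can be sketched as follows. This is a minimal illustration, not MLflow's actual code: `PassthroughProvider`, `passthrough_generate_content`, and `_call_upstream` are hypothetical names, assuming the provider injects the configured model name and forwards the raw request body unchanged.

```python
from typing import Any


class PassthroughProvider:
    """Hypothetical sketch of a passthrough provider: the raw request body is
    forwarded to the upstream API, with the configured model name injected."""

    def __init__(self, model_name: str):
        self.model_name = model_name

    async def passthrough_generate_content(self, payload: dict[str, Any]) -> dict[str, Any]:
        # Copy first so the caller's dictionary is not mutated.
        body = dict(payload)
        body["model"] = self.model_name
        return await self._call_upstream(body)

    async def _call_upstream(self, body: dict[str, Any]) -> dict[str, Any]:
        # Stand-in for the real HTTP call to the provider endpoint.
        return {"echo": body}
```

The real endpoints additionally handle streaming and authentication; the sketch only shows the forward-the-body-as-is shape shared by the OpenAI, Anthropic, and Gemini passthrough routes.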
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| mlflow/server/gateway_api.py | Added Gemini passthrough endpoints (generateContent and streamGenerateContent) and helper function for extracting endpoint names |
| mlflow/gateway/providers/gemini.py | Implemented passthrough methods for Gemini generateContent APIs |
| mlflow/gateway/providers/base.py | Added abstract passthrough method definitions for all provider types |
| mlflow/gateway/providers/anthropic.py | Implemented Anthropic Messages API passthrough endpoint |
| mlflow/gateway/providers/openai.py | Implemented OpenAI passthrough endpoints for chat, embeddings, and responses APIs |
| mlflow/gateway/providers/litellm.py | New provider using LiteLLM library as fallback for unsupported providers |
| mlflow/gateway/config.py | Added LiteLLMConfig class and LITELLM provider enum value |
| mlflow/gateway/provider_registry.py | Registered LiteLLM provider in the provider registry |
| tests/server/test_gateway_api.py | Comprehensive tests for all new passthrough endpoints including streaming |
| tests/gateway/providers/test_gemini.py | Unit tests for Gemini passthrough methods |
| tests/gateway/providers/test_anthropic.py | Unit tests for Anthropic passthrough methods |
| tests/gateway/providers/test_openai.py | Unit tests for OpenAI passthrough methods |
| tests/gateway/providers/test_litellm.py | Complete test suite for new LiteLLM provider |
| docs/api_reference/api_inventory.txt | Updated API inventory with LiteLLMConfig references |
```python
    }
    response = await provider.passthrough_anthropic_messages(payload)

    assert payload["model"] == "claude-2.1"
```

This assertion validates that the payload has been mutated in place by the `passthrough_anthropic_messages` method (the `model` field is added). While the test currently passes, asserting on side effects of the function under test rather than its return value makes the test more fragile. Consider testing only the return value and the mock call arguments.
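One way to act on this suggestion is to mock the upstream call and assert on the return value and the mock's recorded arguments. A sketch under assumptions: `passthrough` and `client.post` are hypothetical stand-ins for the real provider method and HTTP client, not MLflow identifiers.

```python
import asyncio
from unittest.mock import AsyncMock


async def passthrough(provider_client, payload, model):
    # Forward a copy with the model injected; return the upstream response.
    body = {**payload, "model": model}
    return await provider_client.post(body)


# Mock the upstream HTTP client; assertions target the return value and the
# mock's recorded call arguments, not in-place mutation of `payload`.
client = AsyncMock()
client.post.return_value = {"id": "msg_123", "model": "claude-2.1"}

payload = {"messages": [{"role": "user", "content": "hi"}]}
response = asyncio.run(passthrough(client, payload, "claude-2.1"))

assert response["model"] == "claude-2.1"
client.post.assert_awaited_once_with({**payload, "model": "claude-2.1"})
assert "model" not in payload  # the input was not mutated
```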
```python
        Accepts raw Anthropic request format and returns raw Anthropic response format.
        Supports streaming if the 'stream' parameter is set to True.
        """
        # Add model name from config
```

Mutating the input payload dictionary in place could cause unexpected side effects for callers. The payload is modified by adding the `model` field, which could affect code that reuses the same dictionary. Consider creating a copy of the payload before modifying it, or explicitly document this mutation as part of the function's contract.

Suggested change:

```python
        # Add model name from config
        payload = dict(payload)  # Avoid mutating the input dictionary
```
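The effect of the suggested copy can be demonstrated in isolation. Note that `dict(payload)` is a shallow copy, so nested structures are still shared with the caller; the helper names below are illustrative only.

```python
def add_model_mutating(payload, model):
    payload["model"] = model  # the caller's dict is changed
    return payload


def add_model_copying(payload, model):
    payload = dict(payload)  # shallow copy; the caller's dict is untouched
    payload["model"] = model
    return payload


original = {"messages": []}

mutated = add_model_mutating(dict(original), "claude-2.1")
copied = add_model_copying(original, "claude-2.1")

assert "model" not in original       # copying left the input intact
assert copied["model"] == "claude-2.1"
assert mutated["model"] == "claude-2.1"
```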
mlflow/gateway/providers/base.py (outdated)

```python
            detail=f"The passthrough Anthropic messages route is not implemented for {self.NAME}"
            "models.",
```

Missing space between string literals in the error message. The error message will read "...for {self.NAME}models." instead of "...for {self.NAME} models."

Suggested change:

```python
            detail=f"The passthrough Anthropic messages route is not implemented for {self.NAME} models.",
```
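The underlying pitfall is Python's implicit concatenation of adjacent string literals, which inserts no separator between them; a minimal reproduction:

```python
NAME = "gemini"

# Adjacent string literals are concatenated with nothing between them,
# so the space must be written explicitly inside one of the literals.
broken = f"The passthrough route is not implemented for {NAME}" "models."
fixed = f"The passthrough route is not implemented for {NAME} models."

assert broken == "The passthrough route is not implemented for geminimodels."
assert fixed == "The passthrough route is not implemented for gemini models."
```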
```python
    if "api_base" in auth_config:
        litellm_config["litellm_api_base"] = auth_config["api_base"]
    provider_config = LiteLLMConfig(**litellm_config)
    model_config.provider = Provider.LITELLM
```

Mutating the `model_config.provider` field could have unexpected side effects. The `model_config` object likely comes from the database store, and modifying it in place could affect other parts of the code that reference it. Consider creating a new variable or ensuring this mutation doesn't propagate beyond this function's scope.
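A non-mutating alternative can be illustrated with standard-library dataclasses. If `model_config` is a Pydantic v2 model (an assumption about MLflow's config types), `model_copy(update={...})` plays the same role as `dataclasses.replace` below; `ModelConfig` here is a hypothetical stand-in.

```python
from dataclasses import dataclass, replace


# Stand-in for the stored config object. With a Pydantic v2 model, the
# equivalent would be: local = stored.model_copy(update={"provider": "litellm"}).
@dataclass(frozen=True)
class ModelConfig:
    provider: str
    name: str


stored = ModelConfig(provider="custom", name="my-model")

# Build a new object instead of mutating the one shared with the store.
local = replace(stored, provider="litellm")

assert stored.provider == "custom"   # the shared object is unchanged
assert local.provider == "litellm"
```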
B-Step62 left a comment:

LGTM, left a few minor comments.
…Content APIs
Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
Bypassing unrelated test failures
🥞 Stacked PR
Related Issues/PRs
n/a
What changes are proposed in this pull request?
Add the following Gemini model passthrough endpoints, where the request body is propagated unchanged to the provider endpoint:
- generateContent
- streamGenerateContent
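For illustration, a raw `generateContent` request body in Gemini's native format, as a client might send it through the passthrough route. The payload shape follows Google's public Gemini REST API; the gateway URL path is omitted since it depends on the configured endpoint name.

```python
import json

# Raw Gemini generateContent payload; the gateway forwards this body to the
# provider unchanged (model name and credentials are added server-side).
payload = {
    "contents": [
        {"role": "user", "parts": [{"text": "Summarize MLflow in one sentence."}]}
    ],
    "generationConfig": {"temperature": 0.2, "maxOutputTokens": 128},
}

# The JSON body that would be POSTed to the passthrough endpoint.
body = json.dumps(payload)
assert json.loads(body)["contents"][0]["parts"][0]["text"].startswith("Summarize")
```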
How is this PR tested?
Does this PR require documentation update?
Release Notes
Is this a user-facing change?
What component(s), interfaces, languages, and integrations does this PR affect?
Components
- `area/tracking`: Tracking Service, tracking client APIs, autologging
- `area/models`: MLmodel format, model serialization/deserialization, flavors
- `area/model-registry`: Model Registry service, APIs, and the fluent client calls for Model Registry
- `area/scoring`: MLflow Model server, model deployment tools, Spark UDFs
- `area/evaluation`: MLflow model evaluation features, evaluation metrics, and evaluation workflows
- `area/gateway`: MLflow AI Gateway client APIs, server, and third-party integrations
- `area/prompts`: MLflow prompt engineering features, prompt templates, and prompt management
- `area/tracing`: MLflow Tracing features, tracing APIs, and LLM tracing functionality
- `area/projects`: MLproject format, project running backends
- `area/uiux`: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- `area/build`: Build and test infrastructure for MLflow
- `area/docs`: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:
- `rn/none` - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
- `rn/breaking-change` - The PR will be mentioned in the "Breaking Changes" section
- `rn/feature` - A new user-facing feature worth mentioning in the release notes
- `rn/bug-fix` - A user-facing bug fix worth mentioning in the release notes
- `rn/documentation` - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?
- `Yes` should be selected for bug fixes, documentation updates, and other small changes.
- `No` should be selected for new features and larger changes.

If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
Bug fixes, doc updates and new features usually go into minor releases.
Bug fixes and doc updates usually go into patch releases.