Add passthrough support for Anthropic Messages API#19423

Merged
TomeHirata merged 4 commits into mlflow:master from TomeHirata:stack/gateway/model-passthrough/anthropic
Dec 17, 2025

Conversation

@TomeHirata
Collaborator

@TomeHirata TomeHirata commented Dec 16, 2025

🥞 Stacked PR

Use this link to review incremental changes.


Related Issues/PRs

n/a

What changes are proposed in this pull request?

Add an Anthropic model passthrough endpoint to support the Anthropic client natively, where the request body is propagated directly to the provider endpoint:

  • /gateway/anthropic/v1/messages

import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("Gateway")
from mlflow.tracking._tracking_service.utils import _get_store
from anthropic import Anthropic

store = _get_store()

secret = store.create_gateway_secret(
    secret_name="anthropic_api",
    secret_value={"api_key": "<secret>"},
    provider="anthropic",
)
model_def = store.create_gateway_model_definition(
    name="claude-haiku-4-5",
    secret_id=secret.secret_id,
    provider="anthropic",
    model_name="claude-haiku-4-5",
)
endpoint = store.create_gateway_endpoint(
    name="anthropic-v1", model_definition_ids=[model_def.model_definition_id]
)

client = Anthropic(
    api_key="dummy",
    base_url="http://localhost:5000/gateway/anthropic",
)
message = client.messages.create(
    max_tokens=1024,
    messages=[{
        "content": "Hello, world",
        "role": "user",
    }],
    model="anthropic-v1",
)
print(message.content[0])
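Note that the client above sends model="anthropic-v1", the gateway endpoint name, rather than a real Anthropic model id; the gateway resolves it to the configured model before forwarding the body. The following is a minimal sketch of that rewrite step under stated assumptions (the ENDPOINTS mapping and function name are hypothetical; the actual MLflow implementation resolves endpoints from the gateway store):

```python
# Hypothetical sketch of the gateway's passthrough rewrite step.
# ENDPOINTS stands in for the gateway store's endpoint -> model mapping.
ENDPOINTS = {"anthropic-v1": "claude-haiku-4-5"}


def rewrite_passthrough_body(body: dict) -> dict:
    """Replace the gateway endpoint name in `model` with the real model id.

    Every other field in the request body is propagated to the provider unchanged.
    """
    endpoint_name = body.get("model")
    if endpoint_name not in ENDPOINTS:
        raise ValueError(f"Unknown gateway endpoint: {endpoint_name!r}")
    return {**body, "model": ENDPOINTS[endpoint_name]}


request = {
    "model": "anthropic-v1",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, world"}],
}
forwarded = rewrite_passthrough_body(request)
print(forwarded["model"])  # claude-haiku-4-5
```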

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
  • Minor release: a release that increments the second part of the version number (e.g., 1.2.0 -> 1.3.0).
    Bug fixes, doc updates and new features usually go into minor releases.
  • Patch release: a release that increments the third part of the version number (e.g., 1.2.0 -> 1.2.1).
    Bug fixes and doc updates usually go into patch releases.
  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

@TomeHirata TomeHirata marked this pull request as ready for review December 16, 2025 08:48
Copilot AI review requested due to automatic review settings December 16, 2025 08:48
@github-actions github-actions bot added the rn/feature Mention under Features in Changelogs. label Dec 16, 2025
Contributor

Copilot AI left a comment


Pull request overview

This PR adds native Anthropic Messages API passthrough support to the MLflow AI Gateway, enabling users to interact with Anthropic models using the official Anthropic client SDK. It's part of a stacked PR series that introduces LiteLLM provider support and passthrough endpoints for both OpenAI and Anthropic.

Key Changes:

  • Added /gateway/anthropic/v1/messages endpoint for native Anthropic client compatibility
  • Implemented passthrough methods in AnthropicProvider supporting both streaming and non-streaming responses
  • Introduced LiteLLM provider as a fallback for unsupported providers in the gateway
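The LiteLLM fallback in the last bullet amounts to a registry lookup with a default. A minimal sketch, assuming a simple name-to-provider registry (the registry contents here are placeholders, not MLflow's actual provider classes):

```python
# Hypothetical sketch of provider resolution with a LiteLLM fallback.
# NATIVE_PROVIDERS stands in for MLflow's gateway provider registry.
NATIVE_PROVIDERS = {
    "openai": "OpenAIProvider",
    "anthropic": "AnthropicProvider",
}


def resolve_provider(provider_name: str) -> str:
    """Return the native provider when one is registered, else fall back to LiteLLM."""
    return NATIVE_PROVIDERS.get(provider_name, "LiteLLMProvider")


print(resolve_provider("anthropic"))  # AnthropicProvider
print(resolve_provider("mistral"))    # LiteLLMProvider
```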

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

File-by-file summary:
  • mlflow/server/gateway_api.py: Added the anthropic_passthrough_messages endpoint handler and the helper _extract_endpoint_name_from_model; updated _create_provider_from_endpoint_name to use LiteLLM as a fallback for unsupported providers
  • mlflow/gateway/providers/anthropic.py: Implemented the passthrough_anthropic_messages method for raw Anthropic API request/response handling with streaming support
  • mlflow/gateway/providers/openai.py: Added three passthrough methods (passthrough_openai_chat, passthrough_openai_embeddings, and passthrough_openai_responses) for raw OpenAI API compatibility
  • mlflow/gateway/providers/base.py: Added abstract passthrough method definitions for the OpenAI and Anthropic APIs in the BaseProvider class
  • mlflow/gateway/providers/litellm.py: New provider implementation using the LiteLLM library to support long-tail LLM providers, with chat, streaming, and embeddings functionality
  • mlflow/gateway/config.py: Added the LiteLLMConfig class with provider, api_key, and api_base configuration options; added LITELLM to the Provider enum
  • mlflow/gateway/provider_registry.py: Registered LiteLLMProvider in the default provider registry
  • tests/server/test_gateway_api.py: Added integration tests for the OpenAI and Anthropic passthrough endpoints covering streaming and non-streaming scenarios
  • tests/gateway/providers/test_anthropic.py: Added unit tests for Anthropic passthrough messages with streaming support
  • tests/gateway/providers/test_openai.py: Added unit tests for the OpenAI passthrough endpoints, including an Azure OpenAI edge case
  • tests/gateway/providers/test_litellm.py: New test file covering LiteLLM provider chat, embeddings, and streaming functionality
  • docs/api_reference/api_inventory.txt: Added LiteLLMConfig API reference entries


@github-actions
Contributor

github-actions bot commented Dec 16, 2025

Documentation preview for 6676b58 is available at:

More info
  • Ignore this comment if this PR does not change the documentation.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by this workflow run.
  • The documentation was built by this workflow run.

Collaborator

@B-Step62 B-Step62 left a comment


LGTM, with a suggestion to DRY.

Comment on lines +463 to +476
provider_path = self.PASSTHROUGH_PROVIDER_PATHS.get(action)
if provider_path is None:
    route = PASSTHROUGH_ROUTES.get(action)
    supported_routes = ", ".join(
        f"/gateway{route} (provider_path: {path})"
        for act in self.PASSTHROUGH_PROVIDER_PATHS.keys()
        if (route := PASSTHROUGH_ROUTES.get(act))
        and (path := self.PASSTHROUGH_PROVIDER_PATHS.get(act))
    )
    raise AIGatewayException(
        status_code=400,
        detail=f"Unsupported passthrough endpoint '{route}' for {self.NAME} provider. "
        f"Supported endpoints: {supported_routes}",
    )

Shall we move this logic to base class like def get_passthrough_provider_path? Looks generic enough.
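One possible shape for that base-class helper, sketched under the assumption that PASSTHROUGH_ROUTES is module-level and each provider declares its own PASSTHROUGH_PROVIDER_PATHS (the route strings and the simplified exception class here are illustrative, not MLflow's actual definitions):

```python
# Hypothetical sketch of the suggested base-class helper.
# PASSTHROUGH_ROUTES and PASSTHROUGH_PROVIDER_PATHS mirror the structures in
# the reviewed snippet; AIGatewayException is simplified for the sketch.
PASSTHROUGH_ROUTES = {
    "chat": "/openai/v1/chat/completions",
    "messages": "/anthropic/v1/messages",
}


class AIGatewayException(Exception):
    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


class BaseProvider:
    NAME = "base"
    PASSTHROUGH_PROVIDER_PATHS: dict[str, str] = {}

    def get_passthrough_provider_path(self, action: str) -> str:
        """Map a passthrough action to this provider's upstream path, or raise 400."""
        provider_path = self.PASSTHROUGH_PROVIDER_PATHS.get(action)
        if provider_path is None:
            route = PASSTHROUGH_ROUTES.get(action)
            supported = ", ".join(
                f"/gateway{r} (provider_path: {path})"
                for act, path in self.PASSTHROUGH_PROVIDER_PATHS.items()
                if (r := PASSTHROUGH_ROUTES.get(act))
            )
            raise AIGatewayException(
                status_code=400,
                detail=f"Unsupported passthrough endpoint '{route}' for {self.NAME} "
                f"provider. Supported endpoints: {supported}",
            )
        return provider_path


class AnthropicProvider(BaseProvider):
    NAME = "anthropic"
    PASSTHROUGH_PROVIDER_PATHS = {"messages": "/v1/messages"}


print(AnthropicProvider().get_passthrough_provider_path("messages"))  # /v1/messages
```

Each provider then only declares its PASSTHROUGH_PROVIDER_PATHS mapping, and the lookup-plus-error logic lives in one place.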

Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
@TomeHirata TomeHirata force-pushed the stack/gateway/model-passthrough/anthropic branch from 393d14c to d237933 on December 17, 2025 13:20
Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
@TomeHirata TomeHirata enabled auto-merge December 17, 2025 13:37
@TomeHirata TomeHirata added this pull request to the merge queue Dec 17, 2025
Merged via the queue into mlflow:master with commit 12f76d0 Dec 17, 2025
68 of 70 checks passed
@TomeHirata TomeHirata deleted the stack/gateway/model-passthrough/anthropic branch December 17, 2025 16:11

Labels

rn/feature Mention under Features in Changelogs.

3 participants