link gateway and experiment #20356
Pull request overview
This PR links MLflow AI Gateway endpoints to MLflow Tracing by attaching an experiment_id to each endpoint, automatically creating experiments when needed, instrumenting gateway provider calls with tracing spans, and exposing trace navigation from the UI.
Changes:
- Extend the gateway endpoint schema (proto, entities, DB models, REST/SQL/abstract mixins, JS types) with an optional `experiment_id`, auto-creating an experiment per endpoint (`gateway/{name}`) when none is provided.
- Add tracing instrumentation to gateway providers and HTTP handlers: wrap providers in `TracingProviderWrapper`, create top-level gateway traces per invocation, and propagate token usage and provider/model metadata into span attributes.
- Update tests and UI to accommodate tracing: adapt provider-type tests to the tracing wrapper, ensure chat-completions validation works with real endpoints, expose `experiment_id` in the React types and forms, and add a "View traces" link on the endpoint edit page when tracing is configured.
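The wrapper approach described above can be illustrated with a minimal sketch. This is not MLflow's actual `TracingProviderWrapper`; the `RecordingTracer` is a hypothetical stand-in for MLflow's tracing client, and only a `chat` method is shown, just to make the delegation-inside-a-span pattern concrete:

```python
from contextlib import contextmanager


class RecordingTracer:
    """Hypothetical stand-in for a tracing client; records (name, attributes) pairs."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def span(self, name, **attributes):
        self.spans.append((name, attributes))
        yield


class TracingProviderWrapper:
    """Delegates each call to the wrapped provider inside a tracing span."""

    def __init__(self, provider, tracer, endpoint_name):
        self._provider = provider
        self._tracer = tracer
        self._endpoint = endpoint_name

    def chat(self, payload):
        with self._tracer.span("gateway/chat", endpoint=self._endpoint):
            return self._provider.chat(payload)


class EchoProvider:
    """Toy provider that echoes the last user message back as a choice."""

    def chat(self, payload):
        return {"choices": [payload["messages"][-1]["content"]]}


tracer = RecordingTracer()
wrapped = TracingProviderWrapper(EchoProvider(), tracer, "my-endpoint")
result = wrapped.chat({"messages": [{"role": "user", "content": "hi"}]})
# result == {"choices": ["hi"]}, and tracer.spans records one "gateway/chat" span
```

The real wrapper in the PR applies the same pattern to all provider methods (chat, embeddings, completions, passthrough, streaming).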
Reviewed changes
Copilot reviewed 16 out of 18 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `tests/server/test_gateway_api.py` | Updates provider construction tests to account for `TracingProviderWrapper` and adds setup for chat-completions validation that now requires an actual endpoint. |
| `mlflow/store/tracking/gateway/sqlalchemy_mixin.py` | Adds an `experiment_id` argument to `create_gateway_endpoint` / `update_gateway_endpoint`, with auto-creation of a `gateway/{name}` experiment and persistence into `SqlGatewayEndpoint`. |
| `mlflow/store/tracking/gateway/rest_mixin.py` | Wires `experiment_id` through REST client methods for creating and updating gateway endpoints, matching the extended proto. |
| `mlflow/store/tracking/gateway/abstract_mixin.py` | Extends the abstract gateway store interface to include `experiment_id` on create/update endpoint signatures and documents its tracing semantics. |
| `mlflow/store/tracking/dbmodels/models.py` | Adds an `experiment_id` column to `SqlGatewayEndpoint` and returns it via `to_mlflow_entity()` so Python entities see it. |
| `mlflow/store/db_migrations/versions/d0e1f2a3b4c5_add_experiment_id_to_endpoints.py` | Introduces an Alembic migration to add/drop the nullable `experiment_id` column on the `endpoints` table. |
| `mlflow/server/js/src/gateway/types.ts` | Extends TS types for endpoints and create/update payloads with an optional `experiment_id` to keep the UI/client in sync with the backend. |
| `mlflow/server/js/src/gateway/pages/EndpointPage.tsx` | Passes the endpoint's `experiment_id` into the edit form renderer so the UI can show trace links. |
| `mlflow/server/js/src/gateway/hooks/useCreateEndpointForm.ts` | Adds `experimentId` form state and includes it as `experiment_id` in the create-endpoint mutation, defaulting to auto-create when left blank. |
| `mlflow/server/js/src/gateway/components/endpoint-form/EndpointFormRenderer.tsx` | Adds an "Experiment" section in create mode for optional experiment ID input, with helper text explaining auto-creation behavior. |
| `mlflow/server/js/src/gateway/components/edit-endpoint/EditEndpointFormRenderer.tsx` | Adds a "Usage log" block linking to `/experiments/{experimentId}/traces` when the endpoint has an associated experiment. |
| `mlflow/server/gateway_api.py` | Imports tracing types, adds helpers for creating gateway traces and extracting provider/model info, wraps created providers with `TracingProviderWrapper`, and instruments all gateway/chat/passthrough routes to create traces and set outputs/token usage where possible. |
| `mlflow/protos/service_pb2.pyi` | Updates Python type stubs to add `experiment_id` fields/slots/constructor args on `GatewayEndpoint`, `CreateGatewayEndpoint`, and `UpdateGatewayEndpoint`. |
| `mlflow/protos/service.proto` | Extends gateway endpoint and create/update RPC messages with an optional `experiment_id` field and associated comments. |
| `mlflow/gateway/providers/base.py` | Enhances `FallbackProvider` with internal per-attempt tracing spans and introduces `TracingProviderWrapper` to add spans around all provider methods (chat, embeddings, completions, passthrough, streaming). |
| `mlflow/entities/gateway_endpoint.py` | Adds `experiment_id` to the `GatewayEndpoint` entity and ensures it is serialized/deserialized to/from the extended proto. |
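In effect, the Alembic migration in the table above adds a single nullable column. A sketch of the same schema change in plain SQL against an in-memory SQLite database (the column type and table name mirror the PR but are assumptions here, not the actual migration code):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE endpoints (name TEXT PRIMARY KEY)")
conn.execute("INSERT INTO endpoints (name) VALUES ('my-endpoint')")

# upgrade(): add a nullable experiment_id column; existing rows keep NULL
# until an experiment is linked, so the migration is backward compatible.
conn.execute("ALTER TABLE endpoints ADD COLUMN experiment_id TEXT")

row = conn.execute("SELECT name, experiment_id FROM endpoints").fetchone()
# row == ("my-endpoint", None)
```

Because the column is nullable, endpoints created before the migration (or with tracking disabled) simply carry no experiment link.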
Force-pushed from 090aad9 to 3bd4593.
Force-pushed from 49c039b to 55ae64a.
Documentation preview for cd4753d is available.
Force-pushed from b9b3cd4 to f9d8fe5.
Resolved (outdated) review threads:
- mlflow/server/js/src/gateway/components/create-endpoint/ExperimentSelect.tsx
- mlflow/server/handlers.py
```python
# Auto-create experiment if usage_tracking is enabled and experiment_id not provided
if usage_tracking and experiment_id is None:
    store = _get_tracking_store()
    experiment_name = f"gateway/{request_message.name}"
    experiment_id = _get_or_create_experiment_id(store, experiment_name)
```
What's the reason we moved the implementation from the store to the handler?

The handler is a higher-level logic layer that can orchestrate logic across multiple resources (experiment, gateway). I actually think the current separation of concerns is a better API design pattern than putting the experiment creation in `create_gateway_endpoint`. Why do you want to move it back to the SQL store method?

Generally we put implementation details inside the store method so each store can have its own logic. In this case it only works for the SQL store, and creating the experiment is intentional, so there's no big difference. However, if the backend is FileStore, won't this create the experiment even though `create_gateway_endpoint` is not supported by it?

> However, if the backend is FileStore, won't this create the experiment even though `create_gateway_endpoint` is not supported by it?

Thanks. Even though we could add explicit validation in the handler, I think this is a good reason to give up a bit of the abstraction's completeness. Moved the experiment creation back to the SQL store.
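The resolution agreed in the thread above can be sketched as follows: the store method itself performs the get-or-create, so a backend that does not implement `create_gateway_endpoint` never creates a stray experiment. The method names mirror the PR, but the store below is a hypothetical in-memory stand-in, not MLflow's SQLAlchemy store:

```python
class InMemoryStore:
    """Toy stand-in for the SQL store, to show where auto-creation lives."""

    def __init__(self):
        self.experiments = {}  # experiment name -> experiment_id
        self.endpoints = {}    # endpoint name -> endpoint record
        self._next_id = 0

    def _get_or_create_experiment_id(self, name):
        if name not in self.experiments:
            self._next_id += 1
            self.experiments[name] = str(self._next_id)
        return self.experiments[name]

    def create_gateway_endpoint(self, name, usage_tracking=False, experiment_id=None):
        # Auto-create only when usage tracking is on and no experiment was given,
        # inside the store method rather than the handler.
        if usage_tracking and experiment_id is None:
            experiment_id = self._get_or_create_experiment_id(f"gateway/{name}")
        self.endpoints[name] = {"name": name, "experiment_id": experiment_id}
        return self.endpoints[name]


store = InMemoryStore()
tracked = store.create_gateway_endpoint("my-endpoint", usage_tracking=True)
untracked = store.create_gateway_endpoint("other", usage_tracking=False)
# tracked gets an auto-created "gateway/my-endpoint" experiment; untracked gets None
```

With this placement, a FileStore-style backend that raises on `create_gateway_endpoint` would fail before any experiment is created.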
Force-pushed from 9ecca34 to 0f02e60.
/autoformat
Add experiment_id column to endpoints table to link Gateway endpoints with MLflow experiments. This enables usage tracking and filtering of Gateway metrics by experiment.

Also add a boolean usage_tracking field that controls whether trace ingestion is enabled:
- When usage_tracking is True, an experiment is auto-created if not provided
- When usage_tracking is False, no experiment is created

Changes:
- Add migration for experiment_id and usage_tracking columns
- Update entity and model classes with new fields
- Add fields to proto definitions
- Update server handlers for new parameters
- Update store mixins for handling

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
Force-pushed from 9043396 to e8f06d9.
serena-ruan left a comment:

LGTM! Thanks for iterating on my comments!
🥞 Stacked PR
Use this link to review incremental changes.
Related Issues/PRs
n/a
What changes are proposed in this pull request?
Link the gateway endpoint to an experiment ID. We auto-generate an experiment when usage tracking is on and the ID is not specified.
How is this PR tested?
Does this PR require documentation update?
Release Notes
Is this a user-facing change?
What component(s), interfaces, languages, and integrations does this PR affect?
Components
- area/tracking: Tracking Service, tracking client APIs, autologging
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
- area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
- area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
- area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
- area/projects: MLproject format, project running backends
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:
- rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
- rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
- rn/feature - A new user-facing feature worth mentioning in the release notes
- rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
- rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?
- Yes should be selected for bug fixes, documentation updates, and other small changes.
- No should be selected for new features and larger changes.

If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
Bug fixes, doc updates and new features usually go into minor releases.
Bug fixes and doc updates usually go into patch releases.