Store gateway<>scorer binding correctly #20176

Merged
TomeHirata merged 9 commits into mlflow:master from TomeHirata:fix/scorer/gateway-binding
Jan 23, 2026

Conversation

@TomeHirata TomeHirata commented Jan 21, 2026

Related Issues/PRs

n/a

What changes are proposed in this pull request?

This PR ensures that a binding record between the gateway endpoint and the scorer is created whenever a registered scorer uses a gateway endpoint.

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
  • Minor release: a release that increments the second part of the version number (e.g., 1.2.0 -> 1.3.0).
    Bug fixes, doc updates and new features usually go into minor releases.
  • Patch release: a release that increments the third part of the version number (e.g., 1.2.0 -> 1.2.1).
    Bug fixes and doc updates usually go into patch releases.
  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
Copilot AI review requested due to automatic review settings January 21, 2026 06:20
@TomeHirata TomeHirata added the team-review Trigger a team review request label Jan 21, 2026
@TomeHirata TomeHirata requested a review from BenWilson2 January 21, 2026 06:20
🛠 DevTools 🛠

Install mlflow from this PR

# mlflow
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/20176/merge
# mlflow-skinny
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/20176/merge#subdirectory=libs/skinny

For Databricks, use the following command:

%sh curl -LsSf https://raw.githubusercontent.com/mlflow/mlflow/HEAD/dev/install-skinny.sh | sh -s pull/20176/merge

@github-actions github-actions bot added area/uiux Front-end, user experience, plotting, JavaScript, JavaScript dev server rn/bug-fix Mention under Bug Fixes in Changelogs. labels Jan 21, 2026
class GatewayResourceType(str, Enum):
    """Valid MLflow resource types that can use gateway endpoints."""

    SCORER_JOB = "scorer_job"
Collaborator Author

renamed scorer_job to scorer since endpoint id is linked to a registered scorer instead of each scorer execution job
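
A minimal sketch of what the rename implies, assuming the enum keeps the `str` mixin shown in the diff context above. Only the `SCORER` member is taken from this PR; `parse_resource_type` is a hypothetical helper added for illustration.

```python
from enum import Enum


class GatewayResourceType(str, Enum):
    """Sketch of the renamed enum; only SCORER is taken from this PR."""

    SCORER = "scorer"


def parse_resource_type(raw: str) -> GatewayResourceType:
    """Hypothetical helper: convert a stored string into the enum member."""
    return GatewayResourceType(raw)


# Because of the str mixin, the member compares equal to its raw value.
assert GatewayResourceType.SCORER == "scorer"
assert parse_resource_type("scorer") is GatewayResourceType.SCORER
```

In this sketch the old string `"scorer_job"` is no longer a valid value, so `parse_resource_type("scorer_job")` raises `ValueError`.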

Copilot AI left a comment

Pull request overview

This PR corrects the resource type terminology for gateway-scorer bindings from "scorer_job" to "scorer" to better reflect the actual relationship being tracked. The change updates the enum value, tests, documentation, and UI labels across Python, TypeScript/JavaScript, and Java codebases. Additionally, new integration tests verify that endpoint bindings are correctly created when scorers are registered with gateway endpoints and properly cleaned up when scorers are deleted.

Changes:

  • Updated GatewayResourceType enum from SCORER_JOB = "scorer_job" to SCORER = "scorer"
  • Added endpoint binding creation logic in register_scorer() and cleanup logic in delete_scorer()
  • Updated all tests and UI components to use the new "scorer" resource type terminology
  • Added comprehensive integration tests for endpoint binding lifecycle
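
The binding lifecycle summarized above can be sketched with an in-memory stand-in. This is an illustration only: `InMemoryStore` and `Binding` are hypothetical names, and the real logic lives in `mlflow/store/tracking/sqlalchemy_store.py` with signatures not shown in this thread.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Binding:
    endpoint_id: str
    resource_type: str
    resource_id: str


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the tracking store; not MLflow's API."""

    bindings: set[Binding] = field(default_factory=set)

    def register_scorer(self, scorer_id: str, endpoint_id: str | None) -> None:
        # Only scorers that use a gateway endpoint get a binding.
        if endpoint_id is not None:
            self.bindings.add(Binding(endpoint_id, "scorer", scorer_id))

    def delete_scorer(self, scorer_id: str) -> None:
        # Drop every binding that points at the deleted scorer.
        self.bindings = {
            b
            for b in self.bindings
            if not (b.resource_type == "scorer" and b.resource_id == scorer_id)
        }


store = InMemoryStore()
store.register_scorer("scorer-1", "endpoint-a")  # creates a binding
store.register_scorer("scorer-2", None)          # no endpoint, no binding
store.delete_scorer("scorer-1")                  # binding is cleaned up
```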

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| mlflow/entities/gateway_endpoint.py | Changed enum value from SCORER_JOB to SCORER |
| mlflow/store/tracking/sqlalchemy_store.py | Added binding creation/deletion logic and imported GatewayResourceType |
| mlflow/store/tracking/gateway/*.py | Updated documentation examples from "scorer_job" to "scorer" |
| tests/genai/scorers/test_scorer_CRUD.py | Added new integration tests for endpoint binding creation and deletion |
| tests/tracking/test_rest_tracking.py | Updated all test assertions to use new resource type |
| tests/store/tracking/test_rest_store.py | Updated mock responses and test data to use "scorer" |
| tests/store/tracking/test_gateway_sql_store.py | Updated all binding tests to use "scorer" |
| tests/entities/test_gateway_endpoint.py | Updated binding tests and enum assertions |
| mlflow/server/js/src/gateway/types.ts | Changed ResourceType from 'scorer_job' to 'scorer' |
| mlflow/server/js/src/gateway/components/*.tsx | Updated UI labels from "Scorer Job" to "Scorer" |
| mlflow/server/js/src/gateway/hooks/*.test.tsx | Updated test fixtures to use 'scorer' |
| mlflow/protos/service.proto | Updated documentation comments to use "scorer" example |
| mlflow/java/client/src/main/java/org/mlflow/api/proto/Service.java | Updated generated Java code with new documentation |
| mlflow/store/db_migrations/alembic.ini | Changed sqlalchemy.url from empty to hardcoded SQLite path |

Comment on lines +2183 to +2205
# Verify the endpoint exists in the database before creating binding
if endpoint_id is not None:
    endpoint_exists = (
        session.query(SqlGatewayEndpoint)
        .filter(SqlGatewayEndpoint.endpoint_id == endpoint_id)
        .first()
    )
    if endpoint_exists is not None:
        # Delete any existing binding for this scorer (in case of re-registration)
        session.query(SqlGatewayEndpointBinding).filter(
            SqlGatewayEndpointBinding.resource_type == GatewayResourceType.SCORER.value,
            SqlGatewayEndpointBinding.resource_id == name,
        ).delete()

        binding = SqlGatewayEndpointBinding(
            endpoint_id=endpoint_id,
            resource_type=GatewayResourceType.SCORER.value,
            resource_id=name,
            created_at=get_current_time_millis(),
            last_updated_at=get_current_time_millis(),
        )
        session.add(binding)

Copilot AI Jan 21, 2026

The endpoint existence check on lines 2185-2189 is redundant: the endpoint was already retrieved on line 2134 via get_gateway_endpoint, which would have raised an exception if the endpoint didn't exist. This additional database query is unnecessary and could be removed to improve performance. If the intent is to handle the case where the endpoint was deleted between line 2134 and this point, the session transaction would prevent that scenario.

Suggested change
- # Verify the endpoint exists in the database before creating binding
- if endpoint_id is not None:
-     endpoint_exists = (
-         session.query(SqlGatewayEndpoint)
-         .filter(SqlGatewayEndpoint.endpoint_id == endpoint_id)
-         .first()
-     )
-     if endpoint_exists is not None:
-         # Delete any existing binding for this scorer (in case of re-registration)
-         session.query(SqlGatewayEndpointBinding).filter(
-             SqlGatewayEndpointBinding.resource_type == GatewayResourceType.SCORER.value,
-             SqlGatewayEndpointBinding.resource_id == name,
-         ).delete()
-         binding = SqlGatewayEndpointBinding(
-             endpoint_id=endpoint_id,
-             resource_type=GatewayResourceType.SCORER.value,
-             resource_id=name,
-             created_at=get_current_time_millis(),
-             last_updated_at=get_current_time_millis(),
-         )
-         session.add(binding)
+ if endpoint_id is not None:
+     # Delete any existing binding for this scorer (in case of re-registration)
+     session.query(SqlGatewayEndpointBinding).filter(
+         SqlGatewayEndpointBinding.resource_type == GatewayResourceType.SCORER.value,
+         SqlGatewayEndpointBinding.resource_id == name,
+     ).delete()
+     binding = SqlGatewayEndpointBinding(
+         endpoint_id=endpoint_id,
+         resource_type=GatewayResourceType.SCORER.value,
+         resource_id=name,
+         created_at=get_current_time_millis(),
+         last_updated_at=get_current_time_millis(),
+     )
+     session.add(binding)

github-actions bot commented Jan 21, 2026

Documentation preview for c84b2ab is available at:

More info
  • Ignore this comment if this PR does not change the documentation.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by this workflow run.
  • The documentation was built by this workflow run.

Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
@TomeHirata TomeHirata force-pushed the fix/scorer/gateway-binding branch from 006958f to 25d640d Compare January 21, 2026 06:50
- Use scorer_id instead of scorer name as resource_id for endpoint bindings
  to ensure globally unique identification across experiments
- Add display_name field to endpoint bindings for showing human-readable
  scorer names in the UI (instead of UUIDs)
- Update frontend to display display_name when available, falling back to
  resource_id for backwards compatibility

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
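
The `display_name` fallback described in the commit message above can be sketched as follows. The frontend change itself is TypeScript; this Python version only illustrates the logic, and `EndpointBinding` and `binding_label` are hypothetical names whose fields follow the commit message.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class EndpointBinding:
    """Hypothetical shape; field names follow the commit message."""

    resource_type: str
    resource_id: str  # the scorer_id, unique across experiments
    display_name: str | None = None  # human-readable scorer name, if stored


def binding_label(binding: EndpointBinding) -> str:
    # Prefer the human-readable name. Fall back to resource_id when
    # display_name is missing (or empty), e.g. for bindings created
    # before the field existed.
    return binding.display_name or binding.resource_id


new_binding = EndpointBinding("scorer", "3f2a9c", display_name="relevance")
old_binding = EndpointBinding("scorer", "3f2a9c")
print(binding_label(new_binding))  # prints "relevance"
print(binding_label(old_binding))  # prints "3f2a9c"
```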
Comment on lines +2190 to +2196
if endpoint_exists is not None:
    # Delete any existing binding for this scorer (in case of re-registration)
    # Use scorer_id for globally unique identification across experiments
    session.query(SqlGatewayEndpointBinding).filter(
        SqlGatewayEndpointBinding.resource_type == GatewayResourceType.SCORER.value,
        SqlGatewayEndpointBinding.resource_id == scorer.scorer_id,
    ).delete()
Collaborator

Why do we need this? The scorer can only be registered once, right?

Collaborator Author

@TomeHirata TomeHirata Jan 22, 2026

Scorers can have multiple versions and this is needed to clean up the binding of older versions.
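
A minimal sketch of why the delete-before-insert matters when a scorer is re-registered as a new version. It uses the standard-library `sqlite3` module rather than MLflow's SQLAlchemy models; the table layout and the `rebind_scorer` helper are assumptions made for illustration.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE endpoint_bindings ("
    " endpoint_id TEXT, resource_type TEXT, resource_id TEXT, created_at INTEGER)"
)


def rebind_scorer(conn: sqlite3.Connection, scorer_id: str, endpoint_id: str) -> None:
    """Replace any existing binding for the scorer with one to endpoint_id."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "DELETE FROM endpoint_bindings WHERE resource_type = ? AND resource_id = ?",
            ("scorer", scorer_id),
        )
        conn.execute(
            "INSERT INTO endpoint_bindings VALUES (?, ?, ?, ?)",
            (endpoint_id, "scorer", scorer_id, int(time.time() * 1000)),
        )


# Version 1 uses endpoint-a; version 2 of the same scorer switches to endpoint-b.
rebind_scorer(conn, "scorer-123", "endpoint-a")
rebind_scorer(conn, "scorer-123", "endpoint-b")
rows = conn.execute("SELECT endpoint_id, resource_id FROM endpoint_bindings").fetchall()
# rows == [("endpoint-b", "scorer-123")]
```

Without the `DELETE`, the second registration would leave a stale row that still ties the scorer to `endpoint-a`.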

Collaborator

@serena-ruan serena-ruan left a comment

TomeHirata and others added 6 commits January 22, 2026 18:47
Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
@TomeHirata TomeHirata enabled auto-merge January 22, 2026 10:11
@TomeHirata TomeHirata disabled auto-merge January 23, 2026 00:31
@TomeHirata TomeHirata merged commit eb0c047 into mlflow:master Jan 23, 2026
54 of 56 checks passed
harupy pushed a commit to harupy/mlflow that referenced this pull request Jan 28, 2026
Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
harupy pushed a commit to harupy/mlflow that referenced this pull request Jan 28, 2026
Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
harupy pushed a commit that referenced this pull request Jan 28, 2026
Signed-off-by: Tomu Hirata <tomu.hirata@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Labels

area/uiux Front-end, user experience, plotting, JavaScript, JavaScript dev server rn/bug-fix Mention under Bug Fixes in Changelogs. team-review Trigger a team review request v3.9.0

3 participants