Improve scorer trace picker UX and validation #20178

Merged
danielseong1 merged 1 commit into mlflow:master from danielseong1:scorers-bugs
Jan 22, 2026

Conversation

@danielseong1
Collaborator

@danielseong1 danielseong1 commented Jan 21, 2026

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

This PR makes two adjustments to the scorer evaluation UI based on UX feedback:

  1. Remove "Last trace/session" dropdown option - Users must now explicitly select traces/sessions via the modal picker. This simplifies the UI and makes trace selection more intentional.

  2. Disable "Run judge" button for non-gateway models - Direct model endpoints are not supported for running judges from the UI, so the button is now disabled with an explanatory tooltip.

Key changes:

  • Simplified itemsToEvaluate state from { itemCount, itemIds } to just selectedItemIds: string[]
  • Replaced dropdown with a simple button that opens the selection modal
  • Added validation to disable run button when no traces are selected
  • Added validation to disable run button for direct (non-gateway) models
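The two validation rules and the simplified state could be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: `ScorerRunState`, `modelSource`, `getRunJudgeDisabledReason`, `isRunJudgeDisabled`, and the tooltip strings are all hypothetical names, and how the UI actually tells gateway models apart from direct ones is an assumption.

```typescript
// Hypothetical sketch; names and messages are NOT taken from the MLflow codebase.
// The old state shape was { itemCount, itemIds }; the PR keeps only the selected IDs.
interface ScorerRunState {
  // IDs of the traces/sessions the user picked in the selection modal.
  selectedItemIds: string[];
  // Assumption: some discriminator separates gateway endpoints from direct ones.
  modelSource: 'gateway' | 'direct';
}

// Returns the tooltip text explaining why the run button is disabled,
// or null when the button can be enabled.
function getRunJudgeDisabledReason(state: ScorerRunState): string | null {
  if (state.modelSource !== 'gateway') {
    return 'Running judges from the UI is only supported for gateway models.';
  }
  if (state.selectedItemIds.length === 0) {
    return 'Select at least one trace or session to evaluate.';
  }
  return null;
}

// Convenience wrapper for a button's `disabled` prop.
const isRunJudgeDisabled = (state: ScorerRunState): boolean =>
  getRunJudgeDisabledReason(state) !== null;
```

Returning the reason as a string (rather than a bare boolean) lets the same check drive both the button's disabled state and its tooltip, which matches the "disabled with an explanatory tooltip" behavior described above.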

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Updated existing unit tests to reflect the simplified state shape. Manually verified the UI changes work correctly.

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

🤖 Generated with Claude Code

@github-actions
Contributor

🛠 DevTools 🛠

Install mlflow from this PR

# mlflow
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/20178/merge
# mlflow-skinny
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/20178/merge#subdirectory=libs/skinny

For Databricks, use the following command:

%sh curl -LsSf https://raw.githubusercontent.com/mlflow/mlflow/HEAD/dev/install-skinny.sh | sh -s pull/20178/merge

github-actions bot added the area/tracing, area/uiux, and rn/feature labels on Jan 21, 2026
@github-actions
Contributor

github-actions bot commented Jan 21, 2026

Documentation preview for 31bdb7e is available at:

More info
  • Ignore this comment if this PR does not change the documentation.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by this workflow run.
  • The documentation was built by this workflow run.

github-actions bot added and removed the area/tracing label on Jan 21, 2026
This PR makes two improvements to the scorer evaluation UI:

1. Remove "Last trace/session" dropdown option - Users must now explicitly
   select traces/sessions via the modal picker. This simplifies the UI and
   makes trace selection more intentional.

2. Disable "Run judge" button for non-gateway models - Direct model endpoints
   are not supported for running judges from the UI, so the button is now
   disabled with an explanatory tooltip.

Key changes:
- Simplified `itemsToEvaluate` state from `{ itemCount, itemIds }` to just `selectedItemIds: string[]`
- Replaced dropdown with a simple button that opens the selection modal
- Added validation to disable run button when no traces are selected
- Added validation to disable run button for direct (non-gateway) models

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Daniel Seong <daniel.leem.seong@gmail.com>
github-actions bot added the rn/bug-fix label and removed the rn/feature and area/tracing labels on Jan 21, 2026
Collaborator

@smoorjani smoorjani left a comment

LGTM, thanks for the fixes!

@danielseong1 danielseong1 added this pull request to the merge queue Jan 22, 2026
Merged via the queue into mlflow:master with commit 4737370 Jan 22, 2026
55 of 57 checks passed
@danielseong1 danielseong1 deleted the scorers-bugs branch January 22, 2026 01:00
harupy pushed a commit to harupy/mlflow that referenced this pull request Jan 28, 2026
Signed-off-by: Daniel Seong <daniel.leem.seong@gmail.com>
Co-authored-by: Daniel Seong <daniel.leem.seong@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
harupy pushed a commit that referenced this pull request Jan 28, 2026
Signed-off-by: Daniel Seong <daniel.leem.seong@gmail.com>
Co-authored-by: Daniel Seong <daniel.leem.seong@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>

Labels

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • rn/bug-fix: Mention under Bug Fixes in Changelogs.
  • v3.9.0

2 participants