[Bug fix] Traces UI: Support filtering on assessments with multiple values (e.g. error and boolean) #19262

Merged
dbczumar merged 29 commits into mlflow:master from dbczumar:assessment_ui_filter_fix
Dec 14, 2025

Conversation

@dbczumar dbczumar commented Dec 8, 2025

🛠 DevTools 🛠

Install mlflow from this PR

# mlflow
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/19262/merge
# mlflow-skinny
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/19262/merge#subdirectory=libs/skinny

For Databricks, use the following command:

%sh curl -LsSf https://raw.githubusercontent.com/mlflow/mlflow/HEAD/dev/install-skinny.sh | sh -s pull/19262/merge

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

Traces UI: Support filtering on assessments with multiple values (e.g. error and boolean)

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Before

Screen.Recording.2025-12-07.at.4.44.37.PM.mov

After

Screen.Recording.2025-12-07.at.4.43.32.PM.mov

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Fixed a bug that prevented assessments from being used in search filters if they contained errors

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

Yes should be selected for bug fixes, documentation updates, and other small changes. No should be selected for new features and larger changes. If you're unsure about the release classification of this PR, leave this unchecked to let the maintainers decide.

What is a minor/patch release?
  • Minor release: a release that increments the second part of the version number (e.g., 1.2.0 -> 1.3.0).
    Bug fixes, doc updates and new features usually go into minor releases.
  • Patch release: a release that increments the third part of the version number (e.g., 1.2.0 -> 1.2.1).
    Bug fixes and doc updates usually go into patch releases.
  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Copilot AI review requested due to automatic review settings December 8, 2025 00:37
@github-actions github-actions bot added the area/uiux (Front-end, user experience, plotting, JavaScript, JavaScript dev server), v3.7.1, and rn/bug-fix (Mention under Bug Fixes in Changelogs) labels Dec 8, 2025

Copilot AI left a comment

Pull request overview

This PR fixes a bug where assessments containing errors could not be used in search filters. The fix enables backend filtering for assessments that have multiple values (some with errors, some without).

  • Changed assessment dtype determination to iterate through all assessments instead of just checking the first one
  • Disabled client-side filtering for assessments in OSS MLflow, moving filtering to the backend
  • Updated tests to verify backend filtering behavior

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File | Description
AggregationUtils.ts | Modified dtype determination and unique value collection to iterate through all assessments, properly handling cases where some assessments have errors and others have values
useMlflowTraces.tsx | Disabled client-side filtering for assessments in OSS MLflow by hardcoding useClientSideFiltering to false, ensuring all filters including assessments are sent to the backend
useMlflowTraces.test.tsx | Updated tests to verify assessment filters are sent to backend instead of being filtered client-side, and added separate test for search query filtering

Signed-off-by: dbczumar <corey.zumar@databricks.com>
...retrievalAssessmentsByName,
]) {
assessmentNames.add(assessmentName);
const assessment = assessments[0];

Previously, we only looked at the first assessment when computing the available values for filtering. If the first value logged for an assessment was an error and the second value was something else (e.g. true / false) due to a retry, the UI wouldn't allow you to filter by the second value.

(See PR description for before / after)
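
A minimal sketch of the idea described in this comment, not the actual AggregationUtils.ts code. The `Assessment` shape, the `Dtype` names, and `summarizeAssessmentValues` are hypothetical and chosen only for illustration.

```typescript
// Hypothetical shapes for illustration; the real MLflow types differ.
type AssessmentValue = string | number | boolean;

interface Assessment {
  value?: AssessmentValue; // absent when the assessment errored
  errorMessage?: string;
}

type Dtype = 'boolean' | 'numeric' | 'string' | 'unknown';

interface AssessmentSummary {
  dtype: Dtype;
  uniqueValues: Set<AssessmentValue>;
  containsErrors: boolean;
}

function dtypeOf(value: AssessmentValue): Dtype {
  if (typeof value === 'boolean') return 'boolean';
  if (typeof value === 'number') return 'numeric';
  return 'string';
}

// Scan every logged assessment rather than only assessments[0], so an
// errored first attempt does not hide the value produced by a retry.
function summarizeAssessmentValues(assessments: Assessment[]): AssessmentSummary {
  const summary: AssessmentSummary = {
    dtype: 'unknown',
    uniqueValues: new Set<AssessmentValue>(),
    containsErrors: false,
  };
  for (const assessment of assessments) {
    if (assessment.errorMessage !== undefined) {
      summary.containsErrors = true;
      continue;
    }
    if (assessment.value === undefined) continue;
    // The dtype comes from the first non-error value that is found.
    if (summary.dtype === 'unknown') summary.dtype = dtypeOf(assessment.value);
    summary.uniqueValues.add(assessment.value);
  }
  return summary;
}

// First attempt errored, retry produced a boolean.
const retried = summarizeAssessmentValues([{ errorMessage: 'judge timed out' }, { value: true }]);
```

With only the first element inspected, `retried` would report an error and no filterable values; scanning the whole list surfaces both the error flag and the boolean.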

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Comment on lines +274 to +275
// Client-side filtering is always disabled in OSS MLflow. It is only used in Databricks.
const useClientSideFiltering = false;

@dbczumar dbczumar Dec 8, 2025

Since the OSS backend now supports assessment filtering, client-side filtering is no longer needed in OSS.
Databricks still requires client-side filtering, so I kept the useClientSideFiltering logic and set the value to false in OSS (we can set a different value on Databricks in an edge block).
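
A rough sketch of how such a flag could route filters, under assumed names. `TraceFilter`, `SplitFilters`, and `splitFilters` are hypothetical and are not taken from useMlflowTraces.tsx; only the `useClientSideFiltering` flag comes from the diff above.

```typescript
// Hypothetical filter shape; the real hook's types are not shown in this thread.
interface TraceFilter {
  column: string; // e.g. 'assessment' or 'tag'
  key?: string;
  operator: string;
  value: string;
}

interface SplitFilters {
  networkFilters: TraceFilter[]; // sent to the search backend
  clientFilters: TraceFilter[]; // applied in the browser after the fetch
}

function splitFilters(filters: TraceFilter[], useClientSideFiltering: boolean): SplitFilters {
  if (!useClientSideFiltering) {
    // OSS path: every filter, including assessment filters, goes to the backend.
    return { networkFilters: filters.slice(), clientFilters: [] };
  }
  // Path with client-side filtering: assessment filters stay in the browser.
  const isAssessment = (f: TraceFilter): boolean => f.column === 'assessment';
  return {
    networkFilters: filters.filter((f) => !isAssessment(f)),
    clientFilters: filters.filter(isAssessment),
  };
}

const exampleFilters: TraceFilter[] = [
  { column: 'assessment', key: 'correctness', operator: '=', value: 'true' },
  { column: 'tag', key: 'env', operator: '=', value: 'prod' },
];
const ossSplit = splitFilters(exampleFilters, false);
const clientSplit = splitFilters(exampleFilters, true);
```

Keeping the flag as a single boolean means the two code paths stay side by side, which matches the stated intent of overriding the value in a separate build.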

Comment on lines +34 to +35
? runValue.overallAssessments ?? []
: runValue.responseAssessmentsByName[assessmentName] ?? [];

For client-side filtering logic (still applicable on Databricks), we should be looking at all values of an assessment when applying a filter, not just the first.
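
As an illustration only, a client-side predicate that considers every value of an assessment might look like the sketch below. The `Assessment` shape and `matchesAssessmentFilter` are hypothetical, and the `'Error'` sentinel is assumed from the `ERROR_VALUE ('Error')` mention in a later commit message.

```typescript
// Assumed sentinel for an errored assessment (see the later commit message).
const ERROR_VALUE = 'Error';

// Hypothetical assessment shape for illustration.
interface Assessment {
  value?: string | number | boolean;
  errorMessage?: string;
}

// A trace matches when ANY of its logged assessments matches the filter,
// not only the first one in the list.
function matchesAssessmentFilter(
  assessments: Assessment[],
  filterValue: string | number | boolean,
): boolean {
  return assessments.some((assessment) =>
    assessment.errorMessage !== undefined
      ? filterValue === ERROR_VALUE
      : assessment.value === filterValue,
  );
}

// An errored first attempt followed by a successful retry.
const attempts: Assessment[] = [{ errorMessage: 'judge timed out' }, { value: true }];
const matchesTrue = matchesAssessmentFilter(attempts, true);
const matchesError = matchesAssessmentFilter(attempts, ERROR_VALUE);
const matchesFalse = matchesAssessmentFilter(attempts, false);
```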

@dbczumar dbczumar requested a review from daniellok-db December 8, 2025 00:51
@dbczumar dbczumar changed the title from "Traces UI: Support filtering on assessments with multiple values (e.g. error and boolean)" to "[Bug fix] Traces UI: Support filtering on assessments with multiple values (e.g. error and boolean)" Dec 8, 2025
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com>
@github-actions

github-actions bot commented Dec 8, 2025

Documentation preview for a7ad0a4 is available at:

@daniellok-db daniellok-db left a comment

Looks good, but it looks like some integration tests need to be fixed.

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com>
The filter dropdown for assessments now shows the Error option when
assessmentInfo.containsErrors is true, similar to how bar charts
already handle it.

Signed-off-by: dbczumar <corey.zumar@databricks.com>
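
One way the dropdown change in the commit above could be expressed, as a sketch under assumptions. `AssessmentInfoSketch` and `buildFilterOptions` are hypothetical names; only `containsErrors` and the "Error" option come from the commit message.

```typescript
// Assumed label for the error option.
const ERROR_VALUE = 'Error';

// Hypothetical subset of the assessment metadata used by the dropdown.
interface AssessmentInfoSketch {
  uniqueValues: string[];
  containsErrors: boolean;
}

// Build the dropdown options, appending the error option when any
// assessment with this name recorded an error.
function buildFilterOptions(info: AssessmentInfoSketch): string[] {
  const options = info.uniqueValues.slice();
  if (info.containsErrors && options.indexOf(ERROR_VALUE) === -1) {
    options.push(ERROR_VALUE);
  }
  return options;
}

const withErrors = buildFilterOptions({ uniqueValues: ['true', 'false'], containsErrors: true });
const withoutErrors = buildFilterOptions({ uniqueValues: ['true', 'false'], containsErrors: false });
```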
getAssessmentValueLabel now handles error value at the start of the
function, before checking dtype. Previously, for boolean dtype
assessments, ERROR_KEY would fall through to the else branch and
render as "null" instead of "Error".

Uses a local constant to avoid circular dependency with AggregationUtils.

Signed-off-by: dbczumar <corey.zumar@databricks.com>
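
The ordering described in the commit above can be sketched as follows. This is not the real `getAssessmentValueLabel`: the function name, the `Dtype` union, the value of `ERROR_KEY`, and the `'True'` / `'False'` labels are all assumptions made for illustration.

```typescript
// Local constant with an assumed value, mirroring the commit's approach of
// not importing it from another module.
const ERROR_KEY = 'Error';

// Hypothetical subset of dtypes.
type Dtype = 'boolean' | 'string';

function getAssessmentValueLabelSketch(
  dtype: Dtype,
  value: string | boolean | undefined,
): string {
  // Handle the error sentinel first, before any dtype-specific branch.
  if (value === ERROR_KEY) return 'Error';

  if (dtype === 'boolean') {
    if (value === true) return 'True';
    if (value === false) return 'False';
    // Without the early return above, the error sentinel would reach this
    // fallback and render as "null".
    return 'null';
  }
  return value === undefined ? 'null' : String(value);
}

const booleanErrorLabel = getAssessmentValueLabelSketch('boolean', ERROR_KEY);
const booleanTrueLabel = getAssessmentValueLabelSketch('boolean', true);
```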
Tests verify that ERROR_VALUE ('Error') is handled correctly regardless
of the assessment dtype (boolean, pass-fail, etc).

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Signed-off-by: dbczumar <corey.zumar@databricks.com>
@dbczumar dbczumar enabled auto-merge December 14, 2025 00:12
@dbczumar dbczumar added this pull request to the merge queue Dec 14, 2025
Merged via the queue into mlflow:master with commit 44736a5 Dec 14, 2025
50 checks passed
@dbczumar dbczumar deleted the assessment_ui_filter_fix branch December 14, 2025 00:18
Labels

area/uiux Front-end, user experience, plotting, JavaScript, JavaScript dev server rn/bug-fix Mention under Bug Fixes in Changelogs. v3.7.1 v3.8.0

3 participants