Fix MultiIndex column search crash in dataset schema table#19461

Merged
harupy merged 7 commits into master from copilot/fix-search-bug-multiindex
Dec 18, 2025


Copilot AI commented Dec 17, 2025

Related Issues/PRs

Fixes #19460

What changes are proposed in this pull request?

Searching in the dataset schema table threw TypeError: s.toLowerCase is not a function when the dataset had pandas MultiIndex columns. MultiIndex column names are stored as arrays (e.g., ["foo", "a"]) rather than strings, causing the search filter to call toLowerCase() on arrays.
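The failure mode can be reproduced outside the UI. The sketch below is an illustration only: `ColumnName`, `naiveMatches`, and `crashes` are hypothetical names, not the component's actual code, and it assumes the pre-fix filter treated every column name as a string.

```typescript
// Hypothetical type: a schema column name is a string for regular columns
// and an array of level names for pandas MultiIndex columns.
type ColumnName = string | string[];

// Assumed shape of the pre-fix logic: treat every name as a string.
// The cast mimics untyped schema JSON reaching the filter at runtime.
function naiveMatches(name: ColumnName, filter: string): boolean {
  return (name as string).toLowerCase().indexOf(filter.toLowerCase()) !== -1;
}

// Returns true if the naive filter throws a TypeError for the given name.
function crashes(name: ColumnName, filter: string): boolean {
  try {
    naiveMatches(name, filter);
    return false;
  } catch (e) {
    return e instanceof TypeError;
  }
}

// A plain string name is filtered without error.
const regularCrashes = crashes("price", "pri");
// An array name has no toLowerCase method, so the call throws a TypeError.
const multiIndexCrashes = crashes(["foo", "a"], "foo");
```

Arrays do not define `toLowerCase`, so the call fails at runtime even though the cast satisfies the compiler.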

Changes:

  • Updated hasFilter function to handle both string and array column names
  • Convert array names to dot-separated strings before search (e.g., ["foo", "a"] → "foo.a")
  • Display array names as dot-separated strings in the table
  • Use inline type definition for schema prop instead of exported interface
  • Added 8 comprehensive unit tests covering regular and MultiIndex column scenarios
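
The approach listed above might look roughly like the following. This is a minimal sketch under stated assumptions, not the merged implementation: `Column`, `toDisplayName`, and `matchesFilter` are hypothetical names, and it assumes the filter checks both the column name and the column type, as the test description suggests.

```typescript
// Hypothetical types for illustration; the PR itself uses an inline type
// on the component's schema prop.
type ColumnName = string | string[];

interface Column {
  name: ColumnName;
  type: string;
}

// Join MultiIndex levels with "." so ["foo", "a"] becomes "foo.a".
// Plain string names are returned unchanged.
function toDisplayName(name: ColumnName): string {
  return Array.isArray(name) ? name.join(".") : name;
}

// Case-insensitive match against the normalized name or the type.
function matchesFilter(column: Column, filter: string): boolean {
  const needle = filter.toLowerCase();
  return (
    toDisplayName(column.name).toLowerCase().indexOf(needle) !== -1 ||
    column.type.toLowerCase().indexOf(needle) !== -1
  );
}

// Sample schema mixing MultiIndex and regular columns (made-up data).
const columns: Column[] = [
  { name: ["foo", "a"], type: "long" },
  { name: ["foo", "b"], type: "long" },
  { name: "price", type: "double" },
];

// Names that remain visible after searching for "FOO".
const visible = columns
  .filter((c) => matchesFilter(c, "FOO"))
  .map((c) => toDisplayName(c.name));
```

Normalizing the name once and reusing the result for both display and search keeps the two in agreement, so a user can search for exactly what the table shows.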

Example:

import pandas as pd
from mlflow.data.pandas_dataset import from_pandas

df = pd.DataFrame(dict(a=range(10), b=range(20, 30)))
df.columns = pd.MultiIndex.from_tuples([("foo", "a"), ("foo", "b")])
ds = from_pandas(df, name="test")
# Searching for "foo" now works correctly in the UI

Before (crashed):

[Screen recording: Screen.Recording.2025-12-18.at.17.10.57.mov]

After (works):

Searching for "foo" filters to foo.a and foo.b:
[Screenshot: Search foo]

Searching for "bar" filters to bar.c and bar.d:
[Screenshot: Search bar]

Regular datasets continue to work as expected:
[Screenshot: Regular search]

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

Tests added:

  • 8 unit tests for ExperimentViewDatasetSchemaTable component
  • Tests cover regular columns, MultiIndex columns, filtering by name and type
  • All tests passing ✅

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

Fixed crash when searching dataset schema with pandas MultiIndex columns. Search now correctly handles hierarchical column names.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

@harupy harupy marked this pull request as ready for review December 17, 2025 13:28
Copilot AI review requested due to automatic review settings December 17, 2025 13:28

Copilot AI left a comment

Copilot wasn't able to review any files in this pull request.


- Handle both string and array column names in the search filter
- Join array names with '.' for display and searching
- Tested with MultiIndex pandas datasets

Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
- Test rendering of regular and MultiIndex column names
- Test filtering by first and second levels of MultiIndex
- Test filtering by column type
- Test case-insensitive filtering
- Test "No results" message
All 8 tests passing

Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
- Remove unused userEvent import from test file
- Add SchemaColumn interface for better type safety
- Replace any[] type with SchemaColumn[] in component props

Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix searching in pandas dataset with MultiIndex column" to "Fix MultiIndex column search crash in dataset schema table" Dec 17, 2025
Copilot AI requested a review from harupy December 17, 2025 13:47

harupy commented Dec 17, 2025

/autoformat

Signed-off-by: mlflow-app[bot] <mlflow-app[bot]@users.noreply.github.com>

github-actions bot commented Dec 17, 2025

Documentation preview for dabc0c4 is available at:

More info
  • Ignore this comment if this PR does not change the documentation.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by this workflow run.
  • The documentation was built by this workflow run.

@harupy harupy added the rn/bug-fix Mention under Bug Fixes in Changelogs. label Dec 17, 2025
@harupy harupy requested a review from joelrobin18 December 17, 2025 13:57
@harupy harupy requested a review from hubertzub-db December 17, 2025 15:09
@github-actions github-actions bot added the area/uiux Front-end, user experience, plotting, JavaScript, JavaScript dev server label Dec 18, 2025

@hubertzub-db hubertzub-db left a comment

lg with a question

- Replace SchemaColumn interface with inline type in ExperimentViewDatasetSchemaTableProps
- Remove unnecessary type annotations from filter and map callbacks
- Interface was only used internally, no need to export

Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
Copilot AI requested a review from harupy December 18, 2025 08:47

harupy commented Dec 18, 2025

@copilot update the pr description


Copilot AI commented Dec 18, 2025

> @copilot update the pr description

Updated the PR description to accurately reflect the final implementation, including:

  • Removed mention of SchemaColumn interface (replaced with inline type)
  • Added clearer description of the fix and changes
  • Updated screenshots section with better captions
  • Added test coverage details

@harupy harupy added this pull request to the merge queue Dec 18, 2025
Merged via the queue into master with commit 3649e6e Dec 18, 2025
59 checks passed
@harupy harupy deleted the copilot/fix-search-bug-multiindex branch December 18, 2025 09:35
WeichenXu123 pushed a commit to WeichenXu123/mlflow that referenced this pull request Dec 19, 2025

Signed-off-by: mlflow-app[bot] <mlflow-app[bot]@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Harutaka Kawamura <hkawamura0130@gmail.com>
Co-authored-by: mlflow-app[bot] <mlflow-app[bot]@users.noreply.github.com>
WeichenXu123 pushed a commit that referenced this pull request Dec 19, 2025
Signed-off-by: mlflow-app[bot] <mlflow-app[bot]@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Harutaka Kawamura <hkawamura0130@gmail.com>
Co-authored-by: mlflow-app[bot] <mlflow-app[bot]@users.noreply.github.com>
Labels

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • rn/bug-fix: Mention under Bug Fixes in Changelogs.
  • v3.8.0

Development

Successfully merging this pull request may close these issues.

[BUG] Searching in the "Search field" bar of a pandas dataset with a MultiIndex column throws an error

5 participants