Fix SQLAlchemy alias conflict in _search_runs for dataset filters #19498

Merged
harupy merged 3 commits into mlflow:master from fredericosantos:ui-fix-dataset-filtering-after-manual-filters on Dec 19, 2025

fredericosantos (Contributor) commented Dec 18, 2025

🛠 DevTools 🛠

Install mlflow from this PR

# mlflow
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/19498/merge
# mlflow-skinny
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/19498/merge#subdirectory=libs/skinny

For Databricks, use the following command:

%sh curl -LsSf https://raw.githubusercontent.com/mlflow/mlflow/HEAD/dev/install-skinny.sh | sh -s pull/19498/merge

Replaced hardcoded anon_n aliases with direct column references from the subquery. This prevents sqlite3.OperationalError when multiple filters are applied.

To reproduce the problem (prior to fix):

  1. Filter runs manually
  2. Apply dataset filter
  3. Check logs

I've been using this fix locally for a while now, but I'm quite scared to contribute. This is my first ever public PR, so apologies if anything isn't done the correct way.

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

This PR fixes a database error in the SQLAlchemy store where combined filters (attributes/params + datasets) caused invalid SQL generation. Specifically, the code was manually constructing join conditions using string aliases like f"anon_{idx+1}". Since SQLAlchemy also uses an internal counter for anonymous aliases, these manual names frequently collided or referenced the wrong subquery when the total number of subqueries exceeded the number of dataset filters.

The fix replaces these manual strings with dataset_filter.c.destination_id, which allows SQLAlchemy to correctly track and resolve aliases regardless of join order or complexity.
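The failure mode described above can be sketched without MLflow or SQLAlchemy at all. The snippet below is only an illustration under assumptions: the three-table schema, the `build_query` and `run` helpers, and the hand-written SQL are hypothetical stand-ins for what the store generates, not the actual MLflow code. It shows that when another filter's subquery has already taken `anon_1`, a join condition hardcoded as `anon_{idx+1}` points at a subquery that has no `destination_id` column, and SQLite rejects the statement.

```python
import sqlite3

# Hypothetical, heavily simplified schema (not the real MLflow tables).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE runs (run_uuid TEXT PRIMARY KEY);
    CREATE TABLE params (run_uuid TEXT, key TEXT, value TEXT);
    CREATE TABLE inputs (destination_id TEXT, dataset_name TEXT);
    INSERT INTO runs VALUES ('r1'), ('r2');
    INSERT INTO params VALUES ('r1', 'lr', '0.1'), ('r2', 'lr', '0.2');
    INSERT INTO inputs VALUES ('r1', 'train'), ('r2', 'eval');
""")

PARAM_SUBQUERY = "SELECT run_uuid FROM params WHERE key = 'lr' AND value = '0.1'"
DATASET_SUBQUERY = "SELECT destination_id FROM inputs WHERE dataset_name = 'train'"


def build_query(dataset_alias_in_join):
    # The param filter is joined first, so the alias counter gives it anon_1
    # and the dataset filter ends up as anon_2.
    return (
        "SELECT runs.run_uuid FROM runs "
        f"JOIN ({PARAM_SUBQUERY}) AS anon_1 ON runs.run_uuid = anon_1.run_uuid "
        f"JOIN ({DATASET_SUBQUERY}) AS anon_2 "
        f"ON runs.run_uuid = {dataset_alias_in_join}.destination_id"
    )


def run(sql):
    # Return the matching run ids, or the error text if SQLite rejects the SQL.
    try:
        return [row[0] for row in conn.execute(sql)]
    except sqlite3.OperationalError as exc:
        return f"OperationalError: {exc}"


# Hardcoded guess: "the first dataset filter is always anon_1".
idx = 0
broken = run(build_query(f"anon_{idx + 1}"))

# Referencing the alias the dataset subquery actually received works.
fixed = run(build_query("anon_2"))

print(broken)
print(fixed)  # ['r1']
```

With only a dataset filter present, the hardcoded guess happens to match the generated alias, which is presumably why the bug only appears once another filter is applied first.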

How is this PR tested?

  • Manual tests

Does this PR require documentation update?

  • No. You can skip the rest of this section.

Release Notes

Is this a user-facing change?

  • Yes. Fixes a bug where searching runs with combined filters (e.g., parameters and datasets) would crash the tracking service when using SQL-backed stores.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging

How should the PR be classified in the release notes? Choose one:

  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes

Should this PR be included in the next patch release?

  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)


Signed-off-by: Frederico Santos <1119791+fredericosantos@users.noreply.github.com>
Copilot AI review requested due to automatic review settings December 18, 2025 21:51
@github-actions github-actions bot added area/tracking Tracking service, tracking client APIs, autologging rn/bug-fix Mention under Bug Fixes in Changelogs. v3.8.0 labels Dec 18, 2025
Copilot AI (Contributor) left a comment

Pull request overview

This PR fixes a critical bug in the SQLAlchemy tracking store where combining dataset filters with other filter types (parameters, metrics, tags) would cause database errors due to alias collisions. The fix replaces hardcoded string-based SQL alias references with SQLAlchemy's proper column reference syntax, allowing the ORM to correctly track and resolve subquery aliases.

Key Changes:

  • Removed manual enumeration and hardcoded anon_{idx+1} alias construction
  • Replaced text(f"runs.run_uuid = {anon_table_name}.destination_id") with proper SQLAlchemy expression SqlRun.run_uuid == dataset_filter.c.destination_id
  • Aligned dataset filter join logic with the established pattern used elsewhere in the codebase (e.g., search_traces)
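The column-reference pattern named in the key changes can be sketched with SQLAlchemy Core. This is a minimal illustration under assumptions, not the code from the PR: the real store joins on the ORM attribute `SqlRun.run_uuid`, whereas the tables, column names, and sample data here are hypothetical. The point is that `dataset_filter.c.destination_id` is bound to the subquery object itself, so it resolves to whatever anonymous alias SQLAlchemy assigns at compile time, regardless of how many other subqueries were joined before it.

```python
from sqlalchemy import Column, MetaData, String, Table, create_engine, select

# Hypothetical, simplified tables standing in for the tracking schema.
metadata = MetaData()
runs = Table("runs", metadata, Column("run_uuid", String, primary_key=True))
params = Table(
    "params",
    metadata,
    Column("run_uuid", String),
    Column("key", String),
    Column("value", String),
)
inputs = Table(
    "inputs",
    metadata,
    Column("destination_id", String),
    Column("dataset_name", String),
)

# Two anonymous subqueries; SQLAlchemy names them itself when compiling.
param_filter = (
    select(params.c.run_uuid)
    .where(params.c.key == "lr", params.c.value == "0.1")
    .subquery()
)
dataset_filter = (
    select(inputs.c.destination_id)
    .where(inputs.c.dataset_name == "train")
    .subquery()
)

# Join conditions use column objects, never alias names written by hand.
stmt = (
    select(runs.c.run_uuid)
    .join(param_filter, runs.c.run_uuid == param_filter.c.run_uuid)
    .join(dataset_filter, runs.c.run_uuid == dataset_filter.c.destination_id)
)

engine = create_engine("sqlite://")  # in-memory SQLite database
with engine.begin() as conn:
    metadata.create_all(conn)
    conn.execute(runs.insert(), [{"run_uuid": "r1"}, {"run_uuid": "r2"}])
    conn.execute(
        params.insert(),
        [
            {"run_uuid": "r1", "key": "lr", "value": "0.1"},
            {"run_uuid": "r2", "key": "lr", "value": "0.2"},
        ],
    )
    conn.execute(
        inputs.insert(),
        [
            {"destination_id": "r1", "dataset_name": "train"},
            {"destination_id": "r2", "dataset_name": "eval"},
        ],
    )
    matching = [row[0] for row in conn.execute(stmt)]

print(stmt)      # rendered SQL, with aliases chosen by SQLAlchemy
print(matching)  # ['r1']
```

Swapping the order of the two `.join()` calls changes which alias each subquery receives in the rendered SQL, but the query stays valid because neither join condition names an alias directly.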

harupy (Member) commented Dec 18, 2025

@fredericosantos Thanks for the PR! Do you have a minimum example that can reproduce the error?

github-actions bot (Contributor) commented Dec 18, 2025

Documentation preview for 6187939 is available at:

More info
  • Ignore this comment if this PR does not change the documentation.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by this workflow run.
  • The documentation was built by this workflow run.

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
@harupy harupy added the team-review Trigger a team review request label Dec 19, 2025
@harupy harupy enabled auto-merge December 19, 2025 02:48
@harupy harupy added this pull request to the merge queue Dec 19, 2025
Merged via the queue into mlflow:master with commit 6bd4d3f Dec 19, 2025
55 of 58 checks passed
WeichenXu123 pushed a commit to WeichenXu123/mlflow that referenced this pull request Dec 19, 2025
…lflow#19498)

Signed-off-by: Frederico Santos <1119791+fredericosantos@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
@fredericosantos fredericosantos deleted the ui-fix-dataset-filtering-after-manual-filters branch December 19, 2025 09:39
WeichenXu123 pushed a commit that referenced this pull request Dec 19, 2025
…19498)

Signed-off-by: Frederico Santos <1119791+fredericosantos@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>

Labels

area/tracking Tracking service, tracking client APIs, autologging rn/bug-fix Mention under Bug Fixes in Changelogs. team-review Trigger a team review request v3.8.0
