Skip to content

Fix download_artifacts ignoring tracking_uri parameter#16461

Merged
harupy merged 9 commits intomlflow:masterfrom
harupy:fix-download_artifacts-ignores-tracking-URI
Jun 30, 2025
Merged

Fix download_artifacts ignoring tracking_uri parameter#16461
harupy merged 9 commits intomlflow:masterfrom
harupy:fix-download_artifacts-ignores-tracking-URI

Conversation

@harupy
Copy link
Member

@harupy harupy commented Jun 26, 2025

🛠 DevTools 🛠

Open in GitHub Codespaces

Install mlflow from this PR

# mlflow
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/16461/merge
# mlflow-skinny
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/16461/merge#subdirectory=skinny

For Databricks, use the following command:

%sh curl -LsSf https://raw.githubusercontent.com/mlflow/mlflow/HEAD/dev/install-skinny.sh | sh -s pull/16461/merge

Related Issues/PRs

Resolve #16412
Close #16423

What changes are proposed in this pull request?

Fix a regression in download_artifacts where the tracking_uri parameter was being ignored when both run_id and artifact_path were specified. This caused the function to look for artifacts in the wrong location (typically the current working directory's mlruns folder) instead of using the provided tracking URI.

The fix ensures that the tracking_uri parameter is properly passed through to all relevant artifact repository and download functions:

  • _download_artifact_from_uri() now accepts and uses the tracking_uri parameter
  • get_artifact_repository() calls now pass the tracking_uri parameter
  • RunsArtifactRepository.get_underlying_uri() now accepts and properly uses the tracking URI
  • Fixed client method calls to use self.download_artifacts() instead of mlflow.artifacts.download_artifacts()

How is this PR tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests

The issue included a reproduction case that now works correctly with these changes. The fix ensures that both scenarios in the reproduction case work:

  1. download_artifacts(run_id=run_id, artifact_path=None, tracking_uri=tracking_uri) - already worked
  2. download_artifacts(run_id=run_id, artifact_path="my_subdir", tracking_uri=tracking_uri) - now works with the fix

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

Fix download_artifacts function to properly respect the tracking_uri parameter when both run_id and artifact_path are specified. This resolves a regression introduced in v3.1.0 where the function would ignore the tracking URI and look for artifacts in the default location.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrations
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)

When both run_id and artifact_path are specified, download_artifacts
was ignoring the tracking_uri parameter and looking for artifacts
in the wrong location. This fix ensures the tracking_uri is properly
passed through to all artifact repository functions.

Fixes mlflow#16412

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 26, 2025 09:46
@github-actions github-actions bot added area/artifacts Artifact stores and artifact logging area/tracking Tracking service, tracking client APIs, autologging rn/bug-fix Mention under Bug Fixes in Changelogs. labels Jun 26, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Jun 26, 2025

Documentation preview for c5a27c1 will be available when this CircleCI job
completes successfully. You may encounter a {"message":"not found"} error when reloading
a page. If so, add /index.html to the URL.

More info

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a regression in download_artifacts where the tracking_uri argument was not forwarded when both run_id and artifact_path were provided. It ensures tracking_uri is passed through all artifact repository lookups and download paths.

  • Propagate tracking_uri into _download_artifact_from_uri, download_artifacts, and list_artifacts calls.
  • Switch internal client calls to use self.download_artifacts for tracking URI consistency.
  • Extend RunsArtifactRepository.get_underlying_uri to accept and prioritize tracking_uri.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tests/tracking/test_rest_tracking.py Removed redundant mlflow.set_tracking_uri in test fixture
mlflow/tracking/client.py Replaced global artifact download calls with self.download_artifacts
mlflow/tracking/artifact_utils.py Added tracking_uri parameter and forwarded it to repository lookup
mlflow/store/artifact/runs_artifact_repo.py Updated get_underlying_uri signature; adjusted constructor to use it and fixed client instantiation
mlflow/artifacts/init.py Forwarded tracking_uri in high-level download_artifacts and list_artifacts

Ensure that the tracking_uri is passed to get_artifact_repository
when initializing the repository in RunsArtifactRepository.__init__.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
@harupy harupy changed the title Fix download_artifacts ignoring tracking_uri parameter Fix download_artifacts ignoring tracking_uri parameter Jun 26, 2025
]

with _init_server(backend_uri, root_artifact_uri=tmp_path.as_uri()) as url:
mlflow.set_tracking_uri(backend_uri)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To address #16423 (review)

@harupy harupy requested a review from TomeHirata June 26, 2025 09:59
@harupy harupy changed the title Fix download_artifacts ignoring tracking_uri parameter Fix download_artifacts ignoring tracking_uri parameter Jun 26, 2025
harupy and others added 3 commits June 26, 2025 19:21
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
- Update mock function in test_cli.py to accept tracking_uri parameter
- Fix test assertions in test_runs_artifact_repo.py to use keyword arguments
- Ensure RunsArtifactRepository preserves profile info from runs URI

This fixes the test failures that occurred after adding tracking_uri
parameter support to download_artifacts and related functions.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
@harupy
Copy link
Member Author

harupy commented Jun 27, 2025

tests/tracing/test_fluent.py::test_mlflow_trace_isolated_from_other_otel_processors failure is unrelated.

@harupy harupy added the team-review Trigger a team review request label Jun 27, 2025
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
uri = get_artifact_uri(
run_id=run_id,
artifact_path=artifact_path,
tracking_uri=tracking_uri_from_run or tracking_uri,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if both tracking_uri_from_run and tracking_uri exist? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. To be conservative, I chose to use the return value of get_databricks_profile_uri_from_artifact_uri to avoid behavior changes.

Copy link
Collaborator

@serena-ruan serena-ruan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM!

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
@rdotsch
Copy link

rdotsch commented Jun 27, 2025

list_artifacts has the identical issue. Will the changes in this PR fix that as well, or should I file that as a separate issue?

@harupy
Copy link
Member Author

harupy commented Jun 27, 2025

@rdotsch This PR should also fix list_artifacts. Can you verify?

# mlflow
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/16461/merge
# mlflow-skinny
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/16461/merge#subdirectory=skinny

@harupy
Copy link
Member Author

harupy commented Jun 27, 2025

@rdotsch Can you file a separate issue and link this PR?

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Comment on lines +313 to +336
def test_list_artifacts_with_client_and_tracking_uri(tmp_path: pathlib.Path):
tracking_uri = f"sqlite:///{tmp_path}/mlflow-{uuid.uuid4().hex}.db"
assert mlflow.get_tracking_uri() != tracking_uri
client = mlflow.MlflowClient(tracking_uri)
experiment_id = client.create_experiment("my_experiment")
run = client.create_run(experiment_id)
tmp_dir = tmp_path / "subdir"
tmp_dir.mkdir()
tmp_file = tmp_dir / "file.txt"
tmp_file.touch()
client.log_artifacts(run.info.run_id, tmp_dir, "subdir")

artifacts = mlflow.artifacts.list_artifacts(
run_id=run.info.run_id,
tracking_uri=tracking_uri,
)
assert [p.path for p in artifacts] == ["subdir"]

artifacts = mlflow.artifacts.list_artifacts(
run_id=run.info.run_id,
artifact_path="subdir",
tracking_uri=tracking_uri,
)
assert [p.path for p in artifacts] == ["subdir/file.txt"]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdotsch Added a test for list_artifacts (which fails on master but doesn't on this branch).

Copy link

@rdotsch rdotsch Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much @harupy -- I verified in our setup and indeed this test passes. However, when I use our remote tracking_uri over https instead of a local sqlite database, this part without the artifact_path parameter passes:

   artifacts = mlflow.artifacts.list_artifacts(
        run_id=run.info.run_id,
        tracking_uri=tracking_uri,
    )
    assert [p.path for p in artifacts] == ["subdir"]

But the last assertion using the artifact_path parameter does not pass:

artifacts = mlflow.artifacts.list_artifacts(
        run_id=run.info.run_id,
        artifact_path="subdir",
        tracking_uri=tracking_uri,
    )
    assert [p.path for p in artifacts] == ["subdir/file.txt"]

It throws an exception at list_artifacts:

    raise MlflowException(
mlflow.exceptions.MlflowException: API request to endpoint /api/2.0/mlflow/logged-models/search failed with error code 404 != 200. Response body: '<!doctype html>
<html lang=en>
<title>404 Not Found</title>
<h1>Not Found</h1>
<p>The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.</p>
'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rdotsch I think your tracking server uses mlflow 2.x. /api/2.0/mlflow/logged-models/search is a new endpoint added in mlflow 3.x.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had already updated the tracking server to 3.1.1 this morning.
Screenshot 2025-06-27 at 17 42 38

I'll look into whether I can find something wrong with our setup on Monday. Otherwise I'll file a new issue referencing this PR.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, wouldn’t the first assertion fail as well if that were the case? Or does it use a different endpoint when not supplying an artifact path?

Copy link
Member Author

@harupy harupy Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first case doesn't call search models. Only the second case does.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thank you.

Copy link
Member Author

@harupy harupy Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you call mlflow.search_logged_models? Does it fail with 404?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll merge this PR to unblock my other work.

@harupy harupy enabled auto-merge June 30, 2025 02:35
@harupy harupy added this pull request to the merge queue Jun 30, 2025
Merged via the queue into mlflow:master with commit 4d66c30 Jun 30, 2025
47 checks passed
@harupy harupy deleted the fix-download_artifacts-ignores-tracking-URI branch June 30, 2025 02:37
@guybartal
Copy link

When is the next release which include this fix is planned to?

@harupy harupy added the v3.1.2 label Jul 2, 2025
B-Step62 pushed a commit to B-Step62/mlflow that referenced this pull request Jul 8, 2025
)

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
B-Step62 pushed a commit that referenced this pull request Jul 8, 2025
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/artifacts Artifact stores and artifact logging area/tracking Tracking service, tracking client APIs, autologging rn/bug-fix Mention under Bug Fixes in Changelogs. team-review Trigger a team review request v3.1.2

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] download_artifacts ignores tracking URI

5 participants