Fix download_artifacts ignoring tracking_uri parameter#16461
Fix download_artifacts ignoring tracking_uri parameter#16461harupy merged 9 commits intomlflow:masterfrom
download_artifacts ignoring tracking_uri parameter#16461Conversation
When both run_id and artifact_path are specified, download_artifacts was ignoring the tracking_uri parameter and looking for artifacts in the wrong location. This fix ensures the tracking_uri is properly passed through to all artifact repository functions. Fixes mlflow#16412 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
|
Documentation preview for c5a27c1 will be available when this CircleCI job
More info
|
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a regression in download_artifacts where the tracking_uri argument was not forwarded when both run_id and artifact_path were provided. It ensures tracking_uri is passed through all artifact repository lookups and download paths.
- Propagate
tracking_uriinto_download_artifact_from_uri,download_artifacts, andlist_artifactscalls. - Switch internal client calls to use
self.download_artifactsfor tracking URI consistency. - Extend
RunsArtifactRepository.get_underlying_urito accept and prioritizetracking_uri.
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/tracking/test_rest_tracking.py | Removed redundant mlflow.set_tracking_uri in test fixture |
| mlflow/tracking/client.py | Replaced global artifact download calls with self.download_artifacts |
| mlflow/tracking/artifact_utils.py | Added tracking_uri parameter and forwarded it to repository lookup |
| mlflow/store/artifact/runs_artifact_repo.py | Updated get_underlying_uri signature; adjusted constructor to use it and fixed client instantiation |
| mlflow/artifacts/init.py | Forwarded tracking_uri in high-level download_artifacts and list_artifacts |
Ensure that the tracking_uri is passed to get_artifact_repository when initializing the repository in RunsArtifactRepository.__init__. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
download_artifacts ignoring tracking_uri parameter
| ] | ||
|
|
||
| with _init_server(backend_uri, root_artifact_uri=tmp_path.as_uri()) as url: | ||
| mlflow.set_tracking_uri(backend_uri) |
download_artifacts ignoring tracking_uri parameterdownload_artifacts ignoring tracking_uri parameter
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
- Update mock function in test_cli.py to accept tracking_uri parameter - Fix test assertions in test_runs_artifact_repo.py to use keyword arguments - Ensure RunsArtifactRepository preserves profile info from runs URI This fixes the test failures that occurred after adding tracking_uri parameter support to download_artifacts and related functions. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
|
|
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
| uri = get_artifact_uri( | ||
| run_id=run_id, | ||
| artifact_path=artifact_path, | ||
| tracking_uri=tracking_uri_from_run or tracking_uri, |
There was a problem hiding this comment.
What if both tracking_uri_from_run and tracking_uri exist? 🤔
There was a problem hiding this comment.
Good point. To be conservative, I chose to use the return value of get_databricks_profile_uri_from_artifact_uri to avoid behavior changes.
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
|
|
|
@rdotsch This PR should also fix |
|
@rdotsch Can you file a separate issue and link this PR? |
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
| def test_list_artifacts_with_client_and_tracking_uri(tmp_path: pathlib.Path): | ||
| tracking_uri = f"sqlite:///{tmp_path}/mlflow-{uuid.uuid4().hex}.db" | ||
| assert mlflow.get_tracking_uri() != tracking_uri | ||
| client = mlflow.MlflowClient(tracking_uri) | ||
| experiment_id = client.create_experiment("my_experiment") | ||
| run = client.create_run(experiment_id) | ||
| tmp_dir = tmp_path / "subdir" | ||
| tmp_dir.mkdir() | ||
| tmp_file = tmp_dir / "file.txt" | ||
| tmp_file.touch() | ||
| client.log_artifacts(run.info.run_id, tmp_dir, "subdir") | ||
|
|
||
| artifacts = mlflow.artifacts.list_artifacts( | ||
| run_id=run.info.run_id, | ||
| tracking_uri=tracking_uri, | ||
| ) | ||
| assert [p.path for p in artifacts] == ["subdir"] | ||
|
|
||
| artifacts = mlflow.artifacts.list_artifacts( | ||
| run_id=run.info.run_id, | ||
| artifact_path="subdir", | ||
| tracking_uri=tracking_uri, | ||
| ) | ||
| assert [p.path for p in artifacts] == ["subdir/file.txt"] |
There was a problem hiding this comment.
@rdotsch Added a test for list_artifacts (which fails on master but doesn't on this branch).
There was a problem hiding this comment.
Thank you so much @harupy -- I verified in our setup and indeed this test passes. However, when I use our remote tracking_uri over https instead of a local sqlite database, this part without the artifact_path parameter passes:
artifacts = mlflow.artifacts.list_artifacts(
run_id=run.info.run_id,
tracking_uri=tracking_uri,
)
assert [p.path for p in artifacts] == ["subdir"]
But the last assertion using the artifact_path parameter does not pass:
artifacts = mlflow.artifacts.list_artifacts(
run_id=run.info.run_id,
artifact_path="subdir",
tracking_uri=tracking_uri,
)
assert [p.path for p in artifacts] == ["subdir/file.txt"]
It throws an exception at list_artifacts:
raise MlflowException(
mlflow.exceptions.MlflowException: API request to endpoint /api/2.0/mlflow/logged-models/search failed with error code 404 != 200. Response body: '<!doctype html>
<html lang=en>
<title>404 Not Found</title>
<h1>Not Found</h1>
<p>The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.</p>
'
There was a problem hiding this comment.
@rdotsch I think your tracking server uses mlflow 2.x. /api/2.0/mlflow/logged-models/search is a new endpoint added in mlflow 3.x.
There was a problem hiding this comment.
Also, wouldn’t the first assertion fail as well if that were the case? Or does it use a different endpoint when not supplying an artifact path?
There was a problem hiding this comment.
The first case doesn't call search models. Only the second case does.
There was a problem hiding this comment.
Can you call mlflow.search_logged_models? Does it fail with 404?
There was a problem hiding this comment.
I'll merge this PR to unblock my other work.
|
When is the next release which include this fix is planned to? |
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com>

🛠 DevTools 🛠
Install mlflow from this PR
For Databricks, use the following command:
Related Issues/PRs
Resolve #16412
Close #16423
What changes are proposed in this pull request?
Fix a regression in
download_artifactswhere thetracking_uriparameter was being ignored when bothrun_idandartifact_pathwere specified. This caused the function to look for artifacts in the wrong location (typically the current working directory's mlruns folder) instead of using the provided tracking URI.The fix ensures that the
tracking_uriparameter is properly passed through to all relevant artifact repository and download functions:_download_artifact_from_uri()now accepts and uses thetracking_uriparameterget_artifact_repository()calls now pass thetracking_uriparameterRunsArtifactRepository.get_underlying_uri()now accepts and properly uses the tracking URIself.download_artifacts()instead ofmlflow.artifacts.download_artifacts()How is this PR tested?
The issue included a reproduction case that now works correctly with these changes. The fix ensures that both scenarios in the reproduction case work:
download_artifacts(run_id=run_id, artifact_path=None, tracking_uri=tracking_uri)- already workeddownload_artifacts(run_id=run_id, artifact_path="my_subdir", tracking_uri=tracking_uri)- now works with the fixDoes this PR require documentation update?
Release Notes
Is this a user-facing change?
Fix
download_artifactsfunction to properly respect thetracking_uriparameter when bothrun_idandartifact_pathare specified. This resolves a regression introduced in v3.1.0 where the function would ignore the tracking URI and look for artifacts in the default location.What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts: Artifact stores and artifact loggingarea/build: Build and test infrastructure for MLflowarea/deployments: MLflow Deployments client APIs, server, and third-party Deployments integrationsarea/docs: MLflow documentation pagesarea/examples: Example codearea/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registryarea/models: MLmodel format, model serialization/deserialization, flavorsarea/projects: MLproject format, project running backendsarea/scoring: MLflow Model server, model deployment tools, Spark UDFsarea/server-infra: MLflow Tracking server backendarea/tracking: Tracking Service, tracking client APIs, autologgingInterface
area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev serverarea/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Modelsarea/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registryarea/windows: Windows supportLanguage
language/r: R APIs and clientslanguage/java: Java APIs and clientslanguage/new: Proposals for new client languagesIntegrations
integrations/azure: Azure and Azure ML integrationsintegrations/sagemaker: SageMaker integrationsintegrations/databricks: Databricks integrationsHow should the PR be classified in the release notes? Choose one:
rn/none- No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" sectionrn/breaking-change- The PR will be mentioned in the "Breaking Changes" sectionrn/feature- A new user-facing feature worth mentioning in the release notesrn/bug-fix- A user-facing bug fix worth mentioning in the release notesrn/documentation- A user-facing documentation change worth mentioning in the release notesShould this PR be included in the next patch release?