Skip to content

[Proxied artifact operations] Implement REST API endpoints and artifact repository#4946

Merged
harupy merged 70 commits into
mlflow:masterfrom
harupy:proxied-artifact-opertions-endpoints
Nov 8, 2021
Merged

[Proxied artifact operations] Implement REST API endpoints and artifact repository#4946
harupy merged 70 commits into
mlflow:masterfrom
harupy:proxied-artifact-opertions-endpoints

Conversation

@harupy

@harupy harupy commented Oct 26, 2021

Copy link
Copy Markdown
Member

Signed-off-by: harupy 17039389+harupy@users.noreply.github.com

What changes are proposed in this pull request?

Implement REST API endpoints + artifact repository.

How is this patch tested?

Unit tests

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

@harupy harupy changed the title [Proxied artifact operations] Implement REST API endpoints + artifact repository [WIP][Proxied artifact operations] Implement REST API endpoints + artifact repository Oct 26, 2021
@github-actions github-actions Bot added the rn/feature Mention under Features in Changelogs. label Oct 26, 2021
@harupy harupy changed the title [WIP][Proxied artifact operations] Implement REST API endpoints + artifact repository [WIP][Proxied artifact operations] Implement REST API endpoints and artifact repository Oct 26, 2021
Comment thread mlflow/protos/mlflow_artifacts.proto
Comment thread tests/tracking/test_mlflow_artifacts.py Outdated
return f.read()


def test_log_artifact(tmpdir):

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a minimal integration test to make sure an aritafct can be logged using the mlflow artifacts service.

Comment thread mlflow/protos/mlflow_artifacts.proto
Comment thread mlflow/server/handlers.py
Comment thread mlflow/protos/mlflow_artifacts.proto Outdated

@dbczumar dbczumar left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harupy This looks like an excellent start! Awesome stuff!

@harupy harupy force-pushed the proxied-artifact-opertions-endpoints branch from d138f38 to d32c6a4 Compare October 29, 2021 15:48
@harupy harupy changed the title [WIP][Proxied artifact operations] Implement REST API endpoints and artifact repository [Proxied artifact operations] Implement REST API endpoints and artifact repository Oct 29, 2021
@harupy harupy requested a review from dbczumar November 1, 2021 00:03
Comment thread examples/mlflow_artifacts/README.md Outdated
harupy and others added 17 commits November 1, 2021 22:50
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Comment thread examples/mlflow_artifacts/README.md Outdated
Comment on lines +5 to +16
This example requires a wheel for `mlflow` to be stored in `dist`.

```sh
# Clean up existing wheels
rm dist/*

# Build a wheel for the development version of mlflow
pip wheel --no-deps --wheel-dir dist ../..

# Build a wheel for the latest version of mlflow on PyPI
pip wheel --no-deps --wheel-dir dist mlflow
```

@dbczumar dbczumar Nov 4, 2021

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these steps required? If there's a way to avoid them, that would be nice in order to reduce complexity for users.

@harupy harupy Nov 5, 2021

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mainly for development purposes. We can simplify the workflow by preparing two Docker files (Dockerfile and Dockerfile.dev) and switching them with variable substitution.

docker-compose.yml would look like:

# docker-compose.yml
  tracking-server:
    build:
      context: .
      # Defaults to `Dockerfile` if `DOCKERFILE` environment variable doesn't exist.
      dockerfile: "${DOCKERFILE:-Dockerfile}"
    depends_on:
      - postgres
      - artifacts-server

README.md would look like:

This directory contains a set of files for demonstrating the MLflow Artifacts service.

## Run the example

```sh
# Build services
docker-compose build

# Launch tracking and artifacts servers
docker-compose up -d

# Run `run.py` that uploads, downloads, and list artifacts
docker-compose run -v ${PWD}/run.py:/app/run.py client python run.py
```

## Explore the logging results:

```sh
# Make sure both tracking and artifacts servers are running
docker-compose ps
```

- MLflow UI is available at http://localhost:5000 to explore tracking results.
- MinIO Console is available at http://localhost:9001 to explore logged artifacts. The login username and password are `user` and `password`.

## Reset tracking and artifacts servers

```
docker-compose down --volumes --remove-orphans
```

👇👇👇

## Development

```
pip wheel --no-deps --wheel-dir dist ../..
DOCKERFILE=Dockerfile.dev docker-compose build
docker-compose run -v ${PWD}/run.py:/app/run.py client python run.py
```

What do you think about this approach?

@harupy harupy Nov 5, 2021

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dockerfile and Dockerfile.dev would look like this.

Dockerfile:

FROM python:3.6

WORKDIR /app

RUN pip install psycopg2 boto3
RUN pip install mlflow

Dockerfile.dev:

FROM python:3.6

WORKDIR /app

RUN pip install psycopg2 boto3
RUN pip install mlflow
COPY dist ./dist
RUN pip install dist/mlflow-*.whl

Comment thread mlflow/cli.py
metavar="URI",
default="./mlartifacts",
help=(
"The base artifact location from which to resolve artifact upload/download/list requests "

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it! Can we also clarify that this only applies when the tracking server is configured to stream artifacts and the experiment artifact root is http or mlflow-artifacts?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, we should figure out how to tell users about the http and mlflow-artifacts URIs for --default-artifact-root as well.

@harupy harupy Nov 5, 2021

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it! Can we also clarify that this only applies when the tracking server is configured to stream artifacts and the experiment artifact root is http or mlflow-artifacts?

Sure!

In general, we should figure out how to tell users about the http and mlflow-artifacts URIs for --default-artifact-root as well.

Can we new sections for HTTP and mlflow-artifacts schemes in https://www.mlflow.org/docs/latest/tracking.html#artifact-stores?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, we should figure out how to tell users about the http and mlflow-artifacts URIs for --default-artifact-root as well.

That's a good idea. We should also make it easy for users to learn about these through the mlflow server CLI in a follow-up PR.

Comment thread mlflow/server/handlers.py
with open(tmp_path, "wb") as f:
chunk_size = 1024 * 1024 # 1 MB
while True:
chunk = request.stream.read(chunk_size)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any timeout on the stream read operation? Else, we may get stuck here forever.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • request.stream is an instance of gunicorn.http.body.Body.
  • gunicorn.http.body.Body.read doesn't seem to have an option related to timeout:

Source: https://github.com/benoitc/gunicorn/blob/ff58e0c6da83d5520916bc4cc109a529258d76e1/gunicorn/http/body.py#L202

    def read(self, size=None):
        size = self.getsize(size)
        ...

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. This probably isn't too big a deal because we set gunicorn worker timeouts. Worst case, we may hang here until the worker times out, which is fine :).


def _download_file(self, remote_file_path, local_path):
url = posixpath.join(self.artifact_uri, remote_file_path)
with self._session.get(url, stream=True) as resp:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a timeout we should specify here? Else, we may get stuck here

@harupy harupy Nov 5, 2021

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, what about 10 minutes?. I'm checking the behavior of timeout when stream=True.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a related discussion:

psf/requests#1803 (comment)

image

@harupy harupy Nov 5, 2021

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we set timeout=10 (seconds)?

@@ -0,0 +1,80 @@
syntax = "proto2";

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a docstring to this proto file explaining which use cases it serves?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

@dbczumar dbczumar left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harupy Looks great! Happy to approve once remaining comments are addressed and tests pass.

Comment thread examples/mlflow_artifacts/README.md Outdated
@@ -0,0 +1,45 @@
This directory contains a set of files for demonstrating the MLflow Artifacts service.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we explain what the artifacts service does and provide a small example command that users could run manually to serve artifacts via mlflow server in the README?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the README!

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Comment thread mlflow/store/artifact/http_artifact_repo.py Outdated
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Comment thread examples/mlflow_artifacts/README.md
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>
Signed-off-by: harupy <hkawamura0130@gmail.com>

@dbczumar dbczumar left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM again, thanks @harupy ! Really awesome!

Comment thread mlflow/server/handlers.py
with open(tmp_path, "wb") as f:
chunk_size = 1024 * 1024 # 1 MB
while True:
chunk = request.stream.read(chunk_size)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. This probably isn't too big a deal because we set gunicorn worker timeouts. Worst case, we may hang here until the worker times out, which is fine :).

@harupy

harupy commented Nov 8, 2021

Copy link
Copy Markdown
Member Author

@dbczumar Thanks for reviewing this PR!

@harupy harupy merged commit a05360c into mlflow:master Nov 8, 2021
@harupy harupy deleted the proxied-artifact-opertions-endpoints branch November 8, 2021 23:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rn/feature Mention under Features in Changelogs.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants