MLflow is a critical open-source library for managing the machine learning lifecycle, including experimentation, reproducibility, and model deployment. One of its most important components is MLflow Models – a standardized packaging format for persisting models and deploying them to production.

In this comprehensive guide, we'll cover the end-to-end workflow of training, saving, and deploying a model with MLflow's features:

Overview of MLflow Components

MLflow consists of four main components:

MLflow Tracking: Records and tracks model experiments by logging parameters, metrics, artifacts, etc. during runs, with a built-in web UI for visualizing and comparing them.

MLflow Projects: Packages data science code in a reproducible format to run experiments locally or on cloud platforms like SageMaker.

MLflow Models: Saves trained models in a portable, reusable format with metadata and environment specifications for deployment.

Model Registry: A central model store for organizing models, tracking their lineage across experiments, and registering versions for production.

Let's learn how each piece fits together in the machine learning lifecycle.

Saving Models with the log_model API

The key functionality MLflow provides is the ability to save models in a reusable, portable format called MLflow Models.

This self-contained packaging structure enables exporting models and deploying them on different downstream platforms. MLflow Models contain:

  • Model Files: Serialized files representing the trained model itself (e.g., the fitted estimator and its learned coefficients)
  • MLmodel File: Metadata about the model, such as its flavors, version, and framework info
  • Conda Environment: A conda YAML file listing the model's software dependencies
  • Signature: The input and output schema of the model
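As an illustration, an MLmodel metadata file for a scikit-learn model might look roughly like this (a hypothetical sketch; exact fields and version numbers vary by MLflow release and flavor):

```yaml
artifact_path: model
flavors:
  python_function:
    loader_module: mlflow.sklearn
    python_version: 3.10.12
  sklearn:
    pickled_model: model.pkl
    sklearn_version: 1.3.0
run_id: <run id of the logging run>
```

The python_function entry is what allows any downstream consumer to load the model through the generic pyfunc interface, regardless of the training framework.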

We can save models programmatically in this format using the flavor-specific log_model() methods, for example with scikit-learn:

import mlflow.sklearn

mlflow.sklearn.log_model(model, artifact_path="model", conda_env="conda.yaml")

The model can be a scikit-learn model, PyTorch model, etc.; each supported framework has a corresponding mlflow.&lt;flavor&gt;.log_model() that serializes the model accordingly.

Model Flavors

MLflow supports saving models in different "flavors" like:

  • Python Function (pyfunc): General Python models
  • TensorFlow: tf.Estimator, Keras models
  • PyTorch: PyTorch models
  • scikit-learn: Scikit-learn models
  • spaCy: spaCy NLP models
  • R: R models via the crate flavor

Based on the framework, MLflow handles the serialization automatically when logging models.

Let's see an example of training and saving a PyTorch model:

import mlflow
import mlflow.pytorch
import torch

# Net is assumed to be a user-defined torch.nn.Module subclass
model = Net()
# ... training loop omitted ...

# Save the trained model
mlflow.pytorch.log_model(model, artifact_path="pytorch-model")

This handles serializing the PyTorch model format into the MLflow artifact directory.

We can also save models under the generic "python function" (pyfunc) flavor and invoke predictions through a uniform predict() interface:

import mlflow.pyfunc  

mlflow.pyfunc.log_model(artifact_path="model", python_model=python_model)

Here python_model is an instance of mlflow.pyfunc.PythonModel that implements a predict() method.

Now let's go through a full use case of training, evaluating, and saving machine learning models step by step using MLflow!

End-to-End Example: Classifying Images with Convolutional Neural Network (CNN)

To demonstrate the full model management lifecycle with MLflow, we'll work through an image classification example using Keras and TensorFlow. The goal is to train a CNN to classify fashion images into categories.

The steps we'll cover are:

  1. Installing MLflow
  2. Coding CNN model with Keras
  3. Training and evaluating model
  4. Saving and logging metrics with MLflow
  5. Registering model with registry
  6. Deploying model to new environment

1. Install MLflow

pip install mlflow

2. Build & Train Keras CNN Model

First, we'll train a convolutional neural network on the Fashion MNIST image dataset using Keras:

import keras
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Load and preprocess the Fashion MNIST dataset
(train_images, train_labels), (test_images, test_labels) = keras.datasets.fashion_mnist.load_data()
train_images = train_images.reshape(-1, 28, 28, 1) / 255.0
test_images = test_images.reshape(-1, 28, 28, 1) / 255.0

# Build a small CNN model
model = keras.models.Sequential()
model.add(Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))

# Train the model on the fashion image data (labels are integer class IDs)
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))

This trains a CNN for classifying clothing types.

3. Evaluate Model Metrics

Now we can evaluate the model accuracy on test fashion images:

test_loss, test_accuracy = model.evaluate(test_images, test_labels)

print("Test loss: {}, Test accuracy: {}".format(test_loss, test_accuracy))

This prints metrics along the lines of: Test loss: 0.52, Test accuracy: 0.87.

4. Log MLflow Run

We'll log the model, parameters, and metrics using an MLflow run:

import mlflow
import mlflow.tensorflow 

with mlflow.start_run() as run:

    # Log Keras CNN model 
    mlflow.tensorflow.log_model(model, artifact_path="cnn-fashion")

    # Log params and metrics
    mlflow.log_param("epochs", 5)  
    mlflow.log_metric("test_loss", test_loss)
    mlflow.log_metric("test_accuracy", test_accuracy) 

    run_id = run.info.run_id

This will save the model as an MLflow Model artifact that can be deployed later.

5. Register Model with Registry

Next, we can add the model to the MLflow Model Registry:

mlflow.register_model(f"runs:/{run_id}/cnn-fashion", "cnn-fashion-model")

Now this model can be tracked, audited, approved and deployed from the registry.

6. Deploy Model to Production

Finally, we can serve the saved MLflow model locally as a REST endpoint, or package it into a Docker image for production deployment:

mlflow models serve -m "./cnn-fashion" -p 1234

mlflow models build-docker -m "./cnn-fashion" -n cnn-fashion-image
docker run -p 5001:8080 cnn-fashion-image

This exposes the CNN model as a REST scoring endpoint for real-time prediction requests!

This demonstrates the full lifecycle of an ML model, from training through persistence to deployment, leveraging MLflow!

Benefits of Standardized Model Format

The portability of MLflow's model packaging provides many advantages:

Version Control: Can save, update, revert model versions throughout experiments.

Reproducibility: All model artifacts, parameters captured to rerun training experiments.

Organization: Model Registry enables easily tracking experiments across teams.

Deployment: Package models and serve them via Docker, REST APIs in a portable manner.

CI/CD Pipelines: Integration with tools like Jenkins and Airflow enables continuous training and deployment workflows.

Alternatives to MLflow for Model Persistence

There are a few alternatives to MLflow Models for standard model serialization:

ONNX: The Open Neural Network Exchange format focuses on neural network models, unlike MLflow's framework-agnostic packaging, and does not bundle conda environments with the model.

PMML: The Predictive Model Markup Language represents models as XML. Framework support is limited, mostly to traditional (non-deep-learning) models.

MLflow provides the best of both worlds – a generic packaging format integrated deeply with popular DL frameworks like TensorFlow, PyTorch and scikit-learn for convenience.

Saving Models with Different Frameworks

While MLflow mostly auto-serializes models, we can use utility methods for explicitly saving models in specialized frameworks:

PyTorch

import mlflow.pytorch

mlflow.pytorch.log_model(pytorch_model, artifact_path) 

XGBoost

import mlflow.xgboost

mlflow.xgboost.log_model(xgb_model, artifact_path)

SparkML

import mlflow.spark

mlflow.spark.log_model(sparkml_model, artifact_path)

TensorFlow (Keras)

import mlflow.tensorflow

mlflow.tensorflow.log_model(keras_model, artifact_path)

The benefit here is portability across runtimes: any model saved with MLflow, regardless of its training framework, can be loaded back through the generic pyfunc interface and served the same way.

Advanced Features of MLflow

MLflow provides many more advanced capabilities for tracking experiments, visualizing metrics, organizing models and continuous deployment.

Experiment Tracking UI

The MLflow tracking UI visualizes logged metrics and makes it easy to compare runs and model versions side by side:

MLflow Tracking UI

Querying Model Registry

We can query and fetch models programmatically from the registry:

from mlflow.tracking import MlflowClient

client = MlflowClient()
model_version_details = client.get_model_version(name="cnn-fashion-model", version=1)

Model Versioning

We can also log multiple versions of a model (each log_model call with registered_model_name automatically creates a new registry version) and load a specific version by URI:

mlflow.tensorflow.log_model(model, artifact_path="cnn-fashion", registered_model_name="cnn-fashion-model")

model = mlflow.pyfunc.load_model("models:/cnn-fashion-model/5")

Autologging Metrics

Enable automated logging of parameters, metrics, and models from training code, without manual instrumentation:

mlflow.tensorflow.autolog(log_models=True)  

There are many more capabilities, such as artifact logging, model stage transitions, and deployment plugins, that make MLflow an enterprise-ready, scalable end-to-end MLOps solution.

Conclusion

In this guide, we walked through the workflow of training machine learning models, saving them in the portable MLflow format, and deploying them to production. The self-contained artifacts make it possible to package models from any experiment run and serve them via a REST API or batch inference.

MLflow lowers the barriers to collaborating on modeling experiments across teams and to reliably reproducing work when auditing or improving models. By packaging models in a standardized format, MLflow enables scalable deployment on infrastructure ranging from laptops to Kubernetes clusters!

It truly accelerates the path from ideation to productionization for machine learning.
