MLflow is a widely used open-source library for managing the machine learning lifecycle, including experimentation, reproducibility, and model deployment. One of its most important components is MLflow Models, a standardized packaging format for persisting models and deploying them to production.
In this comprehensive guide, we'll cover the end-to-end workflow of training, saving, and deploying a model with MLflow's features:
Overview of MLflow Components
MLflow consists of four main components:
MLflow Tracking: Records and tracks model experiments by logging parameters, metrics, artifacts, and more during runs, with a built-in web UI for visualizing and comparing them.
MLflow Projects: Packages data science code in a reproducible format to run experiments locally or on cloud platforms like SageMaker.
MLflow Models: Saves trained models in a portable, reusable format with metadata and environment specifications for deployment.
Model Registry: A central model store for versioning models, managing stage transitions, and promoting models to production.
Let's learn how each piece fits together in the machine learning lifecycle.
Saving Models with the log_model API
The key functionality MLflow provides is the ability to save models in a reusable, portable format called MLflow Models.
This self-contained packaging structure enables exporting models and deploying them on different downstream platforms. MLflow Models contain:
- Model Files: Serialized files representing the trained model itself, such as the estimator and its learned coefficients
- MLmodel File: Metadata on the model like name, version, framework info
- Conda Environment: A conda YAML file listing the model's software dependencies
- Signature: Input and output schema of the model
We can save models programmatically in this format using the flavor-specific log_model() methods, for example with scikit-learn:
import mlflow.sklearn
mlflow.sklearn.log_model(model, artifact_path="model", conda_env="conda.yaml")
Here the model is a scikit-learn estimator; other flavors such as PyTorch or TensorFlow provide analogous log_model() methods that handle serialization for their framework.
Model Flavors
MLflow supports saving models in different "flavors" like:
- Python Function (pyfunc): General Python models
- TensorFlow: tf.Estimator, Keras models
- PyTorch: PyTorch models
- scikit-learn: Scikit-learn models
- spaCy: spaCy NLP models
- R: R models saved via the crate flavor
Based on the framework, MLflow handles the serialization automatically when logging models.
Let's see an example of training and saving a PyTorch model:
import mlflow
import mlflow.pytorch
import torch

# Assume Net is a torch.nn.Module subclass defined elsewhere
model = Net()

# ... training loop omitted ...

# Log the trained model to the MLflow artifact store
mlflow.pytorch.log_model(model, artifact_path="pytorch-model")
This handles serializing the PyTorch model format into the MLflow artifact directory.
We can also save models without a framework-specific flavor using the generic "python function" (pyfunc) flavor, which exposes a uniform predict() interface:
import mlflow.pyfunc
mlflow.pyfunc.log_model(artifact_path="model", python_model=python_model)
Now let's go through a full use case of training, evaluating, and saving machine learning models step by step with MLflow!
End-to-End Example: Classifying Images with Convolutional Neural Network (CNN)
To demonstrate the full model management lifecycle with MLflow, we'll work through an image classification example using Keras and TensorFlow. The goal is to train a CNN to classify fashion images into categories.
The steps we‘ll cover are:
- Installing MLflow
- Coding CNN model with Keras
- Training and evaluating model
- Saving and logging metrics with MLflow
- Registering model with registry
- Deploying model to new environment
1. Install MLflow
pip install mlflow
2. Build & Train Keras CNN Model
First, we'll train a convolutional neural network on the Fashion MNIST image dataset using Keras:
import keras
from keras.datasets import fashion_mnist
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.utils import to_categorical

# Load and preprocess the Fashion MNIST dataset
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
train_images = train_images.reshape(-1, 28, 28, 1) / 255.0
test_images = test_images.reshape(-1, 28, 28, 1) / 255.0
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Build a small CNN
model = keras.models.Sequential()
model.add(Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))

# Compile and train the model on the fashion image data
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(train_images, train_labels, epochs=5, validation_data=(test_images, test_labels))
This trains a CNN for classifying clothing types.
3. Evaluate Model Metrics
Now we can evaluate the model accuracy on test fashion images:
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print("Test loss: {}, Test accuracy: {}".format(test_loss, test_accuracy))
This prints metrics along the lines of Test loss: 0.52, Test accuracy: 0.87 (exact values vary by run).
4. Log MLflow Run
We'll log the model, parameters, and metrics within an MLflow run:
import mlflow
import mlflow.tensorflow
with mlflow.start_run() as run:
    # Log the Keras CNN model
    mlflow.tensorflow.log_model(model, artifact_path="cnn-fashion")

    # Log params and metrics
    mlflow.log_param("epochs", 5)
    mlflow.log_metric("test_loss", test_loss)
    mlflow.log_metric("test_accuracy", test_accuracy)

    run_id = run.info.run_id
This will save the model as an MLflow Model artifact that can be deployed later.
5. Register Model with Registry
Next, we can add the model to the MLflow Model Registry:
mlflow.register_model(f"runs:/{run_id}/cnn-fashion", "cnn-fashion-model")
Now this model can be tracked, audited, approved and deployed from the registry.
6. Deploy Model to Production
Finally, we can serve the saved MLflow model as a REST endpoint, either directly with the MLflow CLI or by packaging it into a Docker image:
mlflow models serve -m "runs:/<run_id>/cnn-fashion" -p 1234
mlflow models build-docker -m "runs:/<run_id>/cnn-fashion" -n cnn-fashion
docker run -p 5000:8080 cnn-fashion
This exposes the CNN model as a REST endpoint that serves real-time prediction requests!
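On the client side, predictions are requested by POSTing JSON to the server's /invocations route. A hedged sketch of building such a payload (the "pixels" column and the values are illustrative; the dataframe_split format is the MLflow 2.x scoring convention):

```python
import json

# Build a scoring payload in MLflow's "dataframe_split" input format
payload = {
    "dataframe_split": {
        "columns": ["pixels"],
        "data": [[0.0], [1.0]],
    }
}
body = json.dumps(payload)

# With a live server, this body would be sent as, e.g.:
#   curl -X POST http://localhost:1234/invocations \
#        -H "Content-Type: application/json" -d "$BODY"
```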
This demonstrates the full life cycle of an ML model from training to persistence to deployment leveraging MLflow!
Benefits of Standardized Model Format
The portability of MLflow's model packaging provides many advantages:
Version Control: Save, update, and revert model versions throughout experiments.
Reproducibility: All model artifacts and parameters are captured, so training experiments can be rerun.
Organization: The Model Registry makes it easy to track experiments across teams.
Deployment: Package models and serve them via Docker or REST APIs in a portable manner.
CI/CD Pipelines: Integrates with tools like Jenkins and Airflow to enable continuous training and deployment workflows.
Alternatives to MLflow for Model Persistence
There are a few alternatives to MLflow Models for standard model serialization:
ONNX: The Open Neural Network Exchange format, focused primarily on neural network models rather than MLflow's framework-agnostic packaging. It also does not bundle a conda environment with the model.
PMML: Predictive Model Markup Language, which records models as XML. Framework support is limited, mostly to traditional models.
MLflow provides the best of both worlds: a generic packaging format with deep integrations into popular frameworks like TensorFlow, PyTorch, and scikit-learn for convenience.
Saving Models with Different Frameworks
While MLflow mostly auto-serializes models, we can use utility methods for explicitly saving models in specialized frameworks:
PyTorch
import mlflow.pytorch
mlflow.pytorch.log_model(pytorch_model, artifact_path)
XGBoost
import mlflow.xgboost
mlflow.xgboost.log_model(xgb_model, artifact_path)
SparkML
import mlflow.spark
mlflow.spark.log_model(sparkml_model, artifact_path)
TensorFlow (Keras)
import mlflow.tensorflow
mlflow.tensorflow.log_model(keras_model, artifact_path)
The benefit here is portability across runtimes: any of these models can also be loaded back through the generic pyfunc interface, which provides a uniform predict() API regardless of the framework that produced the model.
Advanced Features of MLflow
MLflow provides many more advanced capabilities for tracking experiments, visualizing metrics, organizing models and continuous deployment.
Experiment Tracking UI
The MLflow tracking UI visualizes logged metrics and allows side-by-side comparison of runs and model versions.

Querying Model Registry
We can query and fetch models programmatically from the registry:
from mlflow.tracking import MlflowClient
client = MlflowClient()
model_version_details = client.get_model_version(name="cnn-fashion-model", version=1)
Model Versioning
We can also log multiple versions of a model under the same registered name and fetch a specific version by number using a models:/ URI:
mlflow.tensorflow.log_model(model, artifact_path, registered_model_name="cnn-fashion-model")
model = mlflow.pyfunc.load_model("models:/cnn-fashion-model/5")
Autologging Metrics
Enable automated logging of parameters, metrics, and models without manual log calls:
mlflow.tensorflow.autolog(log_models=True)
There are many more capabilities, such as artifact management, model stages, and deployment plugins, that make MLflow an enterprise-ready, scalable end-to-end MLOps solution.
Conclusion
In this guide, we walked through the workflow of training machine learning models, saving them in the portable MLflow format, and deploying them to production. The self-contained artifacts allow packaging models from any experiment run and serving them via a REST API or batch inference.
MLflow lowers the barriers to collaborating on modeling experiments across teams and reliably reproducing work to audit or improve models. By packaging models in a standardized format, MLflow enables scalable deployment on infrastructures ranging from laptops to Kubernetes clusters!
It truly accelerates the path from ideation to productionization for machine learning.