MLFlow is an open-source Python framework for experiment tracking. It allows data science teams to store results, artifacts (machine learning models, figures, tables), and metadata in a principled way when executing data pipelines.
The MLFlow plugin for Apache Hamilton includes two sets of features:
- Save and load machine learning models with the `MLFlowModelSaver` and `MLFlowModelLoader` materializers
- Automatically track data pipeline results in MLFlow with the `MLFlowTracker`
The plugin pairs nicely with the HamiltonTracker and the Apache Hamilton UI, which gives you a way to explore your pipeline code, attributes of the artifacts produced, and execution observability.
We're working on better linking Apache Hamilton "projects" with MLFlow "experiments" and runs from both projects.
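As a rough illustration of how the two feature sets might be combined, here is a minimal sketch. It assumes Apache Hamilton, MLFlow, and a supported model library are installed, and that the model saver is exposed through the `to.mlflow(...)` materializer key. The helper `build_tracked_driver`, the node name `trained_model`, and the registered name `my_model` are hypothetical and are not part of the plugin.

```python
from types import ModuleType


def build_tracked_driver(
    pipeline_module: ModuleType,
    model_node: str = "trained_model",  # hypothetical node that returns a fitted model
    registered_name: str = "my_model",  # hypothetical name in the MLFlow model registry
):
    """Build a Hamilton Driver that tracks runs and saves a model in MLFlow."""
    # Imported inside the function so this sketch can be imported even when
    # the optional plugin dependencies are not installed.
    from hamilton import driver
    from hamilton.io.materialization import to
    from hamilton.plugins.h_mlflow import MLFlowTracker

    # Assumption: the MLFlowModelSaver is registered under the `mlflow` key,
    # so `to.mlflow(...)` creates a saver node for the given dependency.
    model_saver = to.mlflow(
        id=f"{model_node}__mlflow",
        dependencies=[model_node],
        register_as=registered_name,
    )

    return (
        driver.Builder()
        .with_modules(pipeline_module)
        .with_adapters(MLFlowTracker())  # logs pipeline results to an MLFlow run
        .with_materializers(model_saver)
        .build()
    )
```

Given a module `my_pipeline` that defines a `trained_model` node (both hypothetical), calling `build_tracked_driver(my_pipeline).execute(["trained_model__mlflow"])` should run the pipeline, record it as an MLFlow run, and store the model. Check the plugin reference for the exact arguments your installed version accepts.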
- Create a virtual environment and activate it: `python -m venv venv && . venv/bin/activate`
- Install requirements for the Apache Hamilton code: `pip install -r requirements.txt`
- Explore the notebook `tutorial.ipynb`
- Launch the MLFlow user interface to explore results: `mlflow ui`
- Learn the basics of Apache Hamilton via the `Concepts/` documentation section
- Visit tryhamilton.dev for an interactive tutorial in your browser
- Visit the DAGWorks blog for more detailed guides