MLflow: managing the machine learning lifecycle

Databricks releases MLflow for experiment tracking, model packaging and a centralised registry, treating the ML lifecycle as an engineering problem.

Open Source · AI · MLflow · MLOps · Machine Learning · Tracking · Models

The gap between research and production

Data science teams in 2018 face a recurring problem: machine learning experiments are difficult to track, reproduce and bring to production. A data scientist trains dozens of models varying hyperparameters, datasets and algorithms, recording results in spreadsheets or notebooks. When a promising model needs to go to production, reconstructing the exact experiment conditions — code version, data used, parameters, environment — is often impossible. MLflow, released by Databricks under the Apache 2.0 licence, is the first open source tool that addresses the machine learning lifecycle as a structured engineering problem.

The project is born from the experience of Matei Zaharia — creator of Apache Spark and co-founder of Databricks — who observes the same problem across dozens of organisations: teams capable of building excellent models but unable to manage them systematically.

Experiment tracking

The MLflow Tracking component provides a standard interface for recording every aspect of an experiment: input parameters, evaluation metrics, produced artefacts (models, charts, transformed datasets) and metadata such as code version and execution environment. APIs are available in Python, R and Java, plus a REST interface, and they work with any framework — scikit-learn, TensorFlow, PyTorch, XGBoost.

Each run is recorded with a unique identifier and results are browsable through a web UI that allows comparing experiments, filtering by metrics and visualising parameter evolution. Recording is explicit: the developer adds mlflow.log_param() and mlflow.log_metric() calls in the training code.
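The explicit-logging pattern is simple to picture with a toy tracker. This is a plain-Python sketch of the mechanics, not MLflow's implementation — in real code the calls are mlflow.start_run(), mlflow.log_param() and mlflow.log_metric():

```python
import uuid

class ToyTracker:
    """Minimal stand-in for MLflow Tracking: each run gets a unique
    identifier and stores the parameters and metrics logged against it."""

    def __init__(self):
        self.runs = {}     # run_id -> {"params": {...}, "metrics": {...}}
        self.active = None

    def start_run(self):
        run_id = uuid.uuid4().hex
        self.runs[run_id] = {"params": {}, "metrics": {}}
        self.active = run_id
        return run_id

    def log_param(self, key, value):
        self.runs[self.active]["params"][key] = value

    def log_metric(self, key, value):
        self.runs[self.active]["metrics"][key] = value

# Explicit logging inside hypothetical training code:
tracker = ToyTracker()
run_id = tracker.start_run()
tracker.log_param("learning_rate", 0.01)  # input parameter
tracker.log_metric("accuracy", 0.93)      # evaluation metric
```

Once runs are recorded this way, comparing experiments reduces to filtering the stored runs by a metric — which is what the web UI does over the real tracking store.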

Model packaging

MLflow Models defines a standard format for packaging machine learning models, independent of the framework used for training. A packaged model includes inference code, dependencies, weights and the metadata needed for execution. The format supports multiple “flavours” — a scikit-learn model can be served as a REST endpoint, loaded into Spark for batch inference or executed as a Python function.
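The key idea behind flavours is that any trained model, whatever its framework, can be reduced to a generic "python function" with a single predict() entry point plus metadata describing how else it can be loaded. A minimal sketch of that contract (the model, flavour names and pinned dependency here are illustrative, not MLflow's actual format):

```python
class PyFuncModel:
    """Toy version of the generic python-function flavour: the packaged
    model carries its metadata and exposes a uniform predict()."""

    def __init__(self, predict_fn, metadata):
        self._predict_fn = predict_fn
        self.metadata = metadata  # flavours, dependencies, etc.

    def predict(self, inputs):
        return [self._predict_fn(x) for x in inputs]

# A "model" from any framework reduced to a callable; here a linear rule.
weights = {"slope": 2.0, "intercept": 1.0}
model = PyFuncModel(
    predict_fn=lambda x: weights["slope"] * x + weights["intercept"],
    metadata={
        "flavors": ["python_function", "sklearn"],  # ways to load the model
        "dependencies": ["scikit-learn"],           # hypothetical environment spec
    },
)

print(model.predict([0.0, 1.0, 2.0]))  # prints [1.0, 3.0, 5.0]
```

Because every consumer — a REST serving layer, a Spark batch job, a local script — talks to the same predict() interface, the serving infrastructure never needs to know which framework produced the weights.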

Model registry

The Model Registry is a centralised repository that tracks model versions, manages transitions between stages (staging, production, archived) and maintains the complete history of every model. For teams managing dozens of models in production, the registry transforms an informal process into a controlled workflow.
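The controlled workflow the registry enables can be sketched in a few lines: versioned entries per model name, a fixed set of stages, and an audit trail for every transition. This is an illustrative stand-in, not the registry's real API — names like "churn-model" are invented for the example:

```python
class ToyRegistry:
    """Minimal stand-in for a model registry: versioned models,
    controlled stage transitions and a retained history."""

    STAGES = {"none", "staging", "production", "archived"}

    def __init__(self):
        self.models = {}   # name -> list of {"version": int, "stage": str}
        self.history = []  # audit trail: (name, version, old_stage, new_stage)

    def register(self, name):
        versions = self.models.setdefault(name, [])
        version = len(versions) + 1
        versions.append({"version": version, "stage": "none"})
        return version

    def transition(self, name, version, stage):
        if stage not in self.STAGES:
            raise ValueError(f"unknown stage: {stage}")
        entry = self.models[name][version - 1]
        self.history.append((name, version, entry["stage"], stage))
        entry["stage"] = stage

# Promote a hypothetical model through the stages:
registry = ToyRegistry()
v1 = registry.register("churn-model")
registry.transition("churn-model", v1, "staging")
registry.transition("churn-model", v1, "production")
```

Rejecting unknown stages and recording every transition is precisely what turns the informal "someone copied the model file to the server" process into an auditable workflow.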

Link: mlflow.org
