MLOps extends the traditional DevOps CI/CD practice with an additional pipeline called continuous training (CT). The CI/CT/CD pipeline for MLOps orchestrates a series of automated steps to streamline the development, training, testing, and deployment of machine learning models. Automating these processes enables efficient model deployment. Standard automation tools include Jenkins, GitLab CI, Travis CI, and GitHub Actions. You will typically set up the MLOps CI/CT/CD pipeline around a trigger, such as a code push, a schedule, new training data arriving, or a drop in model performance, that kicks off the automation.
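To make the trigger idea concrete, here is a minimal sketch of performance-based trigger logic in plain Python. The threshold, the `fetch_live_accuracy` metric source, and the `retrain_model` step are all hypothetical placeholders for whatever your monitoring system and CI runner actually provide.

```python
# Minimal sketch of a continuous-training (CT) trigger.
# The metric source, threshold, and retrain step are hypothetical
# placeholders; a real pipeline would wire these into your
# monitoring system and CI runner (Jenkins, GitHub Actions, etc.).

ACCURACY_THRESHOLD = 0.90  # assumed acceptable accuracy floor


def fetch_live_accuracy() -> float:
    """Placeholder: read the production model's current accuracy
    from your monitoring or evaluation store."""
    return 0.87  # hypothetical value for illustration


def retrain_model() -> None:
    """Placeholder: kick off your training job, e.g. by calling
    your CI system's API or a workflow orchestrator."""
    print("Retraining triggered")


if __name__ == "__main__":
    accuracy = fetch_live_accuracy()
    if accuracy < ACCURACY_THRESHOLD:
        # The model has degraded below the agreed floor: trigger CT.
        retrain_model()
    else:
        print(f"Model healthy (accuracy={accuracy:.2f}), no retraining")
```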
-
One of the main reasons teams struggle to build and maintain their MLOps pipelines is vendor-specific packaging. As a model is handed off between data science, app development, and SRE/DevOps teams, each team must repackage the model to work with its own toolset. This is tedious, and it stands in contrast to well-adopted development practices, where teams have standardized on containers to ensure that project definitions, dependencies, and artifacts are shared in a consistent format. KitOps is a robust and flexible tool that addresses these exact shortcomings in the MLOps pipeline. It packages the entire ML project in an OCI-compliant artifact called a ModelKit, designed with flexible development attributes to accommodate ML workflows. This makes ML development more convenient than forcing it into a conventional DevOps pipeline: every team pulls the same versioned package from a standard OCI registry, existing container tooling keeps working, and each team can unpack only the parts of the project (model, code, or data) it needs.
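To illustrate where a ModelKit fits in a pipeline, here is a minimal sketch that packages and pushes one by shelling out to the `kit` CLI. It assumes the CLI is installed and that the project directory contains a Kitfile describing the model, code, and datasets; the registry reference is a made-up example.

```python
# Minimal sketch: packaging and pushing a ModelKit via the KitOps
# `kit` CLI. Assumes `kit` is installed and a Kitfile exists in the
# current directory. The registry reference below is hypothetical.
import subprocess

MODELKIT_REF = "registry.example.com/demo/churn-model:v1"  # hypothetical

# Package the current directory (with its Kitfile) into a ModelKit.
subprocess.run(["kit", "pack", ".", "-t", MODELKIT_REF], check=True)

# Push the ModelKit to an OCI-compliant registry so any team can pull it.
subprocess.run(["kit", "push", MODELKIT_REF], check=True)
```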
-
Experiment tracking tools like MLflow, Weights & Biases, and Neptune.ai provide a pipeline that automatically tracks the metadata and artifacts generated by each experiment you run. Although their features and functionality vary, experiment tracking tools provide a systematic structure for the iterative model development process.
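As an illustration of that structure, here is a minimal MLflow sketch that logs the parameters, a metric, and the trained model for a single run. The experiment name and hyperparameters are arbitrary examples; it assumes `mlflow` and `scikit-learn` are installed.

```python
# Minimal sketch of experiment tracking with MLflow.
import mlflow
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

mlflow.set_experiment("iris-demo")  # hypothetical experiment name

X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), test_size=0.2, random_state=42
)

with mlflow.start_run():
    n_estimators = 100
    model = RandomForestClassifier(n_estimators=n_estimators, random_state=42)
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))

    # MLflow records the parameters, metrics, and model artifact for
    # this run so it can be compared against other experiments.
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_metric("accuracy", accuracy)
    mlflow.sklearn.log_model(model, "model")
```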
-
The metadata and model artifacts from experiment tracking can amount to large volumes of data: trained model files, data files, metrics and logs, visualizations, configuration files, checkpoints, and so on. In cases where the experiment tracking tool doesn't support data storage, an alternative is to track the training and validation data versions used in each experiment. Teams use remote storage systems such as S3 buckets, MinIO, or Google Cloud Storage, or data versioning tools like Data Version Control (DVC) or Git LFS (Large File Storage), to version and persist the data. These options facilitate collaboration but have implications for artifact-model traceability, storage costs, and data privacy.
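One lightweight way to tie a data version to an experiment is to content-address the dataset before uploading it. The sketch below hashes a training file and stores it in S3 under a versioned key using boto3; the bucket name and file path are hypothetical, and it assumes `boto3` is installed with valid AWS credentials configured.

```python
# Minimal sketch: versioning a training dataset in S3 by content
# hash, so each experiment can record exactly which data it used.
import hashlib

import boto3

DATA_PATH = "data/train.csv"    # hypothetical local dataset
BUCKET = "example-ml-datasets"  # hypothetical S3 bucket

# Hash the file contents to derive a stable, content-addressed version id.
sha256 = hashlib.sha256()
with open(DATA_PATH, "rb") as f:
    for chunk in iter(lambda: f.read(8192), b""):
        sha256.update(chunk)
version_id = sha256.hexdigest()[:12]

# Upload under a versioned key; log this key with the experiment run
# so the exact data version stays traceable to the resulting model.
key = f"train/{version_id}/train.csv"
boto3.client("s3").upload_file(DATA_PATH, BUCKET, key)
print(f"Dataset version {version_id} stored at s3://{BUCKET}/{key}")
```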