dvc
MLOps
Metric | dvc | MLOps
---|---|---
Mentions | 108 | 2
Stars | 13,093 | 1,698
Growth | 1.3% | 9.8%
Activity | 9.7 | 2.5
Last commit | 4 days ago | 9 months ago
Language | Python | Jupyter Notebook
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
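The site does not publish the formulas behind these numbers, so the following is only a rough sketch of how such metrics could be computed. The function names, the 30-day half-life, and the percentile mapping are all assumptions made for illustration, not the site's actual method.

```python
from datetime import date


def mom_growth(stars_last_month: int, stars_now: int) -> float:
    """Month-over-month star growth, as a percentage."""
    return (stars_now - stars_last_month) / stars_last_month * 100.0


def weighted_activity(commit_dates, today: date, half_life_days: float = 30.0) -> float:
    """Sum of commits where each commit's weight halves every half_life_days.

    The half-life is a hypothetical parameter: recent commits count for
    more than older ones, as the description above says.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def relative_score(raw: float, all_raw) -> float:
    """Map a raw score to a 0-10 scale by the share of projects scoring strictly lower.

    Under this assumed mapping, 9.0 means 90% of tracked projects score
    lower, i.e. the project is in the top 10%.
    """
    below = sum(1 for r in all_raw if r < raw)
    return round(10.0 * below / len(all_raw), 1)
```

Under these assumptions, a project whose raw activity is the highest of ten tracked projects would get a relative score of 9.0.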
dvc
- Why bad scientific code beats code following "best practices"
What you’re describing sounds like DVC (at a higher-ish—80%-solution level).
See pachyderm too.
- First 15 Open Source Advent projects
10. DVC by Iterative | Github | tutorial
- Exploring Open-Source Alternatives to Landing AI for Robust MLOps
Platforms such as MLflow monitor the development stages of machine learning models. In parallel, Data Version Control (DVC) brings version control system-like functions to the realm of data sets and models.
- ML Experiments Management with Git
- Git Version Controlled Datasets in S3
I used DVC (https://dvc.org/) for some time to help solve this, but the storage connections were getting hard to manage and I ran into cache issues a lot. This solves it using git-lfs itself.
- Ask HN: How do your ML teams version datasets and models?
- Exploring MLOps Tools and Frameworks: Enhancing Machine Learning Operations
DVC (Data Version Control):
- Evaluate and Track Your LLM Experiments: Introducing TruLens for LLMs
- [D] Is there a tool to keep track of my ML experiments?
I have been using DVC and MLflow since back when DVC had only data tracking and MLflow only model tracking. I can say both are awesome now. The only thing I would mention is that, IMO, MLflow is a bit harder to learn, while DVC is practically just Git.
- Where do I best store my test data when using github for code?
I use DVC, which works decently well and can be hooked into Git.
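Several of the mentions above describe DVC as bringing Git-like versioning to datasets and models. The general idea behind that kind of tool can be sketched as follows: store the large file in a cache keyed by its content hash, and commit only a small pointer file to Git. This is a conceptual illustration, not DVC's actual code or file format; the `track` and `restore` helpers, the `.ptr.json` suffix, and the use of SHA-256 are all hypothetical choices.

```python
import hashlib
import json
import shutil
from pathlib import Path


def track(data_path: Path, cache_dir: Path) -> Path:
    """Copy data_path into a hash-addressed cache and write a small pointer file.

    Only the pointer file would be committed to Git; the cache holds the data.
    """
    digest = hashlib.sha256(data_path.read_bytes()).hexdigest()
    cached = cache_dir / digest[:2] / digest[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():  # identical content is stored only once
        shutil.copy2(data_path, cached)
    pointer = data_path.with_name(data_path.name + ".ptr.json")
    pointer.write_text(json.dumps({"path": data_path.name, "sha256": digest}, indent=2))
    return pointer


def restore(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file named in the pointer from the cache."""
    meta = json.loads(pointer.read_text())
    digest = meta["sha256"]
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_dir / digest[:2] / digest[2:], target)
    return target
```

Checking out an older Git commit would bring back an older pointer file, and `restore` would then fetch the matching version of the data from the cache.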
MLOps
- Deploying Azure Machine Learning Models to Prod Environments
Walk through this; it shows how to operationalise your ML pipeline: https://github.com/Microsoft/MLOps
- [D] How to maintain ML models?
Maybe something like this: https://github.com/microsoft/MLOps
What are some alternatives?
MLflow - Open source platform for the machine learning lifecycle
lakeFS - Data version control for your data lake | Git for data
mlops-with-vertex-ai - An end-to-end example of MLOps on Google Cloud using TensorFlow, TFX, and Vertex AI
Activeloop Hub - Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake]
pytorch-deepdream - PyTorch implementation of DeepDream algorithm (Mordvintsev et al.). Additionally I've included playground.py to help you better understand basic concepts behind the algo.
delta - An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
mllint - `mllint` is a command-line utility to evaluate the technical quality of Python Machine Learning (ML) projects by means of static analysis of the project's repository.
ploomber - The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
awesome-seml - A curated list of articles that cover the software engineering best practices for building machine learning applications.
aim - Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
MachineLearningNotebooks - Python notebooks with ML and deep learning examples with Azure Machine Learning Python SDK | Microsoft