🦉 ML Experiments and Data Management with Git (by iterative)

Dvc Alternatives

Similar projects and alternatives to dvc

  • Pandas

    402 dvc VS Pandas

    Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

  • gitignore

    A collection of useful .gitignore templates

  • OpenCV

    200 dvc VS OpenCV

    Open Source Computer Vision Library

  • datasette

    189 dvc VS datasette

    An open source multi-tool for exploring and publishing data

  • Airflow

    173 dvc VS Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

  • git-lfs

    Git extension for versioning large files

  • ploomber

    121 dvc VS ploomber

    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

  • dolt

    94 dvc VS dolt

    Dolt – Git for Data

  • Flyway

    84 dvc VS Flyway

    Flyway by Redgate • Database Migrations Made Easy.

  • delta

    71 dvc VS delta

    An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs (by delta-io)

  • MLflow

    61 dvc VS MLflow

    Open source platform for the machine learning lifecycle

  • Activeloop Hub

    Discontinued Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake] (by activeloopai)

  • metaflow

    25 dvc VS metaflow

    :rocket: Build and manage real-life ML, AI, and data science projects with ease!

  • VFSForGit

    Virtual File System for Git: Enable Git at Enterprise Scale

  • EdenSCM

    23 dvc VS EdenSCM

    Discontinued A Scalable, User-Friendly Source Control System. [Moved to: https://github.com/facebook/sapling]

  • oxen-release

    Lightning fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets as easy as versioning code.

  • guildai

    16 dvc VS guildai

    Experiment tracking, ML developer tools

  • dud

    A lightweight CLI tool for versioning data alongside source code and building data pipelines.

  • spock

    12 dvc VS spock

    spock is a framework that helps manage complex parameter configurations during research and development of Python applications (by fidelity)

  • lakeFS

    lakeFS - Data version control for your data lake | Git for data

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better dvc alternative or higher similarity.

dvc discussion


    Review ★★★☆☆ 6/10

dvc reviews and mentions

Posts with mentions or reviews of dvc. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-07-11.
  • 25 Open Source AI Tools to Cut Your Development Time in Half
    8 projects | dev.to | 11 Jul 2024
    Implementing version control for machine learning projects entails managing both code and the datasets, ML models, performance metrics, and other development-related artifacts. Its purpose is to bring the best practices from software engineering, like version control and reproducibility, to the world of data science and machine learning. DVC enables data scientists and ML engineers to track changes to data and models like Git does for code, making it able to run on top of any Git repository. It enables the management of model experiments.
  • Essential Deep Learning Checklist: Best Practices Unveiled
    20 projects | dev.to | 17 Jun 2024
    Tool: Consider using Data Version Control (DVC) to manage your datasets, models, and their respective versions. DVC integrates with Git, allowing you to handle large data files and model binaries without cluttering your repository. It also makes it easy to version your training datasets and models, ensuring you can always match a model back to its exact training environment.
  • 10 Open Source Tools for Building MLOps Pipelines
    9 projects | dev.to | 6 Jun 2024
    As Git helps you with code versions and the ability to roll back to previous versions for code repositories, DVC has built-in support for tracking your data and model. This helps machine learning teams reproduce the experiments run by your fellows and facilitates collaboration. DVC is based on the principles of Git and is easy to learn since the commands are similar to those of Git. Other benefits of using DVC include:
  • A step-by-step guide to building an MLOps pipeline
    7 projects | dev.to | 4 Jun 2024
    The metadata and model artifacts from experiment tracking can contain large amounts of data, such as the training model files, data files, metrics and logs, visualizations, configuration files, checkpoints, etc. In cases where the experiment tool doesn't support data storage, an alternative option is to track the training and validation data versions per experiment. They use remote data storage systems such as S3 buckets, MinIO, Google Cloud Storage, etc., or data versioning tools like Data Version Control (DVC) or Git LFS (Large File Storage) to version and persist the data. These options facilitate collaboration but have artifact-model traceability, storage costs, and data privacy implications.
  • AI Strategy Guide: How to Scale AI Across Your Business
    4 projects | dev.to | 11 May 2024
    Level 1 of MLOps is when you've put each lifecycle stage and their interfaces in an automated pipeline. The pipeline could be a Python or Bash script, or it could be a directed acyclic graph run by some orchestration framework like Airflow, Dagster, or one of the cloud-provider offerings. AI- or data-specific platforms like MLflow, ClearML, and DVC also feature pipeline capabilities.
  • My Favorite DevTools to Build AI/ML Applications!
    9 projects | dev.to | 23 Apr 2024
    Collaboration and version control are crucial in AI/ML development projects due to the iterative nature of model development and the need for reproducibility. GitHub is the leading platform for source code management, allowing teams to collaborate on code, track issues, and manage project milestones. DVC (Data Version Control) complements Git by handling large data files, data sets, and machine learning models that Git can't manage effectively, enabling version control for the data and model files used in AI projects.
  • Why bad scientific code beats code following "best practices"
    3 projects | news.ycombinator.com | 6 Jan 2024
    What you're describing sounds like DVC (at a higher-ish, 80%-solution level).


    See pachyderm too.

  • First 15 Open Source Advent projects
    16 projects | dev.to | 15 Dec 2023
    10. DVC by Iterative | Github | tutorial
  • Exploring Open-Source Alternatives to Landing AI for Robust MLOps
    18 projects | dev.to | 13 Dec 2023
    Platforms such as MLflow monitor the development stages of machine learning models. In parallel, Data Version Control (DVC) brings version control system-like functions to the realm of data sets and models.
  • ML Experiments Management with Git
    4 projects | news.ycombinator.com | 2 Nov 2023


Basic dvc repo stats
5 days ago

iterative/dvc is an open source project licensed under Apache License 2.0 which is an OSI approved license.

The primary programming language of dvc is Python.
