Dvc Alternatives

Similar projects and alternatives to dvc

  • MLflow

    dvc VS MLflow

    Open source platform for the machine learning lifecycle

  • guildai

    dvc VS guildai

    Experiment tracking, ML developer tools

  • ploomber

    dvc VS ploomber

    The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️

  • Activeloop Hub

    dvc VS Activeloop Hub

    Dataset format for AI. Build, manage, query & visualize datasets for deep learning. Stream data real-time to PyTorch/TensorFlow & version-control it. https://activeloop.ai (by activeloopai)

  • aim

    dvc VS aim

    Aim 💫 — easy-to-use and performant open-source ML experiment tracker.

  • git-submodules

    Git Submodule alternative with equivalent features, but easier to use and maintain.

  • MLOps

    dvc VS MLOps

    MLOps examples

  • VFSForGit

    dvc VS VFSForGit

    Virtual File System for Git: Enable Git at Enterprise Scale

  • EdenSCM

    dvc VS EdenSCM

    EdenSCM is a cross-platform, highly scalable source control management system.

  • spock

    dvc VS spock

    spock is a framework that helps manage complex parameter configurations during research and development of Python applications (by fidelity)

  • commitizen

    dvc VS commitizen

    Create committing rules for projects :rocket: auto bump versions :arrow_up: and auto changelog generation :open_file_folder:

  • git-lfs

    dvc VS git-lfs

    Git extension for versioning large files

  • pre-commit

    dvc VS pre-commit

    A framework for managing and maintaining multi-language pre-commit hooks.

  • Flyway

    dvc VS Flyway

    Flyway by Redgate • Database Migrations Made Easy.

  • datasette

    dvc VS datasette

    An open source multi-tool for exploring and publishing data

  • metaflow

    dvc VS metaflow

    :rocket: Build and manage real-life data science projects with ease!

  • lakeFS

    dvc VS lakeFS

    Git-like capabilities for your object storage

  • lowdefy

    dvc VS lowdefy

    An open-source, self-hosted, low-code framework to build internal tools, web apps, admin panels, BI dashboards, workflows, and CRUD apps with YAML or JSON.

  • labml

    dvc VS labml

    🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱

  • nessie

    dvc VS nessie

    Nessie: Transactional Catalog for Data Lakes with Git-like semantics

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better dvc alternative or higher similarity.

dvc reviews and mentions

Posts with mentions or reviews of dvc. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-04-12.
  • How do you track your experiments?
    1 project | reddit.com/r/deeplearning | 15 Apr 2022
  • Eden
    16 projects | news.ycombinator.com | 12 Apr 2022
    Data can and should be versioned, but not by just `git add BLOAT`. Take a look at https://dvc.org/: blobs are uploaded to an S3-compatible blob storage, metadata is recorded in a config file, and that file gets versioned in git.
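
The split described in this comment (large blobs in separate storage, a small metadata file in git) can be sketched roughly as follows. This is a minimal illustration and not DVC's actual implementation: the function names `track` and `restore`, the `.pointer.json` suffix, the SHA-256 hash, and the local directory standing in for S3-compatible storage are all assumptions made for the example.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def track(data_file: Path, blob_store: Path) -> Path:
    """Copy data_file into blob_store under its hash and write a pointer file.

    blob_store is a local directory standing in for remote blob storage
    (hypothetical; a real setup would upload to a bucket instead).
    """
    checksum = file_sha256(data_file)
    blob_path = blob_store / checksum[:2] / checksum[2:]
    blob_path.parent.mkdir(parents=True, exist_ok=True)
    if not blob_path.exists():  # identical content is stored only once
        shutil.copyfile(data_file, blob_path)
    # The small pointer file is what would be committed to git.
    pointer = data_file.with_name(data_file.name + ".pointer.json")
    pointer.write_text(
        json.dumps({"sha256": checksum, "path": data_file.name}, indent=2)
    )
    return pointer


def restore(pointer: Path, blob_store: Path, target: Path) -> None:
    """Recreate the data file at target from the hash recorded in pointer."""
    checksum = json.loads(pointer.read_text())["sha256"]
    shutil.copyfile(blob_store / checksum[:2] / checksum[2:], target)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "train.csv"
        data.write_text("x,y\n1,2\n")
        pointer = track(data, root / "blob_store")
        data.unlink()  # simulate a fresh checkout that has only the pointer
        restore(pointer, root / "blob_store", data)
        print(pointer.read_text())
```

Because the pointer file holds only a hash and a file name, git history stays small no matter how large the data files grow.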
  • GitHub for code but where/how do you organize your datafiles?
    2 projects | reddit.com/r/github | 29 Mar 2022
    In the field of machine learning more and more people are using dvc: think of it as a "git" for data with native integration to git itself.
  • How can I have an organized workflow in R?
    2 projects | reddit.com/r/datascience | 8 Mar 2022
    If you have datasets that are constantly updated, you might find DVC (https://github.com/iterative/dvc) helpful.
  • DevOps Fundamentals for Deep Learning Engineers
    6 projects | reddit.com/r/deeplearning | 20 Feb 2022
    MLOps is a HUGE area to explore, and not surprisingly, there are many startups showing up in this space. If you want to get in on the latest trends, then I would look at workflow orchestration frameworks such as Metaflow (started off at Netflix, is now spinning off into its own enterprise business, https://metaflow.org/), Kubeflow (used at Google, https://www.kubeflow.org/), Airflow (used at Airbnb, https://airflow.apache.org/), and Luigi (used at Spotify, https://github.com/spotify/luigi). Then you have the model serving itself, so there is Seldon (https://www.seldon.io/), Torchserve (https://pytorch.org/serve/), and TensorFlow Serving (https://www.tensorflow.org/tfx/guide/serving). You also have the actual export and transfer of DL models, and ONNX is the most popular here (https://onnx.ai/). Spark (https://spark.apache.org/) still holds up nicely after all these years, especially if you are doing batch predictions on massive amounts of data. There is also the GitFlow way of doing things, and Data Version Control (DVC, https://dvc.org/) has taken pole position there.
  • Do you guys actually know how to use git?
    2 projects | reddit.com/r/datascience | 12 Feb 2022
    ML teams should also review DVC (see https://dvc.org/). It is useful for versioning code, datasets, and ML models, and it becomes a useful tool for ML experiment tracking too.
  • DoltLab v0.2.0
    5 projects | news.ycombinator.com | 11 Feb 2022
  • [N] Experiment tracking with DvC and Guild AI
    2 projects | reddit.com/r/MachineLearning | 8 Feb 2022
    I'm the author of Guild AI (open source experiment tracking). For some time now Guild users have asked for DvC support. This is now available as a pre-release.
  • Data Science Workflows — Notebook to Production
    7 projects | dev.to | 8 Feb 2022
    At DagsHub, we’re integrated with DVC, which I love using. First and foremost, it’s open-source. It provides pipeline capabilities and supports many cloud providers for remote storage. Also, DVC acts as an extension to Git, which allows you to keep using the standard Git flow in your work. If you don’t want to use both tools, I recommend using FDS, an open-source tool that makes version control for machine learning fast & easy. It combines Git and DVC under one roof and takes care of code, data, and model versioning. (Bias alert: DagsHub developed FDS)
    7 projects | dev.to | 8 Feb 2022
    Git was designed for managing software development projects and for versioning text/code files, so it doesn't handle large files well. Git LFS (Large File Storage) was released to address large file versioning; it does better than plain Git, but fails when scaling. Also, neither Git nor Git LFS is optimized for data science workflows. To overcome this challenge, many powerful tools have emerged in recent years, such as DVC, Delta Lake, lakeFS, and more.
  • How do i create a single workspace when i work on multiple devices
    1 project | reddit.com/r/bioinformatics | 7 Feb 2022
    In addition to what others have said about using github, you should consider using a data versioning tool like DVC. It's very flexible and allows for syncing with all the main cloud storage servers, as well as direct syncing between devices using ssh. Plus you get all the benefits of having your data versions linked with your code versions in git.
  • [D] Why doesn’t your team use an experiment tracking tool?
    6 projects | reddit.com/r/MachineLearning | 7 Feb 2022
    I've been integrating DVC into our pipeline, from data processing to tracking experiment metrics and hyperparameters. The data versioning works quite well for us. We use a high performance networked volume for storing the cache (AWS FSx for Lustre) and use AWS S3 for perpetual storage of data, models, and other dependencies & outputs. The workspace is hard-linked to the cache.
    6 projects | reddit.com/r/MachineLearning | 7 Feb 2022
    Unfortunately, there are some issues with `dvc exp` --- the set of experiment tracking subcommands. In particular, I rely heavily on git submodules to partition the code that instantiates a model from the code that runs an experiment. But `dvc exp` doesn't work with submodules ATM. (Bug filed here.) This is unfortunate because, if `dvc exp` worked, it would make experiment tracking a little more convenient for us. It's not a deal breaker though. I use git branches to organize individual experiments and tags to organize stages of the same experiment. I use a shared dvc cache so that I can run multiple experiments at a time without using up too much workspace storage.
  • Autodocumenting Makefiles
    7 projects | news.ycombinator.com | 31 Jan 2022
    For data science specifically, I would strongly suggest looking into DVC: https://dvc.org/.

    You can easily write DVC stage files by hand as a straightforward Makefile replacement, and integrate other features into your workflow as needed/desired.
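
The Makefile comparison comes down to one rule: rerun a command only when its declared inputs have changed. A rough sketch of that rule, under stated assumptions, is below. It is not DVC's stage-file format or its implementation: `run_stage`, the JSON lock file, and the SHA-256 hashing are hypothetical names and choices used only to illustrate the idea.

```python
from __future__ import annotations

import hashlib
import json
import subprocess
from pathlib import Path


def digest(path: Path) -> str:
    """Return the hex SHA-256 digest of a (small) file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_stage(cmd: list[str], deps: list[Path], outs: list[Path], lock: Path) -> bool:
    """Run cmd unless every dependency is unchanged and every output exists.

    Dependency hashes from the last successful run are kept in `lock`
    (a hypothetical JSON file). Returns True if the command was executed.
    """
    current = {str(p): digest(p) for p in deps}
    previous = json.loads(lock.read_text()) if lock.exists() else None
    if previous == current and all(p.exists() for p in outs):
        return False  # stage is up to date, skip it
    subprocess.run(cmd, check=True)
    # Record hashes only after the command succeeded.
    lock.write_text(json.dumps(current, indent=2, sort_keys=True))
    return True
```

Unlike Make, which compares file timestamps, this sketch compares content hashes, so touching a file without changing it does not trigger a rerun.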

Stats

Basic dvc repo stats
  • Mentions: 64
  • Stars: 9,715
  • Activity: 9.8
  • Last commit: 5 days ago

iterative/dvc is an open source project licensed under Apache License 2.0 which is an OSI approved license.
