Python data-version-control

Open-source Python projects categorized as data-version-control

Top 4 Python data-version-control Projects

  • dvc

    🦉 ML Experiments and Data Management with Git

    Project mention: Why bad scientific code beats code following "best practices" | news.ycombinator.com | 2024-01-06

    What you’re describing sounds like DVC (at a higher-ish—80%-solution level).

    https://dvc.org/

    See pachyderm too.

  • deeplake

    Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

    Project mention: FLaNK AI Weekly 25 March 2025 | dev.to | 2024-03-25
  • sgr

    sgr (command line client for Splitgraph) and the splitgraph Python library

    Project mention: Show HN: Loofi – Our AI-Powered SQL Query Builder | news.ycombinator.com | 2023-05-21
  • ZnTrack

    Create, visualize, run & benchmark DVC pipelines in Python & Jupyter notebooks.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source data-version-control projects in Python? This list will help you:

#  Project   Stars
1  dvc       13,093
2  deeplake   7,690
3  sgr          326
4  ZnTrack       41
