delta VS dvc

Compare delta and dvc and see how they differ.

delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs (by delta-io)
              delta                 dvc
Mentions      74                    121
Stars         8,023                 14,454
Growth        1.4%                  1.0%
Activity      9.8                   8.9
Last commit   3 days ago            5 days ago
Language      Scala                 Python
License       Apache License 2.0    Apache License 2.0
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
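
As a rough illustration of the growth figure, month-over-month growth can be read as the percentage change between two monthly star counts. The sketch below assumes that reading; the function name and the snapshot numbers are hypothetical and are not taken from the site's actual calculation.

```python
def month_over_month_growth(stars_prev: int, stars_now: int) -> float:
    """Percentage change in stars between two monthly snapshots.

    Assumption: growth = (now - previous) / previous * 100.
    """
    if stars_prev <= 0:
        raise ValueError("previous star count must be positive")
    return (stars_now - stars_prev) / stars_prev * 100.0


# Hypothetical snapshots, chosen only to make the arithmetic easy to follow.
prev_stars, current_stars = 1000, 1014
growth = month_over_month_growth(prev_stars, current_stars)
print(f"{growth:.1f}%")  # prints 1.4%
```

Under this reading, a project with 1,000 stars last month and 1,014 stars now shows 1.4% growth.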

delta

Posts with mentions or reviews of delta. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2025-04-10.
  • Twitter's 600-Tweet Daily Limit Crisis: Soaring GCP Costs and the Open Source Fix Elon Musk Ignored
    15 projects | dev.to | 10 Apr 2025
    Delta Lake: Delta Lake is an open-source storage layer that provides ACID transactions, scalable metadata management, and data versioning on top of existing data lakes. It aims to bring reliability and performance optimizations to big data workloads while ensuring data integrity and consistency.
  • Stream Processing Systems in 2025: RisingWave, Flink, Spark Streaming, and What's Ahead
    7 projects | dev.to | 27 Jan 2025
    When it comes to stream processing systems, Iceberg support varies across vendors. Databricks, which oversees Spark Streaming, focuses on Delta Lake. Apache Flink, heavily influenced by Alibaba’s contributions, promotes Paimon, an alternative to Iceberg. RisingWave, on the other hand, fully embraces Iceberg. Rather than focusing solely on one table format, RisingWave aims to support various catalog services, including AWS Glue Catalog, Polaris, and Unity Catalog.
  • Apache Iceberg
    7 projects | news.ycombinator.com | 25 Jan 2025
    Hidden partitioning is the most interesting Iceberg feature, because most of the very large datasets are timeseries fact tables.

    I don't remember seeing that in Delta Lake [1], which is probably because the industry standard benchmarks join date as a dimension table and do not use timestamp ranges instead of dates.

    [1] - https://github.com/delta-io/delta/issues/490

  • 25 Open Source AI Tools to Cut Your Development Time in Half
    8 projects | dev.to | 11 Jul 2024
    Delta Lake is a storage layer framework that provides reliability to data lakes. It addresses the challenges of managing large-scale data in lakehouse architectures, where data is stored in an open format and used for various purposes, like machine learning (ML). Data engineers can build real-time pipelines or ML applications using Delta Lake because it supports both batch and streaming data processing. It also brings ACID (atomicity, consistency, isolation, durability) transactions to data lakes, ensuring data integrity even with concurrent reads and writes from multiple pipelines.
  • Make Rust Object Oriented with the dual-trait pattern
    2 projects | dev.to | 8 Jul 2024
    There is a neat example of how a third-party project belonging to the Linux Foundation implements UserDefinedLogicalNodeCore: MetricObserver in delta-rs. The developer only had to use #[derive(Debug, Hash, Eq, PartialEq)] to get dyn_eq and dyn_hash implemented.
  • Delta Lake vs. Parquet: A Comparison
    2 projects | news.ycombinator.com | 19 Jan 2024
    Delta is pretty great; it lets you do upserts into tables in Databricks much more easily than without it.

    I think the website is here: https://delta.io

  • Understanding Parquet, Iceberg and Data Lakehouses
    4 projects | news.ycombinator.com | 29 Dec 2023
    I often hear references to Apache Iceberg and Delta Lake as if they’re two peas in the Open Table Formats pod. Yet…

    Here’s the Apache Iceberg table format specification:

    https://iceberg.apache.org/spec/

    As they like to say in patent law, anyone “skilled in the art” of database systems could use this to build and query Iceberg tables without too much difficulty.

    This is nominally the Delta Lake equivalent:

    https://github.com/delta-io/delta/blob/master/PROTOCOL.md

    I defy anyone to even scope out what level of effort would be required to fully implement the current spec, let alone what would be involved in keeping up to date as this beast evolves.

    Frankly, the Delta Lake spec reads like a reverse engineering of whatever implementation tradeoffs Databricks is making as they race to build out a lakehouse for every Fortune 1000 company burned by Hadoop (which is to say, most of them).

    My point is that I’ve yet to be convinced that buying into Delta Lake is actually buying into an open ecosystem. Would appreciate any reassurance on this front!

  • Getting Started with Flink SQL, Apache Iceberg and DynamoDB Catalog
    4 projects | dev.to | 18 Dec 2023
    Apache Iceberg is one of the three types of lakehouse; the other two are Apache Hudi and Delta Lake.
  • [D] Is there other better data format for LLM to generate structured data?
    1 project | /r/MachineLearning | 10 Dec 2023
    The Apache Spark / Databricks community prefers Apache Parquet or the Linux Foundation's delta.io over JSON.
  • Delta vs Iceberg: make love not war
    1 project | /r/MicrosoftFabric | 30 Jun 2023
    Delta 3.0 extends an olive branch. https://github.com/delta-io/delta/releases/tag/v3.0.0rc1
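
Several of the posts above mention upserts into Delta tables. The sketch below illustrates only the upsert semantics (update rows whose key already exists, insert the rest) in plain Python. It does not use Delta Lake's API, and the function, table, and column names are hypothetical.

```python
from typing import Dict, Iterable, Mapping

Row = Dict[str, object]


def upsert(
    target: Dict[object, Row],
    updates: Iterable[Mapping[str, object]],
    key: str = "id",
) -> Dict[object, Row]:
    """Return a new table with `updates` merged into `target` by `key`.

    Matching keys are updated column by column; new keys are inserted.
    The input table is left unchanged.
    """
    merged = {k: dict(row) for k, row in target.items()}
    for row in updates:
        row_key = row[key]
        if row_key in merged:
            merged[row_key].update(row)   # matched: update existing row
        else:
            merged[row_key] = dict(row)   # not matched: insert new row
    return merged


# Hypothetical data: row 2 is updated, row 3 is inserted.
table = {1: {"id": 1, "name": "alice"}, 2: {"id": 2, "name": "bob"}}
changes = [{"id": 2, "name": "robert"}, {"id": 3, "name": "carol"}]
result = upsert(table, changes)
print(sorted(result))
```

A table format with ACID transactions applies this kind of merge as a single atomic commit, which is the convenience the posts describe.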

dvc

Posts with mentions or reviews of dvc. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2025-03-21.

What are some alternatives?

When comparing delta and dvc you can also consider the following projects:

lakeFS - Data version control for your data lake | Git for data

MLflow - Open source platform for the machine learning lifecycle

delta-rs - A native Rust library for Delta Lake, with bindings into Python

LakeSoul - LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.

git-lfs - Git extension for versioning large files
