dolt vs Apache Calcite

Apache Calcite

Apache Calcite (by apache)

Projects Database Geospatial Calcite Java Big Data Hadoop SQL

calcite.apache.org

                 dolt                  Apache Calcite
Mentions         93                    28
Stars            16,971                4,363
Growth           2.9%                  2.1%
Activity         10.0                  9.0
Latest Commit    3 days ago            2 days ago
Language         Go                    Java
License          Apache License 2.0    Apache License 2.0

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

dolt

Posts with mentions or reviews of dolt. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-09.

A MySQL compatible database engine written in pure Go
10 projects | news.ycombinator.com | 9 Apr 2024

Hi, this is my project :)
For us this package is most important as the query engine that powers Dolt:
https://github.com/dolthub/dolt
We aren't the original authors but have contributed the vast majority of its code at this point. Here's the origin story if you're interested:
https://www.dolthub.com/blog/2020-05-04-adopting-go-mysql-se...
The Great Migration from MongoDB to PostgreSQL
1 project | news.ycombinator.com | 29 Mar 2024

It's a pretty good default stance, yeah.
We have been trying to convince people to use our new database [1] for several years and it's an uphill battle, because Postgres really is the best choice for most people. They really have to need our unique feature (version control) to even consider it over Postgres, and I don't blame them.
[1] https://github.com/dolthub/dolt
What I Talk About When I Talk About Query Optimizer (Part 1): IR Design
7 projects | news.ycombinator.com | 29 Jan 2024

We implemented a query optimizer with a flexible intermediate representation in pure Go:
https://github.com/dolthub/go-mysql-server
Getting the IR correct so that it's both easy to use and flexible enough to be useful is a really interesting design challenge. Our primary abstraction in the query plan is called a Node, and is way more general than the IR type described in the article from OP. This has probably hurt us: we only recently separated the responsibility to fetch rows into its own part of the runtime, out of the IR -- originally row fetching was coupled to the Node type directly.
This is also the query engine that Dolt uses:
https://github.com/dolthub/dolt
But it has a plug-in architecture, so you can use the engine on any data source that implements a handful of Go interfaces.
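
To make the plug-in idea concrete, here is a toy sketch of an engine that runs against any data source satisfying a small contract. It is written in Python rather than Go, and every name in it (Table, InMemoryTable, select) is hypothetical; it is not go-mysql-server's actual interface set, only an illustration of the pattern.

```python
from typing import Callable, Dict, Iterator, Optional, Protocol, Sequence, Tuple

Row = Tuple[object, ...]


class Table(Protocol):
    """Hypothetical contract a pluggable data source must satisfy."""

    def schema(self) -> Sequence[str]: ...

    def rows(self) -> Iterator[Row]: ...


class InMemoryTable:
    """One possible data source: rows held in a Python list."""

    def __init__(self, columns: Sequence[str], data: Sequence[Row]) -> None:
        self._columns = list(columns)
        self._data = [tuple(r) for r in data]

    def schema(self) -> Sequence[str]:
        return self._columns

    def rows(self) -> Iterator[Row]:
        return iter(self._data)


def select(
    table: Table,
    columns: Sequence[str],
    where: Optional[Callable[[Dict[str, object]], bool]] = None,
) -> Iterator[Row]:
    """Toy 'engine': filter and project rows from any Table implementation."""
    names = list(table.schema())
    indexes = [names.index(c) for c in columns]
    for row in table.rows():
        record = dict(zip(names, row))
        if where is None or where(record):
            yield tuple(row[i] for i in indexes)


people = InMemoryTable(["id", "name"], [(1, "ada"), (2, "grace")])
result = list(select(people, ["name"], where=lambda r: r["id"] > 1))
```

The engine only ever calls schema() and rows(), so a source backed by files, an HTTP API, or another database could be swapped in without touching select.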
Dolt – Git for Data
1 project | news.ycombinator.com | 18 Jan 2024
Dolt: A version-controlled SQL database
1 project | news.ycombinator.com | 5 Jan 2024
Show HN: DoltgreSQL – Version-Controlled Database, Like Git and PostgreSQL
7 projects | news.ycombinator.com | 1 Nov 2023

Just want to point out that we're announcing development on the project. It's absolutely not ready for mainstream use yet! We have Dolt (https://github.com/dolthub/dolt) which is production-ready and widely in use, but it uses MySQL's syntax and wire protocol. We are building the Dolt equivalent for PostgreSQL, which is DoltgreSQL, but it's only pre-alpha.
Pg_branch: Pre-alpha Postgres extension brings Neon-like branching
6 projects | news.ycombinator.com | 1 Oct 2023

Interesting that branching is now better supported and almost free. I wonder if merging can be simplified or whether it already is as simple and as fast as it can be?
I guess I am inspired by Dolt’s ability to branch and merge: https://github.com/dolthub/dolt
SQLedge: Replicate Postgres to SQLite on the Edge
9 projects | news.ycombinator.com | 9 Aug 2023

#. SQLite WAL mode
From https://www.sqlite.org/isolation.html https://news.ycombinator.com/item?id=32247085 :
> [sqlite] WAL mode permits simultaneous readers and writers. It can do this because changes do not overwrite the original database file, but rather go into the separate write-ahead log file. That means that readers can continue to read the old, original, unaltered content from the original database file at the same time that the writer is appending to the write-ahead log
#. superfly/litefs: a FUSE-based file system for replicating SQLite https://github.com/superfly/litefs
#. sqldiff: https://www.sqlite.org/sqldiff.html https://news.ycombinator.com/item?id=31265005
#. dolthub/dolt: https://github.com/dolthub/dolt
> Dolt can be set up as a replica of your existing MySQL or MariaDB database using standard MySQL binlog replication. Every write becomes a Dolt commit. This is a great way to get the version control benefits of Dolt and keep an existing MySQL or MariaDB database.
#. pganalyze/libpg_query: https://github.com/pganalyze/libpg_query :
> C library for accessing the PostgreSQL parser outside of the server environment
#. Ibis + Substrait [ + DuckDB ]
> ibis strives to provide a consistent interface for interacting with a multitude of different analytical execution engines, most of which (but not all) speak some dialect of SQL.
> Today, Ibis accomplishes this with a lot of help from `sqlalchemy` and `sqlglot` to handle differences in dialect, or we interact directly with available Python bindings (for instance with the pandas, datafusion, and polars backends).
> [...] `Substrait` is a new cross-language serialization format for communicating (among other things) query plans. It's still in its early days, but there is already nascent support for Substrait in Apache Arrow, DuckDB, and Velox.
#. benbjohnson/postlite: https://github.com/benbjohnson/postlite
> postlite is a network proxy to allow access to remote SQLite databases over the Postgres wire protocol. This allows GUI tools to be used on remote SQLite databases which can make administration easier.
> The proxy works by translating Postgres frontend wire messages into SQLite transactions and converting results back into Postgres response wire messages. Many Postgres clients also inspect the pg_catalog to determine system information so Postlite mirrors this catalog by using an attached in-memory database with virtual tables. The proxy also performs minor rewriting on these system queries to convert them to usable SQLite syntax.
> Note: This software is in alpha. Please report bugs. Postlite doesn't alter your database unless you issue INSERT, UPDATE, DELETE commands so it's probably safe. If anything, the Postlite process may die but it shouldn't affect your database.
#. > "Hosting SQLite Databases on GitHub Pages" (2021) re: sql.js-httpvfs, DuckDB https://news.ycombinator.com/item?id=28021766
#. awesome-db-tools https://github.com/mgramin/awesome-db-tools
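
As a rough illustration of the WAL behaviour quoted in the first item of the list above, here is a minimal sketch using Python's standard-library sqlite3 module. The file name, table name, and wal_demo helper are made up for the example; it assumes a local filesystem where SQLite can create its write-ahead log and shared-memory files.

```python
import os
import sqlite3
import tempfile


def wal_demo():
    """Show a writer committing while a reader holds an open read transaction."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None puts the connection in autocommit mode, so
    # transactions start only where we issue BEGIN explicitly.
    writer = sqlite3.connect(path, isolation_level=None)
    mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    writer.execute("CREATE TABLE t (x INTEGER)")
    writer.execute("INSERT INTO t VALUES (1)")

    reader = sqlite3.connect(path, isolation_level=None)
    reader.execute("BEGIN")
    # The first read inside the transaction fixes the reader's snapshot.
    before = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

    # The writer is not blocked by the open read transaction.
    writer.execute("INSERT INTO t VALUES (2)")

    # Still inside the same transaction: the reader keeps its old snapshot.
    during = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    reader.execute("COMMIT")

    # A new read after the transaction ends sees the appended row.
    after = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

    reader.close()
    writer.close()
    return mode, before, during, after


mode, before, during, after = wal_demo()
```
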
How do you sync dev databases across multiple devices?
2 projects | /r/PHP | 9 May 2023
Ask HN: Data Management for AI Training
3 projects | news.ycombinator.com | 30 Apr 2023

If you are just looking for data versioning there is Dolt:
https://github.com/dolthub/dolt
And that has a user-friendly UI in DoltHub:
https://www.dolthub.com/
You wouldn't store the images themselves in Dolt; those would likely be links to S3, but all the labels and surrounding metadata could be stored in Dolt.
DISCLAIMER: I'm the CEO of DoltHub so this is self-promotion.

Apache Calcite

Posts with mentions or reviews of Apache Calcite. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-07-26.

Data diffs: Algorithms for explaining what changed in a dataset (2022)
8 projects | news.ycombinator.com | 26 Jul 2023

> Make diff work on more than just SQLite.
Another approach that I've been wanting to try for a while is to implement the DIFF operator in Apache Calcite[0]. Using Calcite, DIFF could be implemented as rewrite rules that generate the appropriate SQL to be executed directly against the database, or the DIFF operator could be implemented outside of the database (which the original paper shows is more efficient).
[0] https://calcite.apache.org/
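
The "rewrite to SQL" option can be sketched without Calcite at all. The example below is an assumption-laden stand-in: it uses Python's sqlite3 module and plain EXCEPT queries instead of Calcite rewrite rules, the diff_sql and demo helpers and the table names are hypothetical, and it assumes both tables have identical column lists and that table names come from trusted input.

```python
import sqlite3


def diff_sql(old: str, new: str) -> str:
    """Rewrite a hypothetical 'DIFF old new' into plain SQL.

    Rows only in `old` are tagged 'removed'; rows only in `new` are
    tagged 'added'. A changed row shows up as one of each.
    """
    return (
        f"SELECT 'removed' AS change, * FROM "
        f"(SELECT * FROM {old} EXCEPT SELECT * FROM {new}) "
        f"UNION ALL "
        f"SELECT 'added' AS change, * FROM "
        f"(SELECT * FROM {new} EXCEPT SELECT * FROM {old})"
    )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE old_t (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE new_t (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO old_t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
    conn.executemany("INSERT INTO new_t VALUES (?, ?)", [(2, "b"), (3, "x"), (4, "d")])
    rows = conn.execute(diff_sql("old_t", "new_t")).fetchall()
    conn.close()
    # UNION ALL gives no ordering guarantee, so sort for a stable result.
    return sorted(rows)


changes = demo()
```

Because the generated statement is ordinary SQL, the same rewrite could in principle be pointed at any database that supports EXCEPT, which is the portability argument the comment is making.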
Apache Baremaps: online maps toolkit
6 projects | news.ycombinator.com | 28 May 2023

Yes, planetiler rocks and the memory mapped collections enabled us to remove our dependency on RocksDB.
From my perspective, planetiler started as an effort to generate vector tiles from the OpenMapTile schema as fast as possible (pbf -> mvt). By contrast, Baremaps started as an effort to create a new schema and style from the ground up. In this regard, having a database (pbf -> db <- mvt) enables live reloading of changes made in the configuration files. The database has a cost, but also comes with additional advantages (updates, dynamic data, generation of tiles at zoom levels 16+, etc.).
That being said, I think the two projects overlap and I hope we will find opportunities to collaborate in the future. For instance, whereas PostgreSQL is still required in Baremaps, I recently ported many of the ST_ functions of PostGIS to Apache Calcite with the intent to execute SQL on fast memory mapped collections.
https://github.com/apache/calcite/blob/main/core/src/main/ja...
A planet wide import in Postgis currently takes about 4 hours with the COPY API (easy to parallelize) followed by about 12 hours of simplification in Postgis (not easy to parallelize). I will try to publish a detailed benchmark in the future.
How to manipulate SQL string programmatically?
2 projects | /r/dataengineering | 28 Apr 2023

Use a SQL parser like sqlglot or Apache Calcite to compile the user's query into an AST.
Can SQL be used without an RDBMS?
7 projects | /r/PHP | 27 Feb 2023
Apache Calcite
1 project | news.ycombinator.com | 13 Feb 2023
Want to contribute more to open source projects.
8 projects | /r/dotnet | 18 Aug 2022
CITIC Industrial Cloud — Apache ShardingSphere Enterprise Applications
1 project | dev.to | 14 Apr 2022

The SQL Federation engine contains processes such as SQL Parser, SQL Binder, SQL Optimizer, Data Fetcher and Operator Calculator, suitable for dealing with correlated queries and subqueries across multiple database instances. At the underlying layer, it uses Calcite to implement RBO (Rule Based Optimizer) and CBO (Cost Based Optimizer) based on relational algebra, and queries the results through the optimal execution plan.
Postgres wire compatible SQLite proxy
14 projects | news.ycombinator.com | 31 Mar 2022

Awesome to see work in the DB wire compatible space. On the MySQL side, there was MySQL Proxy (https://github.com/mysql/mysql-proxy), which was scriptable with Lua, with which you could create your own MySQL wire compatible connections. Unfortunately it appears to have been abandoned by Oracle and IIRC doesn't work with 5.7 and beyond. I used it in the past to hack together a MySQL wire adapter for Interana (https://scuba.io/).
I guess these days the best approach for connecting arbitrary data sources to existing drivers, at least for OLAP, is Apache Calcite (https://calcite.apache.org/). Unfortunately that feels a little more involved.
Launch HN: Hydra (YC W22) – Query Any Database via Postgres
4 projects | news.ycombinator.com | 23 Feb 2022

For anyone interested, Apache Calcite[0] is an open source data management framework which seems to do many of the same things that Hydra claims to do, but taking a different approach. Operating as a Java library, Calcite contains "adapters" to many different data sources, from existing JDBC connectors to Elasticsearch to Cassandra. All of these different data sources can be joined together as desired. Calcite also has its own optimizer, which is able to push down relevant parts of the query to the different data sources. However, you get full SQL on data sources which don't support it, with Calcite executing the remaining bits itself.
Unfortunately, I would not be too surprised if Calcite was found to be less performance-optimized than Hydra. That said, there are users of Calcite at Google, Uber, Spotify, and others who have made great use of various parts of the framework.
[0] https://calcite.apache.org/
Anyone know of any software that can help in designing then outputting to various database
1 project | /r/DatabaseHelp | 21 Nov 2021

Abstraction Layer - You can use something like Calcite to abstract out your data storage. https://calcite.apache.org/

What are some alternatives?

When comparing dolt and Apache Calcite you can also consider the following projects:

liquibase - Main Liquibase Source

Trino - Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

absurd-sql - sqlite3 in ur indexeddb (hopefully a better backend soon)

ANTLR - ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

noms - The versioned, forkable, syncable database

Presto - The official home of the Presto distributed SQL query engine for big data

TimescaleDB - An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.

JSqlParser - JSqlParser parses an SQL statement and translates it into a hierarchy of Java classes. The generated hierarchy can be navigated using the Visitor Pattern

vitess - Vitess is a database clustering system for horizontal scaling of MySQL.

Apache Spark - Apache Spark - A unified analytics engine for large-scale data processing

temporal_tables - Temporal Tables PostgreSQL Extension

Apache Drill - Apache Drill is a distributed MPP query layer for self describing data
