Dolt is Git for Data: a SQL database that you can fork, clone, branch, merge

dolt

93 16,924 10.0 Go

Dolt – Git for Data
git-bug

56 7,993 6.3 Go

Distributed, offline-first bug tracker embedded in git, with bridges

It might not be exactly what you are looking for, but git-bug[1] encodes data into regular git objects, with merges and conflict resolution. I'm mentioning this because the hard part is providing an ordering of events. Once you have that, you can store and recreate whatever state you want.
This branch[2], which I'm almost done with, removes the purely linear branch constraint and allows full DAGs (that is, concurrent editing) while still providing a good ordering.
[1]: https://github.com/MichaelMure/git-bug
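To illustrate why a deterministic ordering over a DAG of events is enough to rebuild state, here is a minimal sketch. It is not git-bug's actual algorithm: it simply assumes each event has a unique, sortable id and a list of parent ids, and runs a topological sort that breaks ties between concurrent events by id. The function name `linearize` and the `parents` mapping are hypothetical.

```python
import heapq


def linearize(parents):
    """Return a deterministic topological order of an event DAG.

    `parents` maps each event id to the list of event ids it depends on.
    Every parent must itself be a key of the mapping. Concurrent events
    (neither depends on the other) are ordered by id, so any replica
    holding the same DAG computes the same sequence.
    """
    children = {event: [] for event in parents}
    pending = {event: len(deps) for event, deps in parents.items()}
    for event, deps in parents.items():
        for dep in deps:
            children[dep].append(event)

    # Events with no unprocessed parents are ready; the heap yields the smallest id first.
    ready = [event for event, count in pending.items() if count == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        event = heapq.heappop(ready)
        order.append(event)
        for child in children[event]:
            pending[child] -= 1
            if pending[child] == 0:
                heapq.heappush(ready, child)

    if len(order) != len(parents):
        raise ValueError("cycle detected in event graph")
    return order
```

With an order like this in hand, replaying the events one by one recreates the same state everywhere, regardless of the order in which each replica received them.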
sirdb

4 562 0.0 JavaScript

Discontinued. A simple, git diffable JSON database on yer filesystem. By the power of NodeJS [Moved to: https://github.com/dosyago/sirdb] (by 00000o1)

I strike a balance here by using git on JSON files. I build the JSON files into a database (1 file per record, 1 directory per table, subdirectories for indexes). The whole thing is pretty beautiful, and it's functioning well for a user-account and access-management database I'm running in production. I like that I can go back and run:
`git diff -p` to see the users who have signed up recently, for example.
You can get the code at: https://github.com/i5ik/sirdb
The advantages of this approach are using existing unix tooling for text files, solid versioning, easy inspectability, and leveraging the filesystem's B-tree indexing as a fast index structure (rather than having to write my own B-trees). Another advantage is hardware-linked scaling. For example, if I use regular hard disks, it's slower, but if I use SSDs, it's faster. It should also be possible to mount the DB on a RAM disk and make it super fast.
The disadvantages are that the database side still only supports a couple of operations (like exact, multikey searches, lookup by ID, and so on) rather than a rich query language. I'm OK with that for now, and I'm also thinking of using skiplists in the future to get a nice ordering property for the keys in an index, so I can easily iterate and page over those.
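The layout described above (one file per record, one directory per table, subdirectories for indexes) can be sketched roughly as follows. This is not sirdb's code, which is JavaScript; the class `JsonTable`, its methods, and the `index/<field>/<value>/<id>` layout are hypothetical choices for illustration. It assumes record ids and indexed values are safe to use as file names, and it does not clean up stale index entries when a record changes.

```python
import json
from pathlib import Path


class JsonTable:
    """One directory per table, one JSON file per record."""

    def __init__(self, root, name):
        self.dir = Path(root) / name
        self.dir.mkdir(parents=True, exist_ok=True)

    def put(self, record_id, record, index_fields=()):
        # Sorted keys and indentation keep the file stable line by line,
        # so a text diff of two versions stays small and readable.
        path = self.dir / f"{record_id}.json"
        path.write_text(json.dumps(record, indent=2, sort_keys=True) + "\n")
        # Exact-match index: an empty marker file at index/<field>/<value>/<id>.
        for field in index_fields:
            entry = self.dir / "index" / field / str(record[field]) / record_id
            entry.parent.mkdir(parents=True, exist_ok=True)
            entry.touch()

    def get(self, record_id):
        return json.loads((self.dir / f"{record_id}.json").read_text())

    def ids(self):
        # Non-recursive glob, so marker files under index/ are not listed.
        return sorted(p.stem for p in self.dir.glob("*.json"))

    def find(self, field, value):
        bucket = self.dir / "index" / field / str(value)
        return sorted(p.name for p in bucket.iterdir()) if bucket.is_dir() else []
```

Because every record is a small text file, committing the table directory to git gives per-record history for free, and lookups by id or by an indexed value are just directory lookups handled by the filesystem.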
noms

11 7,502 1.9 Go

Discontinued. The versioned, forkable, syncable database

Noms might be what you’re looking for (https://github.com/attic-labs/noms). Dolt is actually a fork of Noms.
nessie

13 822 9.9 Java

Nessie: Transactional Catalog for Data Lakes with Git-like semantics

I've been working on https://github.com/projectnessie/nessie for about a year now. It's similar to Dolt in spirit but aimed at big data/data lakes. Would welcome feedback from the community.
It's very exciting to see this field picking up speed. Tons of interesting problems to be solved :-)
terminusdb

51 2,615 9.1 Prolog

TerminusDB is a distributed database with a collaboration model

Thanks for the mention of TerminusDB (https://github.com/terminusdb/terminusdb). We are the graph cousins of Dolt. Great to see so much energy in the version control database world!
We are currently focused on data mesh use cases. Rather than trying to be GitHub for Data, we're trying to be YOUR GitHub for Data. Get all that good git lineage, pull, push, clone etc. and have data producers in your org own their data.
We see lots of organizations with big 'shadow data' problems and data being centrally managed rather than curated by domain experts.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.


Gothub: Alternative front-end for GitHub written with Go
5 projects | news.ycombinator.com | 1 Sep 2023
GitDB, a distributed embeddable database on top of Git
4 projects | news.ycombinator.com | 5 Jul 2022
Git as a Storage
8 projects | news.ycombinator.com | 8 Oct 2021
How to use fly.io and Tigris to deploy a Next.js app
3 projects | dev.to | 2 Apr 2024
Radicle: Peer-to-Peer Collaboration with Git
3 projects | news.ycombinator.com | 30 Mar 2024

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
Database Git Prolog Bugtracker Data
Post date: 6 Mar 2021
