Dolt is Git for Data: a SQL database that you can fork, clone, branch, merge

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • dolt

    Dolt – Git for Data

  • git-bug

    Distributed, offline-first bug tracker embedded in git, with bridges

    It might not be exactly what you are looking for, but git-bug[1] encodes data into regular git objects, with merges and conflict resolution. I'm mentioning this because the hard part is providing an ordering of events. Once you have that, you can store and recreate whatever state you want.

    This branch[2], which I'm almost done with, removes the purely linear branch constraint and allows full DAGs (that is, concurrent editing) while still providing a good ordering.

    [1]: https://github.com/MichaelMure/git-bug
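
    To illustrate the point about ordering, here is a minimal sketch of replaying operations in a deterministic order to rebuild state. It is not git-bug's actual algorithm or data model: the `Op` type, the Lamport-style clock field, and the last-writer-wins rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """A hypothetical edit operation (not git-bug's real format)."""
    lamport: int  # logical clock value assigned when the op was created
    op_id: str    # unique id, used only to break ties deterministically
    field: str    # which field of the record the op sets
    value: str    # the new value


def replay(ops):
    """Rebuild state by applying ops in a total order (last writer wins)."""
    state = {}
    for op in sorted(ops, key=lambda o: (o.lamport, o.op_id)):
        state[op.field] = op.value
    return state


# Two replicas that edited the same record concurrently.
ops_a = [Op(1, "a1", "title", "Crash on start"), Op(2, "a2", "status", "open")]
ops_b = [Op(2, "b1", "title", "Crash on startup"), Op(3, "b2", "status", "closed")]

# Whichever side merges first, the replayed state is the same.
merged_1 = replay(ops_a + ops_b)
merged_2 = replay(ops_b + ops_a)
```

    Because every replica sorts by the same key, the order in which the operations arrive does not affect the final state.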

  • sirdb

    Discontinued. A simple, git diffable JSON database on yer filesystem. By the power of NodeJS [Moved to: https://github.com/dosyago/sirdb] (by 00000o1)

    I strike a balance here by using git on JSON files, and I build the JSON files into a database (1 file per record, 1 directory per table, subdirectories for indexes). The whole thing is pretty beautiful, and it's functioning well for a user-account and access-management database I'm running in production. I like that I can go back and run:

    `git diff -p` to see the users who have signed up recently, for example.

    You can get the code at: https://github.com/i5ik/sirdb

    The advantages of this approach are using existing unix tooling for text files, solid versioning, easy inspectability, and leveraging the filesystem's B-tree indexing as a fast index structure (rather than having to write my own B-trees). Another advantage is hardware-linked scaling. For example, if I use regular hard disks, it's slower, but if I use SSDs, it's faster. It should also be possible to mount the DB as a RAM disk and make it super fast.

    The disadvantages are that the database side still only supports a couple of operations (like exact multikey searches, lookup by ID, and so on) rather than a rich query language. I'm OK with that for now, and I'm also thinking of using skiplists in the future to get a nice ordering property for the keys in an index, so I can easily iterate and page over those.
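
    The layout described above (1 file per record, 1 directory per table, subdirectories for indexes) can be sketched roughly as follows. This is not sirdb's API (sirdb is written for NodeJS); the function names, the `_index` directory, and the assumption that indexed values are safe to use as directory names are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def put(root: Path, table: str, record_id: str, record: dict) -> Path:
    """Write one record as one JSON file inside the table's directory."""
    table_dir = root / table
    table_dir.mkdir(parents=True, exist_ok=True)
    path = table_dir / f"{record_id}.json"
    # Sorted keys and indentation keep `git diff` output small and stable.
    path.write_text(json.dumps(record, indent=2, sort_keys=True) + "\n")
    return path


def get(root: Path, table: str, record_id: str) -> dict:
    """Look up a record by ID by reading its file."""
    return json.loads((root / table / f"{record_id}.json").read_text())


def index_put(root: Path, table: str, field: str, value: str, record_id: str) -> None:
    """Add an exact-match index entry: _index/<field>/<value>/<record_id>."""
    # Assumes `value` is safe to use as a directory name.
    entry = root / table / "_index" / field / value
    entry.mkdir(parents=True, exist_ok=True)
    (entry / record_id).touch()


def index_lookup(root: Path, table: str, field: str, value: str) -> list:
    """Return the IDs of records whose indexed field equals `value`."""
    entry = root / table / "_index" / field / value
    return sorted(p.name for p in entry.iterdir()) if entry.is_dir() else []


# Demo in a throwaway directory; in practice `root` would be a git repository.
root = Path(tempfile.mkdtemp())
put(root, "users", "u1", {"email": "ada@example.com", "role": "admin"})
index_put(root, "users", "role", "admin", "u1")
found = index_lookup(root, "users", "role", "admin")
loaded = get(root, "users", "u1")
```

    In this sketch the filesystem's own directory lookup does the indexing work, which is the trade-off the comment describes: exact-match lookups are cheap, but anything richer needs extra structure.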

  • noms

    Discontinued. The versioned, forkable, syncable database

    Noms might be what you’re looking for (https://github.com/attic-labs/noms). Dolt is actually a fork of Noms.

  • nessie

    Nessie: Transactional Catalog for Data Lakes with Git-like semantics

    I've been working on https://github.com/projectnessie/nessie for about a year now. It's similar to Dolt in spirit but aimed at big data/data lakes. Would welcome feedback from the community.

    It's very exciting to see this field picking up speed. Tons of interesting problems to be solved :-)

  • terminusdb

    TerminusDB is a distributed database with a collaboration model

    Thanks for the mention of TerminusDB (https://github.com/terminusdb/terminusdb). We are the graph cousins of Dolt. Great to see so much energy in the version control database world!

    We are currently focused on data mesh use cases. Rather than trying to be GitHub for Data, we're trying to be YOUR GitHub for Data. Get all that good git lineage, pull, push, clone etc. and have data producers in your org own their data.

    We see lots of organizations with big 'shadow data' problems and data being centrally managed rather than curated by domain experts.
