| | yata | go-ds-crdt |
|---|---|---|
| Mentions | 1 | 7 |
| Stars | 1 | 365 |
| Growth | - | 3.0% |
| Activity | 0.0 | 6.1 |
| Last commit | almost 2 years ago | 4 months ago |
| Language | Python | Go |
| License | - | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
yata
We Put IPFS in Brave
Does anyone understand merkle CRDTs?
How do you handle conflicts when two concurrent events occur? Who wins? I know timestamps are unreliable, but I want last-write-wins behaviour and a seamless merge. The paper leaves data-layer conflict resolution to the reader, though it does suggest sorting by CID.
After reading the Merkle-DAGs Meet CRDTs whitepaper, I had a go at implementing a Merkle-Clock. It's incomplete; I still need to maintain the partial order of "occurs before".
https://github.com/samsquire/merkle-crdt
I also implemented part of the YATA algorithm yesterday, so I think I could merge its plain-text merging functionality with the Merkle-CRDT.
https://github.com/samsquire/yata
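The "occurs before" partial order mentioned above falls out of the Merkle DAG itself: an event A occurs before B exactly when A is reachable from B through parent links. A minimal Python sketch of that idea (illustrative only, not code from either linked repo; `Event` and the sha256-based CID stand-in are assumptions):

```python
import hashlib

class Event:
    """One event in a Merkle-Clock DAG. Its CID is derived from its
    payload plus the CIDs of the heads it references, so the causal
    history is baked into the identifier."""

    def __init__(self, payload, parents=()):
        self.payload = payload
        self.parents = tuple(parents)  # heads known when this event was made
        self.cid = hashlib.sha256(
            (payload + "".join(p.cid for p in self.parents)).encode()
        ).hexdigest()

def occurs_before(a, b):
    """True iff event a is in the causal history of event b,
    i.e. a is an ancestor of b in the Merkle DAG."""
    stack = list(b.parents)
    seen = set()
    while stack:
        e = stack.pop()
        if e.cid == a.cid:
            return True
        if e.cid not in seen:
            seen.add(e.cid)
            stack.extend(e.parents)
    return False
```

Two events that merely share an ancestor are concurrent: `occurs_before` returns False in both directions, which is exactly the case where a tie-breaking rule (like sorting by CID) is needed.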
go-ds-crdt
CRDTs Turned Inside Out
I forgot: key-value store using MD-CRDTs was implemented here: https://github.com/ipfs/go-ds-crdt
The trickiest part was not the CRDT, but the DAG traversal with multiple workers processing parallel updates on multiple branches and switching CRDT-DAG roots as they finish branches.
We Put IPFS in Brave
In https://github.com/ipfs/go-ds-crdt, every node in the Merkle DAG has a "Priority" field. When adding a new head, this is set to (maximum of the priorities of the children)+1.
Thus, this priority represents the current depth (or height) of the DAG at each node. It is sort of a timestamp, and you could use an actual timestamp instead, or whatever helps you sort. In the case of concurrent writes, the write with the highest priority wins. If concurrent writes have the same priority, they are sorted by CID.
The idea here is that, in general, a node that is lagging behind or not syncing would have a shallower DAG, so its writes would have lower priority when they conflict with writes from others that have built deeper DAGs. But this is, after all, an implementation choice, and the fact that a DAG is deeper does not mean that the last write on a key happened "later".
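The rule described above (priority = max of children's priorities + 1, highest priority wins, ties broken by CID) can be sketched in a few lines of Python. This is an illustration of the rule as stated, not go-ds-crdt's actual code; the tie-break direction and the function names are assumptions:

```python
def node_priority(child_priorities):
    """Priority of a new head: (max of the children's priorities) + 1.
    A root node with no children gets priority 1."""
    return max(child_priorities, default=0) + 1

def winning_write(a, b):
    """Resolve two concurrent writes to the same key.

    Each write is a (priority, cid, value) tuple. Higher priority wins;
    on equal priority the tie is broken by comparing CIDs (here the
    lexicographically larger CID wins, an arbitrary but deterministic
    convention). Python's tuple comparison does both steps at once."""
    return max(a, b)
```

Because every replica applies the same deterministic rule, they all converge on the same value for the key regardless of the order in which they receive the writes.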
Making CRDTs Byzantine Fault Tolerant [pdf]
The idea of DAG-embedded CRDTs is far from new and was introduced here:
https://arxiv.org/abs/2004.00107 (I'm among the authors)
Unfortunately, the verification the author proposes (not accepting new updates until the DAG below them is verified) would need a lot of caveats for real-world usage.
Currently we use these CRDTs for a key-value database of 40M+ keys in an ipfs-cluster deployment, which uses https://github.com/ipfs/go-ds-crdt .
- Ask HN: P2P Databases?
- Go-ds-CRDT: distributed datastore using Merkle-CRDTs
- Conflict-free replicated datatypes solve distributed data consistency challenges
Data Laced with History: Causal Trees and Operational CRDTs (2018)
Not 100% the thing, but potentially related work in this area:
https://github.com/ipfs/go-ds-crdt
(See link to paper, and links to other projects in it, like OrbitDB).
What are some alternatives?
merkle-crdt - Merkle-Clock CRDT implementation in python
Go IPFS - IPFS implementation in Go [Moved to: https://github.com/ipfs/kubo]
differential-dataflow - An implementation of differential dataflow using timely dataflow on Rust.
verneuil - Verneuil is a VFS extension for SQLite that asynchronously replicates databases to S3-compatible blob stores.
yjs - Shared data types for building collaborative software
Apache Ignite - Apache Ignite
crdt-study - A Python study of distributed, conflict-free Last-Writer-Wins (LWW) undirected graphs
bft-crdts - Byzantine Fault Tolerant CRDT's and other Eventually Consistent Algorithms