teletype-crdt
DISCONTINUED
diamond-types
Our great sponsors
- Appwrite - The Open Source Firebase alternative introduces iOS support
- Klotho - AWS Cloud-aware infrastructure-from-code toolbox [NEW]
- InfluxDB - Access the most powerful time series database as a service
- ONLYOFFICE ONLYOFFICE Docs — document collaboration in your environment
- Sonar - Write Clean JavaScript Code. Always.
teletype-crdt | diamond-types | |
---|---|---|
3 | 14 | |
732 | 1,228 | |
- | - | |
0.0 | 2.7 | |
6 months ago | 9 days ago | |
JavaScript | Rust | |
MIT License | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
teletype-crdt
-
5000x Faster CRDTs: An Adventure in Optimization
Cool! It'd be interesting to see those CRDT implementations added to Kevin Jahns' CRDT Benchmarks page[1]. The LogootSplit paper looks interesting. It looks like xray is abandoned, and I'm not sure about teletype. Though teletype's CRDT looks to be entirely implemented in javascript[2]? If the authors are around I'd love to see some benchmarks so we can compare approaches and learn what actually works well.
And I'm not surprised these techniques have been invented before. Realising a tree is an appropriate data structure here is a pretty obvious step if you have a mind for data structures.
To name it, I often find myself feeling defensive when people read my work and respond with a bunch of links to academic papers. Its probably totally unfair and a complete projection from my side, but I hear a voice in my head reword your comment to instead say something awful like: "Cool, but everything you did was done before. Even if they didn't make any of their work practical, usable or good they still published first and you obviously didn't do a good enough literature review if you didn't know that." And I feel an unfair defensiveness arise in me as a result that wants to find excuses to dismiss the work, even if the work might be otherwise interesting.
Its hard to compare their benchmark results because they used synthetic randomized editing traces, which always have different performance profiles than real edits for this stuff. Their own university gathered some great real world data in an earlier study. It would have been much more instructive if that data set was used here. At a glance their RAM usage looks to be about 2 orders of magnitude worse than diamond-types or yjs. And their CPU usage... ?? I can't tell because they have no tables of results. Just some hard to read charts with log scales, so you can't even really eyeball the figures. So its really hard to tell if their work ends up performance-competitive without spending a couple days getting their enterprise style java code running with a better data set. Do you think thats worth doing?
-
Atom Teletype's peer-to-peer connection
1) crdt
diamond-types
-
Elixir and Rust is a good mix
But I think thats about it. Maybe there's more manually specified types in "normal" rust because most functions are smaller than that. But, it doesn't feel so bad. In this case I could probably even remove the explicit type annotation for that queue definition if I wanted to, but it makes the compiler's errors better leaving it in.
[1] https://github.com/josephg/diamond-types/blob/66025b99dbe390...
-
Automerge 2.0
diamond-types (for reference for others [0]) still only supports plain text, is that right? I was thinking of using it for more general use cases such as an offline habit tracker, which isn't text of course, but I was interested to hear more on the progress towards other data types such as generic JSON data.
Currently for this use case I've been using autosurgeon [1] so far which has a nice Rust API for structs, even if it might be slower than yjs (or yrs, its Rust implementation) or diamond-types.
I'd also add
- Local First Software [https://news.ycombinator.com/item?id=31594613 (28 comments)] by Martin Kleppmann (who works on Automerge at the company Ink and Switch, perhaps better known as the author of Designing Data Intensive Applications), which introduces Automerge
- CRDTs: The Hard Parts [https://news.ycombinator.com/item?id=23802208 (124 comments)], a video talk also by Kleppmann
- CRDTs go brrr, 5000x faster CRDT implementations [https://news.ycombinator.com/item?id=28017204 (151 comments)], by the creator of another CRDT in Rust library, Diamond Types [https://github.com/josephg/diamond-types]
So exciting! Strangely enough, a couple of hours before this release, we just managed to wrap our heads around Yjs after playing with it on and off for a few weeks!
For anyone not up to date with the world of CRDTs, Seph Gentle's two blog posts have become legendary:
* https://josephg.com/blog/crdts-are-the-future/
* https://josephg.com/blog/crdts-go-brrr/
these are also worth checking out:
* https://github.com/y-crdt/y-crdt (rust implementation started by the creator of Yjs, Kevin Jahns)
* https://github.com/y-crdt/ypy (python bindings for the rust implementation)
* https://github.com/josephg/diamond-types (Seph Gentle's rust implementation of YATA, the algorith behind Yjs)
-
You might not need a CRDT
I'm working on a CRDT to solve this problem too[1]. How do you plan on implementing collaborative text editing on top of your event-reordering system? Off the top of my head I can't think of a way to implement text on your proposed system which would be performant and simple.
-
Generalizing coroutines - The Rust Language Design Team
For example, this file implements a complex iterator via a struct and really complex next() method. This file was about 1/3rd the size before I manually rewrote it into a "continuation passing" style. I find it significantly harder to read and maintain in its current form.
-
WebAssembly 2.0 Working Draft
> In this case, the bottleneck at 9 million LoC is not CPU cycles but memory usage. That's where I am considering pushing down into WebAssembly
How often does this come up in practice? I can't think of many files I've opened which were 9 million lines long. And you say "LoC" (lines of code). Are you doing syntax highlighting on 9 million lines of source code in javascript? Thats impressive!
> I guess my point is why do you need balanced trees? Is this a CRDT specific thing? Can you implement CRDT with just an array of lines / gap buffer?
Of course! Its just going to be slower. I made a simple reference implementation of Yjs, Automerge and Sync9's list types in javascript here[1]. This code is not optimized, and it takes 30 seconds to process an editing trace that diamond types (in native rust) takes 0.01 seconds to process. We could speed that up - yjs does the same thing in 1 second. But I don't think javascript will ever run as fast as optimized rust code.
The b-tree in diamond types is used for merging. If you're merging 2 branches, we need to map insert locations from the incoming branch into positions in the target (merged) branch. As items are inserted, the mapping changes dynamically. The benchmark I've been using for this is how long it takes to replay (and re-merge) all the changes in the most edited file in the nodejs git repository. That file has just shy of 1M single character insert / delete operations. If you're curious, the causal graph of changes looks like this[2].
Currently it takes 250ms to re-merge the entire causal graph. This is much slower than I'd like, but we can cache the merged positions in about 4kb on disk or something so we only need to do it once. I also want to replace the b-tree with a skip list. I think that'll make the code faster and smaller.
A gap buffer in javascript might work ok... if you're keen, I'd love to see that benchmark. The code to port is here: [3]
> Undo support -> In which case, you only have to stack / remember the set of commands and not have to store the state on every change. I'm not sure if this overlaps with the data structure choice, other than implementation details.
Yeah, I basically never store a snapshot of the state. Not on every change. Not really at all. Everything involves sending around patches. But you can't just roll back the changes when you undo.
Eg: I type "aaa" at position 0 (the start of the document). You type "bbb" at the start of the document. The document is now "bbbaaa". I hit undo. What should happen? Surely, we delete the "aaa" - now at position 3.
Translating from position 0 to position 3 is essentially the same algorithm we need to run in order to merge.
> I was just looking into TypedArrays.
I tried optimizing a physics library a few years ago by putting everything in typedarrays and it was weirdly slower than using raw javascript arrays. I have no idea why - but maybe thats fixed now.
TypedArrays are useful, but they're no panacea. You could probably write a custom b-tree on top of a typedarray in javascript if you really want to - assuming your data also fits into typedarrays. But at that point you may as well just use wasm. It'll be way faster and more ergonomic.
[1] https://github.com/josephg/reference-crdts
[2] https://home.seph.codes/public/node_graph.svg
[3] https://github.com/josephg/diamond-types/tree/master/src/lis...
-
I was wrong. CRDTs are the future
Hi everyone! Author here. I'm happy to answer questions.
I wrote this a couple years ago. Since then I've been working on my own CRDT called Diamond Types[1], which uses a lot of these ideas to be bonkers fast. I've built several OT based collaborative editing systems, and diamond types is much faster than any of them - though rust and wasm might be the real MVPs here. I wrote a follow-up to this article last year when I got that working, talking about how some of the optimizations work. That article is here[2].
A fair bit has changed since I wrote that article. Yjs has started a rewrite in rust (called yrs[3]). And Automerge has apparently dramatically improved performance based on some of the ideas I talk about in this article. Oh, and diamond types has been rewritten from the ground up. Its now about 5x faster than it was last year, by completely changing the internal structure. But thats a story for another day.
Unfortunately I still only support collaborative text editing. Adding full JSON support comes soon, after I document some more of the tricks I'm doing. Its really fun work!
Why do I only support collaborative text editing? Because I care about performance, and text CRDT performance is hard because you have so many individual changes. (One for each keystroke!). Making text editing fast means everything is fast. But we've still got to do the work. To make that happen, my plan is to add full JSON editing support to diamond types using shelf[4]. Shelf is a super simple CRDT which fits in 100 lines of javascript.
[1] https://github.com/josephg/diamond-types/
[2] https://josephg.com/blog/crdts-go-brrr/
-
Conflict-Free Replicated Data Types (CRDT)
Yep. I've done something very similar on top of Diamond Types for a little personal wiki. This page[1] is synced between all users who have the page open. Its a remarkably small piece of code, outside of the CRDT library itself (which is in rust via wasm). The way it works is:
- On page load, the server sends the whole CRDT document to the browser, and the server streams changes from that point onwards.
- When a change happens in the browser, it makes that change locally then and sends anything the server doesn't know about upstream.
- Whenever the server finds out about a new change, it re-broadcasts that change to any subscribed browser streams.
I'm using the braid HTTP protocol for changes - but we could easily switch to a SSE or websocket solution. It doesn't really matter.
At the moment I'm just using flat files for storage, but there's nothing stopping you using a database instead, except that its a bit awkward to use efficient CRDT packing techniques in a database.
[1] https://wiki.seph.codes/hn
Code is here, if anyone is interested. The whole thing is a few hundred lines all up: https://github.com/josephg/diamond-types/tree/0cb5d7ecf49364...
-
Writing Redux Reducers in Rust
With each change we just send the missing operations. Https://wiki.seph.codes/reddit if you want to mess around and see it in action via wasm. The code which runs this wiki is here.
What are some alternatives?
crdt-woot - Implementation of collaborative editing algorithm CRDT WOOT.
crdt-benchmarks - A collection of CRDT benchmarks
automerge - A JSON-like data structure (a CRDT) that can be modified concurrently by different users, and merged again automatically.
dotted-logootsplit - A delta-state block-wise sequence CRDT
y-crdt - Rust port of Yjs
comic-shanns - a classy font
cow-list - Copy-On-Write iterable list
Selenite - An Experimental Rust Crate for Post-Quantum Code-Signing Certificates.
teletype-server - Server-side application that facilitates peer discovery for collaborative editing sessions in Teletype
lang-team - Home of the Rust language design team.