CRDTs make multiplayer text editing part of Zed's DNA

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • SurveyJS - Open-Source JSON Form Builder to Create Dynamic Forms Right in Your App
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • reference-crdts

    Simple, tiny spec-compliant reference implementations of Yjs and Automerge's list types.

    They both have lots of obscure edge cases. And in both cases, I can't write a correct algorithm without a fuzzer to save myself. (And I don't trust anyone else to, either).

    CRDTs aren't that complex on the surface - this messy file[1] implements 4 different list CRDT algorithms in a few hundred lines of code. And CRDTs are pretty simple for JSON structures, which is not at all true for OT algorithms.

    But text CRDTs in particular need a whole lot more thought around optimization because of how locations work. Locations in list/text CRDTs generally work by giving every item a globally-unique ID and referencing that ID with every change. So, inserts work by saying "Insert between ID xxx and yyy". But, if you have a text document which needs to store a GUID for every keystroke which has ever happened, disk and memory usage becomes crazy bad. And if you don't have a clever indexing strategy, looking up IDs will bottleneck your program.

    In diamond types (my CRDT), I solve that with a pancake stack of optimizations which all feed into each other. Ids are (agent, sequence number) pairs so I can run-length encode them. Internally those IDs get flattened into integers, and I use a special hand-written b-tree which internally run-length encodes all the items it stores. I've iterated on Yjs's file format to get the file size down to ~0.05 bytes of overhead per inserted character in some real world data sets, which is wild given each inserted character needs to store about 8 different fields. (Insert position, ID (agent + seq), parents, the inserted character, and so on).

    CRDTs aren't that complex. Making them small and fast is the challenge. Automerge has been around for years and still takes hundreds of megabytes of ram to process a simple editing trace for a 10 page text document. Even in their rust implementation.

    The current version of diamond types is many times faster than any OT algorithm I've written in the past thanks to all those optimizations. Now that I've been down this road, I could optimize an OT algorithm in the same way if necessary, but you don't really need to.

    Other kinds of CRDTs don't need these optimizations. (Eg the sort you'd use for a normal database). There's two reasons: 1. When editing text, you make an edit with each keystroke. So you have a lot of edits. 2. Merging list items correctly is orders of magnitude more complex than merging a database entry. If you just want CRDTs for editing a database of records, that'd probably only take a few dozen lines.

    [1] https://github.com/josephg/reference-crdts/blob/main/crdts.t...

  • liveblocks

    Liveblocks is a platform to ship collaborative features like comments, notifications, text editors in minutes instead of months.

    Very cool use case for CRDTs! I've seen a bunch of different use cases from other products like https://liveblocks.io/ and https://electric-sql.com/. It's interesting how CRDTs are now taking hold so much for all these collaborative syncing scenarios. Wonder what's driving the proliferation now given they've been around for awhile?

  • SurveyJS

    Open-Source JSON Form Builder to Create Dynamic Forms Right in Your App. With SurveyJS form UI libraries, you can build and style forms in a fully-integrated drag & drop form builder, render them in your JS app, and store form submission data in any backend, inc. PHP, ASP.NET Core, and Node.js.

  • xi-editor

    A modern editor with a backend written in Rust.

    Raph Levien posted a retrospective about using CRDT’s for collaborative editing in xi-editor here [1]. His conclusion is

    “I come to the conclusion that the CRDT is not pulling its (considerable) weight. When I think about a future evolution of xi-editor, I see a much brighter future with a simpler, largely synchronous model, that still of course has enough revision tracking to get good results with asynchronous peers like the language server.”

    [1]https://github.com/xi-editor/xi-editor/issues/1187#issuecomm...

  • peritext

    A CRDT for asynchronous rich-text collaboration, where authors can work independently and then merge their changes.

    To put it in a different perspective, plain text editing has well-solved CRDT patterns. But, semantic data-structures like rich-text or syntax trees is what's tricky and has unsolved challenges.

    Peritext[1] is the only one that came close to solving rich-text, but even that one left out important aspect of rich-text editing like handling list & table operations as "work to be done later".

    For people interested on why it's difficult to build CRDTs for richtext, here's a piece I wrote a year back: https://writer.zohopublic.com/writer/published/grcwy5c699d67...

    Related HN discussion: https://news.ycombinator.com/item?id=29433896

    [1] https://github.com/inkandswitch/peritext

  • yjs-demos

    A collection of demos for Yjs

  • yjs

    Shared data types for building collaborative software

    Yjs is being quite heavily used in the industry[1], and being researched for even more companies. There are also demos showing how to integrate it with an existing rich text editors[2]. If you have some ideas about the missing parts, you could also open topic on discuss.yjs.dev - the documentation page (https://docs.yjs.dev) has tons of useful links.

    Re. other purpose projects - Yjs/Yrs main target are sequential data structures (text, arrays), but it also has support for maps and xml-like elements. In general you can build most data structures with it. I agree that it would be nice to have some other applications in demos though.

    [1] https://docs.yjs.dev/yjs-in-the-wild

  • aper

    A Rust data structure library built on state machines.

    Our Aper (https://aper.dev) implements a number of similar concepts (state machine replication with optimistic local transitions + rollback). I 100% agree that it’s an easier model to reason about.

    Your approach with cursors is clever, that part I haven’t seen elsewhere.

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts