peritext vs reference-crdts

peritext

A CRDT for asynchronous rich-text collaboration, where authors can work independently and then merge their changes. (by inkandswitch)


reference-crdts

Simple, tiny spec-compliant reference implementations of Yjs and Automerge's list types. (by josephg)



Project	peritext	reference-crdts
Mentions	20	5
Stars	615	110
Growth	2.6%	-
Activity	0.0	6.6
Latest Commit	over 1 year ago	5 months ago
Language	TypeScript	TypeScript
License	MIT License	-

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

peritext

Posts with mentions or reviews of peritext. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-09-03.

Cola: A text CRDT for real-time collaborative editing
2 projects | news.ycombinator.com | 3 Sep 2023

This doesn’t appear to support rich text formatting ranges like bold, italic, etc - unless I’m missing something in the API. AFAIK Peritext is still the state of the art in rich text CRDT algorithms https://www.inkandswitch.com/peritext/
I’d love to see this build the rich text stuff from the Peritext algorithm.

The Cloud Is a Prison. Can the Local-First Software Movement Set Us Free?
1 project | news.ycombinator.com | 3 Aug 2023

The work Ink & Switch (unaffiliated) do has been an inspiration to me with regard to local-first and decentralized software: https://www.inkandswitch.com
They have a quasi-manifesto on local-first (https://www.inkandswitch.com/local-first/) and have published the best rich text CRDT around, Peritext: https://www.inkandswitch.com/peritext/
Lots of interesting work happening in this space.

Figma Is a File Editor
3 projects | news.ycombinator.com | 13 Jul 2023

Take a look at https://automerge.org/ and the stack those folks are building. You're exactly right that it's a difficult balance (specifically the trick is proving commutativity for the domain-specific data of your application). But automerge (and then https://github.com/inkandswitch/peritext) show it's at least possible. Good stuff.

Ask HN: What is new in Algorithms / Data Structures these days?
15 projects | news.ycombinator.com | 10 May 2023

Yes - The BFT problem only matters when you have Byzantine actors. But I think users deserve and expect the system to be reasonably well behaved and predictable in all situations. Anything publicly writable, for example, needs BFT resilience. Or any video game.
As for the prosemirror problem, I assume you’re talking about weird merges from users putting markdown in a text crdt? You’re totally right - this is a problem. Text CRDTs treat documents as a simple sequence of characters. And that confuses a lot of structured formats. For example, if two users concurrently bold the same word, the system should see that users agree that it should be bolded. But if that “bold” intent is translated into “insert double asterisks here and here”, you end up with 4 asterisks before and after the text, and that confuses markdown parsers. The problem is that a text crdt doesn’t understand markdown.
JSON editing has similar problems. I’ve heard of plenty of people over the years putting json text into a text crdt, only to find that when concurrent edits happen, the json grows parse errors. Eg if two users concurrently insert “a” and “b” into an empty list. The result is [“a””b”] which can’t be parsed.
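The JSON case above can be reproduced in a few lines. This is only a sketch: `applyInserts` is a hypothetical stand-in for a plain-text merge that keeps every concurrent insert, not the API of any particular CRDT library, and the order it picks for the two strings is arbitrary.

```typescript
// Hypothetical model of a plain-text merge: every concurrent insert survives.
type TextInsert = { pos: number; text: string };

function applyInserts(base: string, inserts: TextInsert[]): string {
  // Apply from the highest position down so lower positions stay valid.
  const sorted = inserts.slice().sort((x, y) => y.pos - x.pos);
  let out = base;
  for (const ins of sorted) {
    out = out.slice(0, ins.pos) + ins.text + out.slice(ins.pos);
  }
  return out;
}

function parses(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// Two users concurrently insert an element into the empty list "[]".
const mergedText = applyInserts("[]", [
  { pos: 1, text: '"a"' },
  { pos: 1, text: '"b"' },
]);
const textMergeIsValidJson = parses(mergedText); // false: no comma between the items

// A structure-aware merge works on list elements instead of characters.
const mergedList = JSON.stringify(["a"].concat(["b"]));
const listMergeIsValidJson = parses(mergedList); // true
```

The character-level merge has no way to know a comma is needed, while the list-level merge cannot produce a parse error in the first place.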
The answer to both of these problems is to use CRDTs which understand the shape of your data structure. Eg, use a json OT/crdt system for json data (like sharedb or automerge). Likewise, if the user is editing rich text in prosemirror then you want a rich text crdt like peritext. Rich text CRDTs add the concept of annotations - so if two users bold overlapping regions of text, the crdt understands that the result should be that the entire region is bolded. And that can be translated back to markdown if you want.
The ink & switch people did a great write up of how this sort of crdt works here: https://www.inkandswitch.com/peritext/
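To make the bold example concrete, here is a rough sketch contrasting the two approaches. The names `Span`, `applyInserts` and `mergeBoldSpans` are made up for illustration and this is not how Peritext is implemented; a real rich-text CRDT would anchor marks to stable character identifiers rather than raw offsets.

```typescript
// Hypothetical types for illustration only.
type Span = { start: number; end: number }; // [start, end) character offsets
type TextInsert = { pos: number; text: string };

// Plain-text merge model: every concurrent insert is kept.
function applyInserts(base: string, inserts: TextInsert[]): string {
  const sorted = inserts.slice().sort((x, y) => y.pos - x.pos);
  let out = base;
  for (const ins of sorted) {
    out = out.slice(0, ins.pos) + ins.text + out.slice(ins.pos);
  }
  return out;
}

// Annotation model: bold is a set of spans, and merging takes their union.
function mergeBoldSpans(spans: Span[]): Span[] {
  const sorted = spans.slice().sort((x, y) => x.start - y.start);
  const merged: Span[] = [];
  for (const s of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && s.start <= last.end) {
      last.end = Math.max(last.end, s.end); // overlapping or touching: extend
    } else {
      merged.push({ start: s.start, end: s.end });
    }
  }
  return merged;
}

const baseText = "hello world";

// Both users bold "world" by inserting "**" at offsets 6 and 11.
const asMarkdown = applyInserts(baseText, [
  { pos: 6, text: "**" }, { pos: 11, text: "**" }, // user 1
  { pos: 6, text: "**" }, { pos: 11, text: "**" }, // user 2
]);
// asMarkdown is "hello ****world****": four asterisks on each side.

// With annotations, the same intent merges cleanly.
const sameWord = mergeBoldSpans([{ start: 6, end: 11 }, { start: 6, end: 11 }]);
const overlapping = mergeBoldSpans([{ start: 0, end: 8 }, { start: 6, end: 11 }]);
// sameWord is one span covering "world"; overlapping is one span covering 0..11.
```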
Edge cases in collaborative rich text editing (2021)
1 project | news.ycombinator.com | 21 Apr 2023

You might not need a CRDT
9 projects | news.ycombinator.com | 5 Dec 2022

> I'm looking out for practical CRDT ideas that works well with richtext.
Have you seen Peritext from Ink & Switch? https://www.inkandswitch.com/peritext/ It's relatively new, but is a CRDT aimed at rich text!

CRDTs make multiplayer text editing part of Zed's DNA
10 projects | news.ycombinator.com | 1 Dec 2022

To put it in a different perspective, plain text editing has well-solved CRDT patterns. But semantic data structures like rich text or syntax trees are what's tricky and still have unsolved challenges.
Peritext[1] is the only one that came close to solving rich text, but even that one left out important aspects of rich-text editing, like handling list & table operations, as "work to be done later".
For people interested on why it's difficult to build CRDTs for richtext, here's a piece I wrote a year back: https://writer.zohopublic.com/writer/published/grcwy5c699d67...
Related HN discussion: https://news.ycombinator.com/item?id=29433896
[1] https://github.com/inkandswitch/peritext

Peritext – A CRDT for Rich-Text Collaboration
1 project | news.ycombinator.com | 27 Nov 2022

Evan Wallace CRDT Algorithms
5 projects | news.ycombinator.com | 27 Nov 2022

Anyone unsure of what a CRDT is, this is the perfect intro: https://www.inkandswitch.com/peritext/
The two most widely used CRDT implementations (combining JSON like general purpose types and rich text editing types) are:
- Automerge https://github.com/automerge/automerge
- Yjs https://github.com/yjs/yjs

Is Svelte capable of a Google Docs & Sheets clone?
3 projects | /r/sveltejs | 21 Nov 2022

Svelte is, but that is your smallest problem. You want to look into CRDTs (conflict-free replicated data types) to offer true (offline) collaboration. A popular JS library to solve this complex problem is called automerge. A rather recent development in that area specifically for text-based content is Peritext. Also check out this interactive tutorial about CRDTs.

reference-crdts

Posts with mentions or reviews of reference-crdts. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-12-01.

CRDTs make multiplayer text editing part of Zed's DNA
10 projects | news.ycombinator.com | 1 Dec 2022

> The goog version seems to work well but I have had nothing but frustration with ms word. Bad merges and weird states are typical, particularly from the fat client.
Argh not getting this stuff right is really frustrating. I've been working on collaborative editing for over a decade now, and I still can't implement any of these algorithms correctly without the help of fuzz testing. But fuzz testing done right finds all of these problems! There's no excuse!
Fuzzers work so well here because all of these algorithms have a clear correctness criterion: After syncing, state should always converge to the same result. So it's pretty easy to write code which does this in a loop:
1. Generates some random changes on some fake "peers"
2. Picks 2 peers at random and syncs their changes, using your new fancy synchronization algorithm
3. Assert that the state has converged between the peers
I've been working on this stuff for over a decade. I've implemented dozens of these algorithms. And every single time I write a fuzzy boi to check my work I find convergence bugs. Playing whack-a-mole with a fuzzer is a rite of passage for implementing systems like this.
When your fuzzer runs all night, you should never have lingering convergence bugs like you're describing with Word.
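The three-step loop described above might look roughly like the following. This is a minimal sketch, not the fuzzer from the linked repository: the CRDT under test is a toy last-writer-wins map invented for this example, and `PeerState`, `LwwEntry`, `mergeInto` and `fuzz` are all hypothetical names. A seeded PRNG is used so a failing run can be replayed.

```typescript
// Toy CRDT under test: a last-writer-wins map (an assumption for this sketch).
type LwwEntry = { value: number; clock: number; peer: number };
type PeerState = Record<string, LwwEntry>;

// Small deterministic PRNG (mulberry32) so failures are reproducible by seed.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Higher clock wins; peer id breaks ties so every peer picks the same winner.
function wins(x: LwwEntry, y: LwwEntry): boolean {
  return x.clock > y.clock || (x.clock === y.clock && x.peer > y.peer);
}

function mergeInto(target: PeerState, source: PeerState): void {
  for (const key of Object.keys(source)) {
    const incoming = source[key];
    const existing = target[key];
    if (existing === undefined || wins(incoming, existing)) {
      target[key] = { value: incoming.value, clock: incoming.clock, peer: incoming.peer };
    }
  }
}

// Key-order-independent serialization, used to compare two peers.
function canonical(state: PeerState): string {
  return JSON.stringify(Object.keys(state).sort().map((k) => [k, state[k]]));
}

function fuzz(seed: number, steps: number): void {
  const rand = mulberry32(seed);
  const randInt = (n: number): number => Math.floor(rand() * n);
  const peers: PeerState[] = [{}, {}, {}];
  const keys = ["x", "y", "z"];
  for (let i = 0; i < steps; i++) {
    // 1. Generate a random change on a random peer.
    const p = randInt(peers.length);
    const key = keys[randInt(keys.length)];
    const prev = peers[p][key];
    peers[p][key] = { value: randInt(100), clock: prev ? prev.clock + 1 : 1, peer: p };
    // 2. Pick two peers at random and sync them in both directions.
    const a = randInt(peers.length);
    const b = randInt(peers.length);
    mergeInto(peers[a], peers[b]);
    mergeInto(peers[b], peers[a]);
    // 3. Assert that the two peers have converged.
    if (canonical(peers[a]) !== canonical(peers[b])) {
      throw new Error(`diverged at seed ${seed}, step ${i}`);
    }
  }
}
```

Calling `fuzz(seed, 1000)` over a range of seeds exercises many interleavings; swapping `mergeInto` for the algorithm under test is the only change needed to reuse the loop.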
As an example, here's a simple fuzzer for a reference list CRDT implementation: https://github.com/josephg/reference-crdts/blob/9f4f9c3a97b4...
The code is so small it almost fits on my laptop screen.

WebAssembly 2.0 Working Draft
21 projects | news.ycombinator.com | 19 Apr 2022

> In this case, the bottleneck at 9 million LoC is not CPU cycles but memory usage. That's where I am considering pushing down into WebAssembly
How often does this come up in practice? I can't think of many files I've opened which were 9 million lines long. And you say "LoC" (lines of code). Are you doing syntax highlighting on 9 million lines of source code in javascript? That's impressive!
> I guess my point is why do you need balanced trees? Is this a CRDT specific thing? Can you implement CRDT with just an array of lines / gap buffer?
Of course! It's just going to be slower. I made a simple reference implementation of Yjs, Automerge and Sync9's list types in javascript here[1]. This code is not optimized, and it takes 30 seconds to process an editing trace that diamond types (in native rust) takes 0.01 seconds to process. We could speed that up - yjs does the same thing in 1 second. But I don't think javascript will ever run as fast as optimized rust code.
The b-tree in diamond types is used for merging. If you're merging 2 branches, we need to map insert locations from the incoming branch into positions in the target (merged) branch. As items are inserted, the mapping changes dynamically. The benchmark I've been using for this is how long it takes to replay (and re-merge) all the changes in the most edited file in the nodejs git repository. That file has just shy of 1M single character insert / delete operations. If you're curious, the causal graph of changes looks like this[2].
Currently it takes 250ms to re-merge the entire causal graph. This is much slower than I'd like, but we can cache the merged positions in about 4kb on disk or something so we only need to do it once. I also want to replace the b-tree with a skip list. I think that'll make the code faster and smaller.
A gap buffer in javascript might work ok... if you're keen, I'd love to see that benchmark. The code to port is here: [3]
> Undo support -> In which case, you only have to stack / remember the set of commands and not have to store the state on every change. I'm not sure if this overlaps with the data structure choice, other than implementation details.
Yeah, I basically never store a snapshot of the state. Not on every change. Not really at all. Everything involves sending around patches. But you can't just roll back the changes when you undo.
Eg: I type "aaa" at position 0 (the start of the document). You type "bbb" at the start of the document. The document is now "bbbaaa". I hit undo. What should happen? Surely, we delete the "aaa" - now at position 3.
Translating from position 0 to position 3 is essentially the same algorithm we need to run in order to merge.
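The "aaa" / "bbb" undo example can be sketched as a position transform. This is a simplified illustration that only handles inserts; `InsertOp`, `transformPos`, `insertAt` and `removeAt` are hypothetical helpers and not code from diamond-types.

```typescript
type InsertOp = { pos: number; text: string };

// Shift a recorded position to account for an insert applied after it was recorded.
// An insert at or before the position pushes it right by the inserted length.
function transformPos(pos: number, later: InsertOp): number {
  return later.pos <= pos ? pos + later.text.length : pos;
}

function insertAt(text: string, op: InsertOp): string {
  return text.slice(0, op.pos) + op.text + text.slice(op.pos);
}

function removeAt(text: string, pos: number, length: number): string {
  return text.slice(0, pos) + text.slice(pos + length);
}

const mine: InsertOp = { pos: 0, text: "aaa" };  // I type "aaa" at the start
const yours: InsertOp = { pos: 0, text: "bbb" }; // you type "bbb" at the start

const afterBoth = insertAt(insertAt("", mine), yours); // "bbbaaa"

// Undoing my insert means deleting "aaa", which now lives at position 3.
const undoPos = transformPos(mine.pos, yours); // 3
const afterUndo = removeAt(afterBoth, undoPos, mine.text.length); // "bbb"
```

Rolling back naively at the original position 0 would delete "bbb" instead, which is why undo needs the same position mapping as a merge.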
> I was just looking into TypedArrays.
I tried optimizing a physics library a few years ago by putting everything in typedarrays and it was weirdly slower than using raw javascript arrays. I have no idea why - but maybe that's fixed now.
TypedArrays are useful, but they're no panacea. You could probably write a custom b-tree on top of a typedarray in javascript if you really want to - assuming your data also fits into typedarrays. But at that point you may as well just use wasm. It'll be way faster and more ergonomic.
[1] https://github.com/josephg/reference-crdts
[2] https://home.seph.codes/public/node_graph.svg
[3] https://github.com/josephg/diamond-types/tree/master/src/lis...

What are some alternatives?

When comparing peritext and reference-crdts you can also consider the following projects:

automerge - A JSON-like data structure (a CRDT) that can be modified concurrently by different users, and merged again automatically.

wai - A language binding generator for `wai` (a precursor to WebAssembly interface types)

y-crdt - Rust port of Yjs

multi-memory - Multiple per-module memories for Wasm

dokieli - dokieli is a clientside editor for decentralised article publishing, annotations and social interactions

diamond-types - The world's fastest CRDT. WIP.

threlte - 3D framework for Svelte

uwm-masters-thesis - My thesis for my Master's in Computer Science degree from the University of Wisconsin - Milwaukee.

automerge-rs - Rust implementation of automerge [Moved to: https://github.com/automerge/automerge]

wit-bindgen - A language binding generator for WebAssembly interface types

yjs - Shared data types for building collaborative software
