The Raft Consensus Algorithm

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • goraft

    A basic Raft implementation in Go.

  • I had a fun time recently implementing Raft leader election and log replication (i.e. I didn't get to snapshotting/checkpointing) recently. One of the most challenging projects I've tried to do.

    I collected all the resources I found useful while doing it here: https://github.com/eatonphil/goraft#references. This includes Diego Ongaro's thesis and his TLA+ spec.

    Some people say Figure 2 of the Raft paper has everything you need but I'm pretty sure that's just not true. It's a little bit more vague than looking at the TLA+ spec to me anyway.

  • maelstrom

    A workbench for writing toy implementations of distributed systems.

  • Maelstrom [1], a workbench for learning distributed systems from the creator of Jepsen, includes a simple (model-checked) implementation of Raft and an excellent tutorial on implementing it.

    Raft is a simple algorithm, but as others have noted, the original paper includes many correctness details often brushed over in toy implementations. Furthermore, the fallibility of real-world hardware (handling memory/disk corruption and grey failures), the requirements of real-world systems with tight latency SLAs, and a need for things like flexible quorum/dynamic cluster membership make implementing it for production a long and daunting task. The commit history of etcd and hashicorp/raft, likely the two most battle-tested open source implementations of raft that still surface correctness bugs on the regular tell you all you need to know.

    The tigerbeetle team talks in detail about the real-world aspects of distributed systems on imperfect hardware/non-abstracted system models, and why they chose viewstamp replication, which predates Paxos but looks more like Raft.

    [1]: https://github.com/jepsen-io/maelstrom/

    [2]: https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/DE...

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • tigerbeetle

    The distributed financial transactions database designed for mission critical safety and performance.

  • Maelstrom [1], a workbench for learning distributed systems from the creator of Jepsen, includes a simple (model-checked) implementation of Raft and an excellent tutorial on implementing it.

    Raft is a simple algorithm, but as others have noted, the original paper includes many correctness details often brushed over in toy implementations. Furthermore, the fallibility of real-world hardware (handling memory/disk corruption and grey failures), the requirements of real-world systems with tight latency SLAs, and a need for things like flexible quorum/dynamic cluster membership make implementing it for production a long and daunting task. The commit history of etcd and hashicorp/raft, likely the two most battle-tested open source implementations of raft that still surface correctness bugs on the regular tell you all you need to know.

    The tigerbeetle team talks in detail about the real-world aspects of distributed systems on imperfect hardware/non-abstracted system models, and why they chose viewstamp replication, which predates Paxos but looks more like Raft.

    [1]: https://github.com/jepsen-io/maelstrom/

    [2]: https://github.com/tigerbeetle/tigerbeetle/blob/main/docs/DE...

  • marmot

    A distributed SQLite replicator built on top of NATS

  • I've written a whole SQLite replication system that works on top of RAFT ( https://github.com/maxpert/marmot ). Best part is RAFT has a well understood and strong library ecosystem as well. I started of with libraries and when I noticed I am reimplementing distributed streams, I just took off the shelf implementation (https://docs.nats.io/nats-concepts/jetstream) and embedded it in system. I love the simplicity and reasoning that comes with RAFT. However I am playing with epaxos these days (https://www.cs.cmu.edu/~dga/papers/epaxos-sosp2013.pdf), because then I can truly decentralize the implementation for truly masterless implementation. Right now I've added sharding mechanism on various streams so that in high load cases masters can be distributed across nodes too.

  • fastapi-raft

    Python implementation of the Raft Distributed Consensus Algorithm with ASGI + Starlette + FastAPI

  • I had to implement Raft for a network programming course during my bachelors and I had the same experience regarding how gentle the paper was. Especially for people new to distributed algorithms, I can really recommend it.

    My implementation is probably not that great, but I put it online anyway if anyone is interested: https://github.com/skowalak/fastapi-raft/

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts