Distributed Systems

Open-source projects categorized as Distributed Systems

Top 23 Distributed System Open-Source Projects

  1. advanced-java

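    😮 Core Interview Questions & Answers for Experienced Java (Backend) Developers | A complete primer on advanced topics for internet Java engineers, covering high concurrency, distributed systems, high availability, microservices, and massive data processing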

  3. awesome-scalability

    The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

    Project mention: The Patterns of Scalable, Reliable, and Performant Large-Scale Systems | news.ycombinator.com | 2024-12-19
  4. etcd

    Distributed reliable key-value store for the most critical data of a distributed system

    Project mention: Securing Kubernetes: Encrypting Data at Rest with kubeadm and containerd on Amazon Linux 2023 | dev.to | 2025-04-15

    curl -LO https://github.com/etcd-io/etcd/releases/download/v3.5.21/etcd-v3.5.21-linux-amd64.tar.gz
    tar xzf etcd-v3.5.21-linux-amd64.tar.gz

  5. Dubbo

    The Java implementation of Apache Dubbo. An RPC and microservice framework.

    Project mention: Dirty code: trusted keeper of errors. Broken windows theory | dev.to | 2025-03-17

    Let's look at the example from Apache Dubbo:

  6. system-design

    Learn how to design systems at scale and prepare for system design interviews

    Project mention: 🚀 Awesome Resources For Learning About System Design ⚡ | dev.to | 2024-11-08

    "System Design" by Karan Pratap Singh: How to design systems at scale and prepare for system design interviews.

  7. spacedrive

    Spacedrive is an open source cross-platform file explorer, powered by a virtual distributed filesystem written in Rust.

  8. xgboost

    Scalable, portable, and distributed gradient boosting (GBDT, GBRT, or GBM) library for Python, R, Java, Scala, C++, and more. Runs on a single machine, Hadoop, Spark, Dask, Flink, and DataFlow.

    Project mention: What AI/ML Models Should You Use and Why? | dev.to | 2024-10-29

    Boosting: Boosting is not a separate ML model but a technique that combines multiple weak learners into a single model that can generate highly accurate predictions. XGBoost is a common boosting model that supports distributed training, which speeds up training. According to research by Intel, XGBoost can be more effective than a neural-network-based approach for tabular data. In addition, XGBoost is faster to train and doesn't require as much data as neural networks do.

  10. nsq

    A realtime distributed messaging platform

    Project mention: RabbitMQ 4.0 Released | news.ycombinator.com | 2024-09-18

    https://nsq.io/ is also very reliable, stable, lightweight, and easy to use.

  11. seaweedfs

    SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lake, for billions of files! Blob store has O(1) disk seek, cloud tiering. Filer supports Cloud Drive, cross-DC active-active replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, Erasure Coding.

    Project mention: An Intro to DeepSeek's Distributed File System | news.ycombinator.com | 2025-04-17

    I’m interested in how it compares to seaweedfs [1], which we use for storing weather data (about 3 PB) for ML training.

    [1] https://github.com/seaweedfs/seaweedfs

  12. awesome-system-design-resources

    Learn System Design concepts and prepare for interviews using free resources.

    Project mention: 🔥 17 Best Free GitHub Repositories to Crack System Design Interviews 🛠️ | dev.to | 2024-12-06

    11. Awesome System Design Resources

  13. grpc-go

    The Go language implementation of gRPC. HTTP/2-based RPC.

    Project mention: xAI Grok API Beta | news.ycombinator.com | 2024-10-21

    There's no Remote Procedure Call built into the protocol. JsonRPC is also not RPC in itself.

    It's like GraphQL with resolvers.

    They have you imagine it's a procedure, but you can ignore that.

    Here's the golang gRPC Hello World where the equivalent of a resolver in GraphQL replies directly w/o need for a procedure by that name. https://github.com/grpc/grpc-go/blob/master/examples/hellowo...

  14. conductor

    Conductor is an event-driven orchestration platform providing a durable and highly resilient execution engine for your applications.

    Project mention: Netflix has open-sourced its Maestro Workflow Orchestrator | news.ycombinator.com | 2024-07-22

    I'm a bit confused about what is going on here: This project appears to use Netflix/conductor [0]. But you go to that repo, you see it has been archived, with a message saying it is replaced by Netflix's internal non-OSS version, and by unmentioned community forks – by which I assume they mean Orkes Conductor [1]. But this isn't using Orkes Conductor, it looks like it is using the discontinued Netflix version `com.netflix.conductor:conductor-core:2.31.5` [2] – and an outdated version of it too.

    [0] https://github.com/Netflix/conductor

    [1] https://github.com/conductor-oss/conductor

    [2] https://github.com/Netflix/maestro/blob/e8bee3f1625d3f31d84d...

  15. NATS

    High-Performance server for NATS.io, the cloud and edge native messaging system.

    Project mention: CNCF tells main NATS contributor Synadia that it's free to fork off | news.ycombinator.com | 2025-04-29

    [1] https://github.com/nats-io/nats-server/issues/6832#issuecomm...

  16. rqlite

    The lightweight, user-friendly, distributed relational database built on SQLite.

    Project mention: The definitive guide to using Django with SQLite in production 💡 | dev.to | 2025-01-18

    rqlite: The lightweight, user-friendly, distributed relational database built on SQLite

  17. system-design

    A resource to help busy software engineers become good at system design 👇 (by systemdesign42)

    Project mention: System Design Resources that are Not ByteByteGo | dev.to | 2024-06-03

    “System Design Newsletter” by Neo Kim

  18. Nomad

    Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.

    Project mention: IBM Completes Acquisition of HashiCorp | news.ycombinator.com | 2025-02-27

    20k+ nodes and 200k+ allocs. To be fair, Kubernetes cannot support a cluster this large.

    Most of my issues with it aren't related to the scale though. I wasn't involved in the operations of the cluster, I was just a user of Nomad trying to run a few thousand stateful allocs. Without custom resources and custom controllers, managing stateful services was a pain in the ass. Critical bugs would also often take years to get fixed. I had lots of fun getting paged in the middle of the night because 2 allocs would suddenly decide they now have the same index (https://github.com/hashicorp/nomad/issues/10727)

  19. temporal

    Temporal service

    Project mention: Launch HN: Stack Auth (YC S24) – An Open-Source Auth0/Clerk Alternative | news.ycombinator.com | 2024-08-08

    Just for clarification: so I can't really host this without open-sourcing my product (since your server is AGPL). Isn't it a stretch to call this really open-source? I compare this to something like Temporal, which I can self-host without worrying (and which I believe is MIT-licensed [https://github.com/temporalio/temporal/blob/main/LICENSE])

  20. Akka

    A platform to build and run apps that are elastic, agile, and resilient. SDK, libraries, and hosted environments.

  21. Apache ZooKeeper

    Apache ZooKeeper

    Project mention: Mastering Apache Kafka: Powering Modern Data Pipelines | dev.to | 2025-01-16

    ZooKeeper is a distributed coordination service used in older versions of Kafka to manage cluster metadata, leader election, and configuration. It ensures consistency and synchronization across Kafka brokers.

  22. juicefs

    JuiceFS is a distributed POSIX file system built on top of Redis and S3.

    Project mention: Development Environment Configuration | dev.to | 2025-01-19

    Object Storage: JuiceFS, Minio

  23. NebulaGraph Database

    A distributed, fast open-source graph database featuring horizontal scalability and high availability (by vesoft-inc)

  24. Trino

    Official repository of Trino, the distributed SQL query engine for big data.

    Project mention: Every Database Will Support Iceberg — Here's Why | dev.to | 2025-04-22

    Traditional databases — PostgreSQL, MySQL, etc. — store their data in proprietary formats. That format is optimized for that engine and can’t be directly accessed by anything else. Even if something like Trino can connect to Postgres, it’s still running queries through Postgres itself, not reading its storage directly. You’re just a client.

  25. awesome-distributed-systems

    A curated list to learn about distributed systems

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
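
The xgboost entry above describes boosting as combining multiple weak learners into a single, more accurate model. A minimal sketch of that idea follows, assuming squared-error loss and one-feature decision stumps as the weak learners. This is not XGBoost's actual algorithm or API; every function name here (fit_stump, boost, predict, mse) is hypothetical and chosen only for illustration.

```python
"""Toy gradient boosting with decision stumps (illustrative only, not XGBoost)."""


def fit_stump(xs, residuals):
    """Fit a one-split regression stump: returns (threshold, left_value, right_value)."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue  # a split must leave points on both sides
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    if best is None:
        # All inputs identical: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def stump_predict(stump, x):
    """Evaluate a single stump at x."""
    t, lv, rv = stump
    return lv if x <= t else rv


def boost(xs, ys, rounds=50, lr=0.3):
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean of the targets
    preds = [base] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, lr, stumps


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, lr, stumps = model
    return base + lr * sum(stump_predict(s, x) for s in stumps)


def mse(model, xs, ys):
    """Mean squared error of the model on (xs, ys)."""
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    xs = list(range(10))
    ys = [x * x for x in xs]
    for rounds in (0, 1, 50):
        model = boost(xs, ys, rounds=rounds)
        print(rounds, "rounds -> training MSE", mse(model, xs, ys))
```

Each stump on its own is a poor predictor; the ensemble improves because every new stump is fitted to whatever error the previous ones left behind. Production libraries add regularization, multi-feature trees, and distributed training on top of this basic loop.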

Distributed Systems related posts

  • Kronotop: Horizontally scalable, distributed, transactional document database

    2 projects | news.ycombinator.com | 30 Apr 2025
  • What If We Could Rebuild Kafka from Scratch?

    8 projects | news.ycombinator.com | 25 Apr 2025
  • Py4J: Enables Python programs to dynamically access arbitrary Java objects

    1 project | news.ycombinator.com | 12 Apr 2025
  • My Learnings About Etcd

    1 project | dev.to | 10 Apr 2025
  • Invoice Processing With Autokitteh

    2 projects | dev.to | 2 Apr 2025
  • Longhorn: Cloud native distributed block storage for Kubernetes

    2 projects | news.ycombinator.com | 29 Mar 2025
  • Building Stateful AI Research Agent with openai-agents and AutoKitteh

    1 project | dev.to | 26 Mar 2025

Index

What are some of the best open-source Distributed System projects? This list will help you:

#   Project                          Stars
1   advanced-java                    77,500
2   awesome-scalability              61,670
3   etcd                             49,173
4   Dubbo                            40,923
5   system-design                    35,296
6   spacedrive                       34,330
7   xgboost                          26,861
8   nsq                              25,233
9   seaweedfs                        24,210
10  awesome-system-design-resources  22,623
11  grpc-go                          21,776
12  conductor                        20,584
13  NATS                             16,968
14  rqlite                           16,521
15  system-design                    15,417
16  Nomad                            15,397
17  temporal                         13,760
18  Akka                             13,144
19  Apache ZooKeeper                 12,458
20  juicefs                          11,531
21  NebulaGraph Database             11,289
22  Trino                            11,216
23  awesome-distributed-systems      11,018
