Java Distributed Systems

Open-source Java projects categorized as Distributed Systems

Top 23 Java Distributed System Projects

Distributed Systems
  1. advanced-java

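    😮 Core Interview Questions & Answers For Experienced Java (Backend) Developers | A complete primer on advanced topics for internet-industry Java engineers, covering high concurrency, distributed systems, high availability, microservices, massive data processing, and related areas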

  3. Dubbo

    The Java implementation of Apache Dubbo, an RPC and microservice framework.

    Project mention: Dirty code: trusted keeper of errors. Broken windows theory | dev.to | 2025-03-17

    Let's look at the example from Apache Dubbo:

  4. awesome-system-design-resources

    Learn System Design concepts and prepare for interviews using free resources.

    Project mention: 🔥 17 Best Free GitHub Repositories to Crack System Design Interviews 🛠️ | dev.to | 2024-12-06

    11. Awesome System Design Resources

  5. conductor

    Conductor is an event-driven orchestration platform that provides a durable and highly resilient execution engine for your applications.

    Project mention: Netflix has open-sourced its Maestro Workflow Orchestrator | news.ycombinator.com | 2024-07-22

    I'm a bit confused about what is going on here: This project appears to use Netflix/conductor [0]. But you go to that repo, you see it has been archived, with a message saying it is replaced by Netflix's internal non-OSS version, and by unmentioned community forks – by which I assume they mean Orkes Conductor [1]. But this isn't using Orkes Conductor, it looks like it is using the discontinued Netflix version `com.netflix.conductor:conductor-core:2.31.5` [2] – and an outdated version of it too.

    [0] https://github.com/Netflix/conductor

    [1] https://github.com/conductor-oss/conductor

    [2] https://github.com/Netflix/maestro/blob/e8bee3f1625d3f31d84d...

  6. Apache ZooKeeper

    Apache ZooKeeper

    Project mention: Mastering Apache Kafka: Powering Modern Data Pipelines | dev.to | 2025-01-16

    Zookeeper is a distributed coordination service used in older versions of Kafka to manage cluster metadata, leader election, and configuration. It ensures consistency and synchronization across Kafka brokers.

  7. Trino

    Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL

    Project mention: Twitter's 600-Tweet Daily Limit Crisis: Soaring GCP Costs and the Open Source Fix Elon Musk Ignored | dev.to | 2025-04-10

    Trino: Trino (formerly known as PrestoSQL) is a high-performance distributed SQL query engine designed for data analysis. It offers efficient querying capabilities across multiple data sources, including various file formats, databases, and data lakes. Some background on Trino and Presto: Presto was the original name of the project, and it was developed at Facebook. A significant portion of the Presto community forked the project as PrestoSQL in 2019, and in December 2020 the fork was renamed Trino. Read more here: Trino Blog.

  8. Hazelcast

    Hazelcast is a unified real-time data platform combining stream processing with a fast data store, allowing customers to act instantly on data-in-motion for real-time insights.

  10. bookkeeper

    Apache BookKeeper - a scalable, fault tolerant and low latency storage service optimized for append-only workloads

  11. dslabs

    Distributed Systems Labs and Framework

    Project mention: Smurf: Beyond the Test Pyramid | news.ycombinator.com | 2024-10-19

    You'd define invariants that must be met. This has been done before.

    https://en.wikipedia.org/wiki/Search-based_software_engineer...

    e.g. testing implementations of Paxos: https://github.com/emichael/dslabs

  12. py4j

    Py4J enables Python programs to dynamically access arbitrary Java objects

    Project mention: Py4J: Enables Python programs to dynamically access arbitrary Java objects | news.ycombinator.com | 2025-04-12
  13. ScaleCube

    Microservices library - scalecube-services is a high-throughput, low-latency reactive microservices library built to scale. It features API gateways, service discovery, and service load balancing, and its architecture supports plug-and-play service communication modules. It is built to provide high-performance, low-latency real-time stream processing.

  14. scalardb

    Universal HTAP Engine

  15. swim

    Full stack application platform for building stateful microservices, streaming APIs, and real-time UIs

  16. Sparkler

    Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

  17. kronotop

    Kronotop is a distributed and transactional document database backed by FoundationDB.

    Project mention: Kronotop: Redis-compatible, transactional document store backed by FoundationDB | news.ycombinator.com | 2025-01-20
  18. MicroRaft

    Feature-complete implementation of the Raft consensus algorithm in Java

  19. pegasus

    Pegasus Workflow Management System - Automate, recover, and debug scientific computations. (by pegasus-isi)

    Project mention: Pegasus Workflow Management System | news.ycombinator.com | 2024-06-05
  20. diztl

    Share, discover & download files in your network 💥

  21. nosqlbench

    The open-source, pluggable NoSQL benchmarking suite.

  22. memq

    MemQ is an efficient, scalable cloud native PubSub system

  23. kafka-delayed-queue

    Delayed Queue implementation over Kafka

    Project mention: Show HN: Delayed Queue Using Kafka | news.ycombinator.com | 2024-07-08
  24. Vector-Clock

    An implementation of Vector Clock in Java ⏰ (by varunu28)

  25. kafka-workflow

    Simple Workflow As Code on Kafka

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Java Distributed Systems related posts

  • Py4J: Enables Python programs to dynamically access arbitrary Java objects

    1 project | news.ycombinator.com | 12 Apr 2025
  • System Design Resources

    1 project | news.ycombinator.com | 18 Sep 2024
  • Dubbo: A Robust Java RPC and Microservice Framework

    1 project | news.ycombinator.com | 8 Aug 2024
  • Maestro: Netflix's Workflow Orchestrator

    9 projects | news.ycombinator.com | 22 Jul 2024
  • Conductor – open-source event driven orchestration platform

    1 project | news.ycombinator.com | 4 Jul 2024
  • RAG Explained | Using Retrieval-Augmented Generation to Build Semantic Search

    1 project | dev.to | 13 Jun 2024
  • Emerging Tech Trends 2024: The Latest Developments in AI, API, and Automation

    1 project | dev.to | 17 May 2024

Index

What are some of the best open-source Distributed System projects in Java? This list will help you:

# Project Stars
1 advanced-java 77,469
2 Dubbo 40,904
3 awesome-system-design-resources 22,308
4 conductor 20,409
5 Apache ZooKeeper 12,451
6 Trino 11,124
7 Hazelcast 6,298
8 bookkeeper 1,922
9 dslabs 1,384
10 py4j 1,228
11 ScaleCube 622
12 scalardb 510
13 swim 493
14 Sparkler 412
15 kronotop 248
16 MicroRaft 232
17 pegasus 189
18 diztl 177
19 nosqlbench 176
20 memq 136
21 kafka-delayed-queue 39
22 Vector-Clock 11
23 kafka-workflow 9

Did you know that Java is the 8th most popular programming language based on number of references?