Top 9 Java Bigdata Projects
-
shardingsphere
Empowering Data Intelligence with Distributed SQL for Sharding, Scalability, and Security Across All Databases.
-
Project mention: Top Open-Source Data Engineering Tools- Unravelling the Best in 2026 | dev.to | 2025-12-10
Apache Hudi
-
odd-platform
The first open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
-
celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data. (by apache)
Project mention: Apache Celeborn: elastic high-performance service for shuffle and spilled data | news.ycombinator.com | 2026-01-16
-
dataCompare
Big data comparison and data profiling platform: low-code data comparison and data profiling
-
big-data-pipeline-lambda-arch
A hybrid Big Data pipeline architecture that combines a real-time streaming layer with a batch layer to process large datasets (Lambda Architecture)
-
rapiddweller-benerator-ce
BENERATOR is a leading software solution to generate, obfuscate, pseudonymize and migrate data for development, testing, and training purposes with a model-driven approach.
-
dms
An open-source, free, AI-powered intelligent data management system that supports AI and is compatible with multiple databases, including MySQL, Oracle, PostgreSQL, Doris, etc. (by basedt)
Project mention: BaseDMS - An open-source, intelligent, AI-powered data management system based on browser | dev.to | 2025-08-14
BaseDMS is an open-source, free, AI-powered intelligent data management system. It provides a web-based SQL editor for querying and managing database objects, and supports AI-assisted development. Currently, it is compatible with more than 10 data sources, including MySQL, Oracle, PostgreSQL, Apache Doris, Apache Hive, and more.
Java Bigdata discussion
Java Bigdata related posts
-
From Postgres to Iceberg
-
Apache Hudi: an open data lakehouse platform
-
Show HN: Apache Amoro is a Lakehouse management system built on iceberg
-
For those of you with Lakehouse Architectures, how do you handle duplicate records?
-
AWS ACID data lakehouse
-
hadoopcryptoledger: NEW Data - star count:139.0
Index
What are some of the best open-source Bigdata projects in Java? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | shardingsphere | 20,729 |
| 2 | hudi | 6,166 |
| 3 | Apache Avro | 3,271 |
| 4 | odd-platform | 1,408 |
| 5 | celeborn | 1,051 |
| 6 | dataCompare | 279 |
| 7 | big-data-pipeline-lambda-arch | 190 |
| 8 | rapiddweller-benerator-ce | 158 |
| 9 | dms | 44 |