Java Bigdata

Open-source Java projects categorized as Bigdata

Top 9 Java Bigdata Projects

  1. shardingsphere

    Empowering Data Intelligence with Distributed SQL for Sharding, Scalability, and Security Across All Databases.

  2. hudi

    Upserts, Deletes And Incremental Processing on Big Data.

    Project mention: Top Open-Source Data Engineering Tools- Unravelling the Best in 2026 | dev.to | 2025-12-10

  3. Apache Avro

    Apache Avro is a data serialization system.

    Project mention: Better Than JSON | news.ycombinator.com | 2025-12-01

  4. odd-platform

    First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.

  5. celeborn

    Apache Celeborn is an elastic and high-performance service for shuffle and spilled data. (by apache)

    Project mention: Apache Celeborn: elastic high-performance service for shuffle and spilled data | news.ycombinator.com | 2026-01-16

  6. dataCompare

    Low-code platform for big data comparison and data profiling.

  7. big-data-pipeline-lambda-arch

    A hybrid Big Data pipeline architecture that combines a real-time streaming layer with a batch layer to process large datasets (Lambda Architecture).

  8. rapiddweller-benerator-ce

    BENERATOR is a leading software solution to generate, obfuscate, pseudonymize and migrate data for development, testing, and training purposes with a model-driven approach.

  9. dms

    Open-source, free, AI-powered intelligent data management system, compatible with multiple databases including MySQL, Oracle, PostgreSQL, Doris, etc. (by basedt)

    Project mention: BaseDMS - An open-source, intelligent, AI-powered data management system based on browser | dev.to | 2025-08-14

    BaseDMS is an open-source, free, and AI-powered intelligent data management system. It provides a web-based SQL editor for querying and managing database objects, and supports AI-assisted development. Currently, it is compatible with more than 10 data sources, including MySQL, Oracle, PostgreSQL, Apache Doris, and Apache Hive.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Bigdata projects in Java? This list will help you:

#  Project                         Stars
1  shardingsphere                 20,729
2  hudi                            6,166
3  Apache Avro                     3,271
4  odd-platform                    1,408
5  celeborn                        1,051
6  dataCompare                       279
7  big-data-pipeline-lambda-arch     190
8  rapiddweller-benerator-ce         158
9  dms                                44