Java Bigdata

Open-source Java projects categorized as Bigdata

Top 8 Java Bigdata Projects

  • shardingsphere

    Distributed SQL transaction & query engine for data sharding, scaling, encryption, and more - on any database.

  • Project mention: Managing Data Residency - the demo | dev.to | 2023-05-25

    Contrary to what the documentation says, the full prefix is jdbc:shardingsphere:absolutepath. I've opened a PR to fix the documentation.

  • hudi

    Upserts, Deletes And Incremental Processing on Big Data.

  • Project mention: Getting Started with Flink SQL, Apache Iceberg and DynamoDB Catalog | dev.to | 2023-12-18

    Apache Iceberg is one of the three main lakehouse table formats; the other two are Apache Hudi and Delta Lake.

  • Apache Avro

    Apache Avro is a data serialization system.

  • Project mention: Open Table Formats Such as Apache Iceberg Are Inevitable for Analytical Data | news.ycombinator.com | 2024-01-18

    Apache AVRO [1] is one but it has been largely replaced by Parquet [2] which is a hybrid row/columnar format

    [1] https://avro.apache.org/

  • odd-platform

    First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.

  • Project mention: OpenDataDiscovery 0.15 with Data Deprecation and Metadata Stale | news.ycombinator.com | 2023-08-04
  • dataCompare

    Big data comparison and data profiling platform: low code, data comparison, and data profiling.

  • big-data-pipeline-lambda-arch

    A full big data pipeline (Lambda Architecture) with Spark, Kafka, HDFS and Cassandra.

  • hadoopcryptoledger

    Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive

  • rapiddweller-benerator-ce

    BENERATOR is a leading software solution to generate, obfuscate, pseudonymize and migrate data for development, testing, and training purposes with a model-driven approach.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Java Bigdata related posts

Index

What are some of the best open-source Bigdata projects in Java? This list will help you:

Rank  Project                        Stars
1     shardingsphere                 19,425
2     hudi                           5,053
3     Apache Avro                    2,764
4     odd-platform                   1,108
5     dataCompare                    234
6     big-data-pipeline-lambda-arch  161
7     hadoopcryptoledger             141
8     rapiddweller-benerator-ce      128
