Top 17 Java Hadoop Projects
-
APIJSON
🏆 Real-time, coding-free, powerful, and secure ORM 🚀 that provides APIs and docs without any backend coding, and lets frontend (client) users customize the data and structure of the returned JSON
Project mention: Top 15 Open-Source Low-Code Projects with the Most GitHub Stars | dev.to | 2024-07-18
GitHub: https://github.com/Tencent/APIJSON | GitHub Stars: 16.9k | Most Recent Update on GitHub: 2 days ago | Open Source License: Apache 2.0 | Number of Active Contributors This Year: 6 | Acceptance of External PRs: Yes | Official Website: http://apijson.cn/ | Documentation: https://apijsondocs.readthedocs.io/en/latest/
-
Project mention: Using IRIS and Presto for high-performance and scalable SQL queries | dev.to | 2025-01-19
The rise of Big Data projects, real-time self-service analytics, online query services, and social networks, among others, has created scenarios that demand massive, high-performance data queries. In response to this challenge, MPP (massively parallel processing) database technology was created, and it quickly established itself. Among the open-source MPP options, Presto (https://prestodb.io/) is the best known. It originated at Facebook, where it was used for data analytics, and was later open-sourced. Teradata has since joined the Presto community and now offers support for it.
-
During my time with Tublian, I learned a valuable lesson about focus. Instead of jumping between different repositories, I concentrated on making meaningful contributions to just a few, including Apache and two others. This approach wasn't random - it came from the amazing mentorship I received from the Open Sauced community. Huge shoutout to @Bekah, @Chrissy, @ayu, and @Jeffrey for teaching me that consistency beats quantity any day!
-
Deeplearning4j
Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for Keras, TensorFlow, and ONNX/PyTorch, a modular and tiny C++ library for running math code, and a Java-based math library on top of the core C++ library. Also includes SameDiff: a PyTorch/TensorFlow-like library for running deep learn...
-
Project mention: Apache Doris: open-source data warehouse for real time data analytics | news.ycombinator.com | 2024-10-26
-
Project mention: Trino: A fast distributed SQL query engine for big data analytics | news.ycombinator.com | 2024-07-09
-
Alluxio (formerly Tachyon)
Alluxio, data orchestration for analytics and machine learning in the cloud
-
Project mention: Hive: An Open-Source Data Warehouse Built on Apache Hadoop | news.ycombinator.com | 2024-08-13
-
Apache Ignite: a free and open-source, horizontally scalable key-value cache store with a robust multi-model database that powers APIs for distributed computation. Ignite provides a security system that can authenticate users' credentials on the server. It can also be used for system workload acceleration, real-time data processing, and analytics, and it supports a graph-centric programming model.
-
Language: Java | GitHub: 2.9K+ stars | link
-
kylo
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
-
ozone
Scalable, reliable, distributed storage system optimized for data analytics and object store workloads.
Project mention: Apache Ozone: Scalable, redundant, distributed object store for Apache Hadoop | news.ycombinator.com | 2024-12-04
-
hadoopcryptoledger
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Java Hadoop discussion
Java Hadoop related posts
-
Commit to Growth: My 2024 Reflection
-
Where is Java Used in Industry?
-
Apache Ozone: Scalable, redundant, distributed object store for Apache Hadoop
-
Apache Doris: open-source data warehouse for real time data analytics
-
Evolution of Data Sharding Towards Automation and Flexibility
-
Hadoop Installation and Deployment Guide
-
Steps to industry-leading query speed: evolution of the Apache Doris execution engine
-
Index
What are some of the best open-source Hadoop projects in Java? This list will help you:
# | Project | Stars |
---|---|---|
1 | APIJSON | 17,382 |
2 | Presto | 16,158 |
3 | Apache Hadoop | 14,889 |
4 | Deeplearning4j | 13,745 |
5 | doris | 13,052 |
6 | Trino | 10,726 |
7 | Alluxio (formerly Tachyon) | 6,904 |
8 | Apache Hive | 5,613 |
9 | Apache Ignite | 4,855 |
10 | Apache Calcite | 4,694 |
11 | Apache Nutch | 2,961 |
12 | Apache Drill | 1,953 |
13 | kylo | 1,111 |
14 | ozone | 866 |
15 | venice | 501 |
16 | incubator-wayang | 214 |
17 | hadoopcryptoledger | 141 |