Top 13 Java Hive Projects

APIJSON

0 16,643 8.4 Java

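🏆 A zero-code, full-featured, highly secure ORM library 🚀 Backend APIs and docs require no code, and the front end (client) customizes the data and structure of the returned JSON. 🏆 A JSON Transmission Protocol and an ORM Library 🚀 provides APIs and Docs without writing any code.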
Presto

14 15,582 9.9 Java

The official home of the Presto distributed SQL query engine for big data

Project mention: Multi-Database Support in DuckDB | news.ycombinator.com | 2024-01-28

We have some of this functionality in Presto (https://github.com/prestodb/presto), but it takes a fair bit of work to implement it for all the different backends.

doris

42 11,314 10.0 Java

Apache Doris is an easy-to-use, high performance and unified analytics database.

Project mention: Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis | dev.to | 2024-03-27

As an open-source real-time data warehouse, Apache Doris provides semi-structured data processing capabilities, and the newly-released version 2.1.0 makes a stride in this direction. Before V2.1, Apache Doris stored semi-structured data as JSON files. However, during query execution, the real-time parsing of JSON data leads to high CPU and I/O consumption in addition to high query latency, especially when the dataset is huge and complicated. Moreover, the lack of a pre-defined schema means there is no handle for query optimization.

Trino

44 9,552 10.0 Java

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Project mention: Trino: Fast distributed SQL query engine for big data analytics | news.ycombinator.com | 2024-03-19

Apache Hive

14 5,320 9.6 Java

Apache Hive
linkis

2 3,227 9.5 Java

Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
Apache Drill

9 1,891 8.2 Java

Apache Drill is a distributed MPP query layer for self describing data (by apache)

Project mention: Git Query Language (GQL) Aggregation Functions, Groups, Alias | /r/ProgrammingLanguages | 2023-06-30

Also, are you familiar with Apache Drill? The idea is to put an SQL interpreter in front of any kind of database, just like you are doing for Git here.

yauaa

2 726 9.7 Java

Yet Another UserAgent Analyzer
helicalinsight

1 282 0.0 Java

Helical Insight software is the world’s first Open Source Business Intelligence framework, which helps you make sense of your data and make well-informed decisions.
waggle-dance

1 258 7.7 Java

Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
dataCompare

1 233 3.7 Java

Big data comparison and data profiling platform: low code, data comparison, and data profiling
hadoopcryptoledger

7 141 1.8 Java

Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
beekeeper

1 44 4.5 Java

Service for automatically managing and cleaning up unreferenced data (by ExpediaGroup)

Project mention: FLaNK Stack 26 February 2024 | dev.to | 2024-02-26

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Java Hive related posts

Trino: Fast distributed SQL query engine for big data analytics
1 project | news.ycombinator.com | 19 Mar 2024
Game analytic power: how we process more than 1 billion events per day
1 project | dev.to | 24 Nov 2023
Your Thoughts on OLAPs Clickhouse vs Apache Druid vs Starrocks in 2023/2024
2 projects | /r/dataengineering | 16 Nov 2023
Log Analysis: Elasticsearch VS Apache Doris
1 project | dev.to | 16 Oct 2023
Ask HN: What are some SQL transpilers?
2 projects | news.ycombinator.com | 14 Jul 2023
Trino, a open query engine that runs at ludicrous speed
1 project | news.ycombinator.com | 11 Jul 2023
Questions about Athena, Trino and Iceberg
2 projects | /r/dataengineering | 15 Jun 2023

Index

What are some of the best open-source Hive projects in Java? This list will help you:

#	Project	Stars
1	APIJSON	16,643
2	Presto	15,582
3	doris	11,314
4	Trino	9,552
5	Apache Hive	5,320
6	linkis	3,227
7	Apache Drill	1,891
8	yauaa	726
9	helicalinsight	282
10	waggle-dance	258
11	dataCompare	233
12	hadoopcryptoledger	141
13	beekeeper	44