Top 13 Java Hive Projects
-
APIJSON
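🏆 Zero-code, full-featured, highly secure ORM library 🚀 Zero-code backend APIs and docs; the frontend (client) customizes the data and structure of the returned JSON. 🏆 A JSON Transmission Protocol and an ORM Library 🚀 provides APIs and Docs without writing any code.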
-
Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
-
linkis
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
-
helicalinsight
Helical Insight is the world's first open-source business intelligence framework, which helps you make sense of your data and make well-informed decisions.
-
waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
-
dataCompare
Big data comparison and data profiling platform: low-code data comparison and data profiling
-
hadoopcryptoledger
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
We have some of this functionality in Presto (https://github.com/prestodb/presto), but it takes a fair bit of work to implement it for all the different backends.
Project mention: Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis | dev.to | 2024-03-27
As an open-source real-time data warehouse, Apache Doris provides semi-structured data processing capabilities, and the newly-released version 2.1.0 makes a stride in this direction. Before V2.1, Apache Doris stores semi-structured data as JSON files. However, during query execution, the real-time parsing of JSON data leads to high CPU and I/O consumption in addition to high query latency, especially when the dataset is huge and complicated. Moreover, the lack of a pre-defined schema means there is no handle for query optimization.
Project mention: Trino: Fast distributed SQL query engine for big data analytics | news.ycombinator.com | 2024-03-19
Project mention: Git Query Language (GQL) Aggregation Functions, Groups, Alias | /r/ProgrammingLanguages | 2023-06-30
Also, are you familiar with Apache Drill? The idea is to put an SQL interpreter in front of any kind of database, just like you are doing for Git here.
Java Hive related posts
- Trino: Fast distributed SQL query engine for big data analytics
- Game analytic power: how we process more than 1 billion events per day
- Your Thoughts on OLAPs Clickhouse vs Apache Druid vs Starrocks in 2023/2024
- Log Analysis: Elasticsearch VS Apache Doris
- Ask HN: What are some SQL transpilers?
- Trino, an open query engine that runs at ludicrous speed
- Questions about Athena, Trino and Iceberg
Index
What are some of the best open-source Hive projects in Java? This list will help you:
# | Project | Stars |
---|---|---|
1 | APIJSON | 16,643 |
2 | Presto | 15,582 |
3 | doris | 11,314 |
4 | Trino | 9,552 |
5 | Apache Hive | 5,320 |
6 | linkis | 3,227 |
7 | Apache Drill | 1,891 |
8 | yauaa | 726 |
9 | helicalinsight | 282 |
10 | waggle-dance | 258 |
11 | dataCompare | 233 |
12 | hadoopcryptoledger | 141 |
13 | beekeeper | 44 |