Top 23 Hive Open-Source Projects
-
APIJSON
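🏆 A zero-code, full-featured, highly secure ORM library 🚀 Backend APIs and docs need no code, and the frontend (client) customizes the data and structure of the returned JSON. 🏆 A JSON Transmission Protocol and an ORM Library 🚀 provides APIs and Docs without writing any code.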
-
Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
-
linkis
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
-
kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
-
querybook
Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.
-
helicalinsight
Helical Insight is described as the world's first open-source Business Intelligence framework. It helps you make sense of your data and make well-informed decisions.
-
waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
-
dataCompare
Big data comparison and data profiling platform: low-code data comparison and data profiling.
Project mention: MQL – Client and Server to query your DB in natural language | news.ycombinator.com | 2024-04-07
I should have clarified. There's a large number of apps that are:
1. taking info strictly from SQL (e.g. information_schema, query history)
2. taking a user input / question
3. writing SQL to answer that question
An app like this is what I call "text-to-sql". Totally agree a better system would pull in additional documentation (which is what we're doing), but I'd no longer consider it "text-to-sql". In our case, we're not even directly writing SQL, but rather generating semantic layer queries (i.e. https://cube.dev/).
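The three-step "text-to-sql" pattern described in the comment above can be sketched roughly as follows. This is a minimal illustration, not any project's actual implementation: it uses Python's standard `sqlite3` module, reads the schema from SQLite's `sqlite_master` catalog (standing in for `information_schema`), and stops at building the prompt text. The function names `collect_schema` and `build_prompt` and the `events` table are hypothetical, and no model is called.

```python
import sqlite3


def collect_schema(conn: sqlite3.Connection) -> str:
    """Step 1: read table definitions from the database catalog.

    SQLite keeps them in sqlite_master; other engines would expose
    similar information through information_schema.
    """
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_prompt(schema: str, question: str) -> str:
    """Steps 2 and 3 (input side): combine the schema and the user's
    question into the text a SQL-writing model would receive."""
    return (
        "Given this schema:\n"
        f"{schema}\n\n"
        f"Write one SQL query that answers: {question}\n"
    )


# Hypothetical example database with a single table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, kind TEXT, ts TEXT)")

schema = collect_schema(conn)
prompt = build_prompt(schema, "How many events of each kind are there?")
print(prompt)
```

As the comment notes, a system that also pulls in documentation or targets a semantic layer would add context beyond what this schema-only prompt contains.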
We have some of this functionality in Presto (https://github.com/prestodb/presto), but it takes a fair bit of work to implement it for all the different backends.
Project mention: Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis | dev.to | 2024-03-27
As an open-source real-time data warehouse, Apache Doris provides semi-structured data processing capabilities, and the newly released version 2.1.0 makes a stride in this direction. Before V2.1, Apache Doris stored semi-structured data as JSON files. However, parsing JSON data in real time during query execution leads to high CPU and I/O consumption as well as high query latency, especially when the dataset is huge and complicated. Moreover, the lack of a pre-defined schema means there is no handle for query optimization.
Project mention: Trino: Fast distributed SQL query engine for big data analytics | news.ycombinator.com | 2024-03-19
Project mention: The Future of MySQL is PostgreSQL: an extension for the MySQL wire protocol | news.ycombinator.com | 2024-04-26
This is probably referring to "zero changes to your driver code" and not "zero changes to the SQL you send over this driver".
Translating between SQL dialects is notoriously hard, and attempts to translate [1] work in 95% of cases. But the last 5% would require 5x the amount of work. That's because "SQL dialect" also includes weird edge cases of type inference for things like COALESCE(5, FALSE) and emulation of system catalogs (pg_catalog, information_schema).
[1] https://github.com/tobymao/sqlglot
Project mention: Git Query Language (GQL) Aggregation Functions, Groups, Alias | /r/ProgrammingLanguages | 2023-06-30
Also, are you familiar with Apache Drill? The idea is to put a SQL interpreter in front of any kind of database, just like you are doing for Git here.
Project mention: Show HN: JupySQL – a SQL client for Jupyter (ipython-SQL successor) | news.ycombinator.com | 2023-12-06
Hey, HN community!
We're stoked to launch JupySQL today! JupySQL is an open-source library that brings a modern SQL experience to Jupyter. JupySQL is compatible with all major databases, such as Snowflake, Redshift, PostgreSQL, MySQL, MariaDB, DuckDB, SQL Server, Clickhouse, Trino, and more!
To get started, check out our tutorial: https://jupysql.ploomber.io/en/latest/quick-start.html
SQL is the de facto language for data analysis; however, analysis often requires a mix of SQL and Python. JupySQL bridges this gap, allowing users to execute SQL queries seamlessly in Jupyter and continue their analysis in Python. Add %%sql to the top of your cell and start writing SQL.
Here are some of JupySQL's main features:
- Syntax highlighting
Project mention: Show HN: Synmetrix – Open-Source Platform for Data and Metrics Management | news.ycombinator.com | 2024-02-28
Hive related posts
- Trino: Fast distributed SQL query engine for big data analytics
- Show HN: Synmetrix – Open-Source Platform for Data and Metrics Management
- Show HN: Synmetrix – Open Semantic Layer
- Game analytic power: how we process more than 1 billion events per day
- Your Thoughts on OLAPs Clickhouse vs Apache Druid vs Starrocks in 2023/2024
- Hexagonal Grids
- Log Analysis: Elasticsearch VS Apache Doris
Index
What are some of the best open-source Hive projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | cube.js | 17,135 |
2 | APIJSON | 16,643 |
3 | Presto | 15,591 |
4 | doris | 11,314 |
5 | Trino | 9,552 |
6 | sqlglot | 5,441 |
7 | Apache Hive | 5,326 |
8 | Hive | 3,874 |
9 | linkis | 3,227 |
10 | kyuubi | 1,928 |
11 | Apache Drill | 1,894 |
12 | querybook | 1,737 |
13 | PyHive | 1,665 |
14 | yauaa | 728 |
15 | WeDataSphere | 633 |
16 | jupysql | 598 |
17 | mlcraft | 467 |
18 | MovieLab | 385 |
19 | hive | 323 |
20 | helicalinsight | 282 |
21 | waggle-dance | 258 |
22 | dataCompare | 234 |
23 | bitalarm | 190 |