Hive

Open-source projects categorized as Hive

Top 23 Hive Open-Source Projects

  • cube.js

    📊 Cube — The Semantic Layer for Building Data Applications

  • Project mention: MQL – Client and Server to query your DB in natural language | news.ycombinator.com | 2024-04-07

    I should have clarified. There's a large number of apps that are:

    1. taking info strictly from SQL (e.g. information_schema, query history)

    2. taking a user input / question

    3. writing SQL to answer that question

    An app like this is what I call "text-to-sql". Totally agree a better system would pull in additional documentation (which is what we're doing), but I'd no longer consider it "text-to-sql". In our case, we're not even directly writing SQL, but rather generating semantic layer queries (i.e. https://cube.dev/).
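
    The three steps listed in the comment above can be sketched as follows. This is only an illustration, not how any project on this list works: it uses SQLite's `sqlite_master` and `PRAGMA table_info` as a stand-in for `information_schema`, and `describe_schema`, `build_prompt`, and `generate_sql` are hypothetical names. The SQL-writing step is a stub that returns a fixed query where a real app would call a language model.

```python
import sqlite3


def describe_schema(conn):
    """Step 1: collect table and column names from the database itself."""
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info returns one row per column: (cid, name, type, ...)
        columns = conn.execute(f"PRAGMA table_info('{table}')").fetchall()
        cols = ", ".join(f"{c[1]} {c[2]}" for c in columns)
        lines.append(f"{table}({cols})")
    return "\n".join(lines)


def build_prompt(schema, question):
    """Step 2: combine the schema text with the user's question."""
    return f"Schema:\n{schema}\n\nQuestion: {question}\nSQL:"


def generate_sql(prompt):
    """Step 3 (hypothetical stub): a real app would call a language model here."""
    return "SELECT COUNT(*) FROM orders"


# Tiny in-memory database so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

schema = describe_schema(conn)
prompt = build_prompt(schema, "How many orders are there?")
sql = generate_sql(prompt)
result = conn.execute(sql).fetchone()[0]
print(prompt)
print(result)
```

    Everything the stub sees comes from the database catalog and the question, which is the limitation the comment points at: no extra documentation or semantic layer is involved.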

  • APIJSON

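    🏆 A zero-code, full-featured, highly secure ORM library 🚀 Backend APIs and docs require no code; the frontend (client) customizes the data and structure of the returned JSON. A JSON transmission protocol and an ORM library that provides APIs and docs without writing any code.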

  • Presto

    The official home of the Presto distributed SQL query engine for big data

  • Project mention: Multi-Database Support in DuckDB | news.ycombinator.com | 2024-01-28

    We have some of this functionality in Presto (https://github.com/prestodb/presto), but it takes a fair bit of work to implement it for all the different backends.

  • doris

    Apache Doris is an easy-to-use, high performance and unified analytics database.

  • Project mention: Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis | dev.to | 2024-03-27

    As an open-source real-time data warehouse, Apache Doris provides semi-structured data processing capabilities, and the newly released version 2.1.0 makes a stride in this direction. Before V2.1, Apache Doris stored semi-structured data as JSON files. However, during query execution, the real-time parsing of JSON data leads to high CPU and I/O consumption in addition to high query latency, especially when the dataset is huge and complicated. Moreover, the lack of a pre-defined schema gives the query optimizer nothing to work with.

  • Trino

    Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

  • Project mention: Trino: Fast distributed SQL query engine for big data analytics | news.ycombinator.com | 2024-03-19
  • sqlglot

    Python SQL Parser and Transpiler

  • Project mention: The Future of MySQL is PostgreSQL: an extension for the MySQL wire protocol | news.ycombinator.com | 2024-04-26

    This is probably referring to "zero changes to your driver code" and not "zero changes to the SQL you send over this driver".

    Translating between SQL dialects is notoriously hard, and attempts to translate [1] work in 95% of cases, but the last 5% would require 5x the amount of work. That's because "SQL dialect" also includes weird edge cases of type inference for things like COALESCE(5, FALSE) and emulation of system catalogs (pg_catalog, information_schema).

    [1] https://github.com/tobymao/sqlglot

  • Apache Hive

    Apache Hive

  • Hive

    Lightweight and blazing fast key-value database written in pure Dart. (by isar)

  • linkis

    Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.

  • kyuubi

    Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.

  • Apache Drill

    Apache Drill is a distributed MPP query layer for self describing data (by apache)

  • Project mention: Git Query Language (GQL) Aggregation Functions, Groups, Alias | /r/ProgrammingLanguages | 2023-06-30

    Also, are you familiar with Apache Drill? The idea is to put a SQL interpreter in front of any kind of database, just like you are doing for Git here.

  • querybook

    Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.

  • PyHive

    Python interface to Hive and Presto. 🐝

  • yauaa

    Yet Another UserAgent Analyzer

  • WeDataSphere

    WeDataSphere is a financial grade, one-stop big data platform suite.

  • jupysql

    Better SQL in Jupyter. 📊

  • Project mention: Show HN: JupySQL – a SQL client for Jupyter (ipython-SQL successor) | news.ycombinator.com | 2023-12-06

    Hey, HN community!

    We're stoked to launch JupySQL today! JupySQL is an open-source library that brings a modern SQL experience to Jupyter. JupySQL is compatible with all major databases, such as Snowflake, Redshift, PostgreSQL, MySQL, MariaDB, DuckDB, SQL Server, ClickHouse, Trino, and more!

    To get started, check out our tutorial: https://jupysql.ploomber.io/en/latest/quick-start.html

    SQL is the de facto language for data analysis; however, analysis often requires a mix of SQL and Python. JupySQL bridges this gap, allowing users to execute SQL queries seamlessly in Jupyter and continue their analysis in Python. Add %%sql to the top of your cell and start writing SQL.

    Here are some of JupySQL's main features:

    - Syntax highlighting

  • mlcraft

    Synmetrix – open source semantic layer / Boost your LLM precision

  • Project mention: Show HN: Synmetrix – Open-Source Platform for Data and Metrics Management | news.ycombinator.com | 2024-02-28
  • MovieLab

    An open source movie tracker and movie finder.

  • hive

    Fast. Scalable. Powerful. The Blockchain for Web3 (by openhive-network)

  • Project mention: Welcome to r/DBuzzWorld - READ This to Get Started! | /r/dbuzzworld | 2023-05-16
  • helicalinsight

    Helical Insight is the world's first open source Business Intelligence framework, which helps you make sense of your data and make well-informed decisions.

  • waggle-dance

    Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.

  • dataCompare

    Big data comparison and data profiling platform: low-code data comparison and data profiling.

  • bitalarm

    An app to keep track of different cryptocurrencies, written in dart + flutter

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Hive projects? This list will help you:

Rank Project Stars
1 cube.js 17,135
2 APIJSON 16,643
3 Presto 15,591
4 doris 11,314
5 Trino 9,552
6 sqlglot 5,441
7 Apache Hive 5,326
8 Hive 3,874
9 linkis 3,227
10 kyuubi 1,928
11 Apache Drill 1,894
12 querybook 1,737
13 PyHive 1,665
14 yauaa 728
15 WeDataSphere 633
16 jupysql 598
17 mlcraft 467
18 MovieLab 385
19 hive 323
20 helicalinsight 282
21 waggle-dance 258
22 dataCompare 234
23 bitalarm 190
