Java iceberg

Open-source Java projects categorized as iceberg

Top 6 Java iceberg Projects

  • doris

    Apache Doris is an easy-to-use, high performance and unified analytics database.

  • Project mention: Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis | dev.to | 2024-03-27

    As an open-source real-time data warehouse, Apache Doris provides semi-structured data processing capabilities, and the newly released version 2.1.0 makes a stride in this direction. Before V2.1, Apache Doris stored semi-structured data as JSON. However, parsing JSON in real time during query execution leads to high CPU and I/O consumption as well as high query latency, especially when the dataset is large and complex. Moreover, without a pre-defined schema, the query optimizer has nothing to work with.

  • Trino

    Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

  • Project mention: Trino: Fast distributed SQL query engine for big data analytics | news.ycombinator.com | 2024-03-19
  • starrocks

    StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.

  • Project mention: A MySQL compatible database engine written in pure Go | news.ycombinator.com | 2024-04-09

    tidb has been around for a while, it is distributed, written in Go and Rust, and MySQL compatible. https://github.com/pingcap/tidb

    Somewhat relatedly, StarRocks is also MySQL compatible, written in Java and C++, but it's tackling OLAP use-cases. https://github.com/StarRocks/starrocks

  • iceberg

    Apache Iceberg

  • Project mention: Iceberg won the table format war: But not in the way you thought it might | /r/dataengineering | 2023-07-06
  • nessie

    Nessie: Transactional Catalog for Data Lakes with Git-like semantics

  • Project mention: A deep dive into the concept and world of Apache Iceberg Catalogs | dev.to | 2024-03-01

    Nessie is an innovative open-source catalog that extends beyond the traditional catalog capabilities in the Apache Iceberg ecosystem, introducing git-like features to data management. This catalog not only tracks table metadata but also allows users to capture commits at a holistic level, enabling advanced operations such as multi-table transactions, rollbacks, branching, and tagging. These features provide a new layer of flexibility and control over data changes, resembling version control systems in software development.

  • openhouse

    Open Control Plane for Tables in Data Lakehouse

  • Project mention: Linkedin OpenHouse: Control Plane for Tables in Data Lakehouses | news.ycombinator.com | 2024-03-11
NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source iceberg projects in Java? This list will help you:

#  Project    Stars
1  doris      11,314
2  Trino       9,552
3  starrocks   7,764
4  iceberg     5,508
5  nessie        831
6  openhouse     242
