incubator-xtable vs doris
| | incubator-xtable | doris |
|---|---|---|
| Mentions | 5 | 42 |
| Stars | 695 | 11,452 |
| Growth | 10.8% | 2.3% |
| Activity | 9.3 | 10.0 |
| Last commit | 1 day ago | 5 days ago |
| Language | Java | Java |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
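As a rough illustration of the Growth figure, month-over-month growth can be computed as the change in stars divided by the previous month's count. This is a minimal sketch, not the site's actual formula: the function name is hypothetical, and the previous-month count of 627 is an assumed value chosen only so that the result lines up with the 10.8% shown for incubator-xtable.

```python
def month_over_month_growth(previous: int, current: int) -> float:
    """Return the growth from `previous` to `current` as a percentage."""
    if previous <= 0:
        raise ValueError("previous must be positive")
    return (current - previous) / previous * 100.0


# Assumption: 627 stars a month ago is a made-up figure for illustration;
# only the current count (695) appears in the table above.
growth = month_over_month_growth(627, 695)
print(f"{growth:.1f}%")  # prints "10.8%"
```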
incubator-xtable
- FLaNK AI-April 22, 2024
- FLaNK AI Weekly 25 March 2025
- FLaNK Stack Weekly for 20 Nov 2023
- OneTable is now live | Table format interoperability is not a dream anymore
  I'm excited to share that the project is now live and I wanted to thank the project's early contributors. You can learn more by visiting the repo: https://github.com/onetable-io/onetable
- Show HN: OneTable interop across Delta, Hudi, Iceberg
doris
- Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis
  As an open-source real-time data warehouse, Apache Doris provides semi-structured data processing capabilities, and the newly released version 2.1.0 makes a stride in this direction. Before V2.1, Apache Doris stored semi-structured data as JSON files. However, during query execution, the real-time parsing of JSON data leads to high CPU and I/O consumption in addition to high query latency, especially when the dataset is huge and complicated. Moreover, the lack of a pre-defined schema means there is no handle for query optimization.
- Five Apache projects you probably didn't know about
  Apache Doris is a real-time data warehouse.
- Log Analysis: Elasticsearch VS Apache Doris
  Learn more about Apache Doris or find the Doris makers on Slack.
- Replacing Apache Hive, Elasticsearch, and PostgreSQL With Apache Doris
  As you can imagine, a long and complicated data pipeline is high-maintenance and detrimental to development efficiency. Moreover, they are not capable of ad-hoc queries. So as an upgrade to our data warehouse, we replaced most of these components with Apache Doris, a unified analytic database.
- Apache Doris 2.0 Beta Now Available: Faster, Stabler, and More Versatile
  GitHub source code: https://github.com/apache/doris/tree/branch-2.0
- A/B Testing was a handful
  The key to Architecture 3.0 is the combination of Flink and Doris, so this is how to connect them. Probably the most important code in building architecture 3. flink-demo stream-load-demo
- Ask HN: Are there any notable Chinese FLOSS projects?
  https://github.com/apache/doris is a great example. Same for its cousin https://github.com/StarRocks/starrocks, which was an early fork of the Doris project.
  To be fair, these are the only examples I can think of, and I only learned of these as I'm standing up new data infra using StarRocks.
- Apache Doris 2.0.0 Alpha Released
- 30,000 QPS Per Node: How We Increased Database Query Concurrency by 20 Times
  We optimized Apache Doris to solve these problems. (Pull Request on GitHub)
- Beginner's Guide to Data Analytics: Diving into Our Data Management Platform
  So, in Storage Architecture 2.0, we introduced Apache Doris and Apache Spark. The whole data pipeline was a Y-shaped diagram.
What are some alternatives?
email2phonenumber - An OSINT tool to obtain a target's phone number just by having their email address
starrocks - StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. InfoWorld’s 2023 BOSSIE Award for best open source software.
tools
sql-cli-for-apache-flink-docker - SQL CLI for Apache Flink® via docker-compose
Trino - Official repository of Trino, the distributed SQL query engine for big data, former
connectors - This library allows Scala and Java-based projects (including Apache Flink, Apache Hive, Apache Beam, and PrestoDB) to read from and write to Delta Lake.
kop - Kafka-on-Pulsar - A protocol handler that brings native Kafka protocol to Apache Pulsar
Local-Data-LakeHouse - Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing.
Boost-Pretty-Printer - GDB Pretty Printers for Boost
matano - Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS
esphome-yeelight-ceiling-light - ESPHome custom firmware for some Yeelight Ceiling Lights