doris
kop
| | doris | kop |
|---|---|---|
| Mentions | 42 | 4 |
| Stars | 11,363 | 439 |
| Growth | 1.6% | - |
| Activity | 10.0 | 0.0 |
| Last commit | 5 days ago | 3 months ago |
| Language | Java | Java |
| License | Apache License 2.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
doris
- Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis
  As an open-source real-time data warehouse, Apache Doris provides semi-structured data processing capabilities, and the newly released version 2.1.0 makes a stride in this direction. Before V2.1, Apache Doris stored semi-structured data as JSON files. However, during query execution, the real-time parsing of JSON data led to high CPU and I/O consumption in addition to high query latency, especially when the dataset was huge and complicated. Moreover, the lack of a pre-defined schema left no basis for query optimization.
- Five Apache projects you probably didn't know about
  Apache Doris is a real-time data warehouse.
- Log Analysis: Elasticsearch VS Apache Doris
  Learn more about Apache Doris or find the Doris makers on Slack.
- Replacing Apache Hive, Elasticsearch, and PostgreSQL With Apache Doris
  As you can imagine, a long and complicated data pipeline is high-maintenance and detrimental to development efficiency. Moreover, they are not capable of ad-hoc queries. So as an upgrade to our data warehouse, we replaced most of these components with Apache Doris, a unified analytic database.
- Apache Doris 2.0 Beta Now Available: Faster, Stabler, and More Versatile
  GitHub source code: https://github.com/apache/doris/tree/branch-2.0
- A/B Testing was a handful
  The key to Architecture 3.0 is the combination of Flink and Doris, so this is how to connect them. Probably the most important code in building Architecture 3.0: flink-demo, stream-load-demo.
- Ask HN: Are there any notable Chinese FLOSS projects?
  https://github.com/apache/doris is a great example. Same for its cousin https://github.com/StarRocks/starrocks, which was an early fork of the Doris project.
  To be fair, these are the only examples I can think of, and I only learned of them because I'm standing up new data infra using StarRocks.
- Apache Doris 2.0.0 Alpha Released
- 30,000 QPS Per Node: How We Increased Database Query Concurrency by 20 Times
  We optimized Apache Doris to solve these problems. (Pull Request on GitHub)
- Beginner's Guide to Data Analytics: Diving into Our Data Management Platform
  So, in Storage Architecture 2.0, we introduced Apache Doris and Apache Spark. The whole data pipeline was a Y-shaped diagram.
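
The A/B testing excerpt above points to a Stream Load demo as the link between Flink and Doris. As a rough illustration only (not the code from that post), the sketch below builds the kind of HTTP request that Doris's Stream Load interface accepts, using Python's standard library. The host `fe.example.com`, database `demo_db`, table `events`, the `root` user with an empty password, and the helper name `build_stream_load_request` are all hypothetical; port 8030 is assumed to be the frontend's HTTP port.

```python
import base64
import json
import urllib.request
import uuid


def build_stream_load_request(fe_host, db, table, rows,
                              user="root", password="", http_port=8030):
    """Build (but do not send) an HTTP PUT request for a Doris Stream Load.

    All argument values used in this example are placeholders.
    """
    url = f"http://{fe_host}:{http_port}/api/{db}/{table}/_stream_load"
    # Send the rows as one JSON array; strip_outer_array asks Doris to
    # treat each array element as a separate row.
    body = json.dumps(rows).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Expect": "100-continue",
        # A unique label lets Doris reject an accidental replay of the same load.
        "label": f"demo-{uuid.uuid4()}",
        "format": "json",
        "strip_outer_array": "true",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Hypothetical usage: inspect the request without contacting any server.
req = build_stream_load_request(
    "fe.example.com", "demo_db", "events", [{"id": 1, "event": "click"}]
)
print(req.get_method(), req.full_url)
```

The request is only built here, not sent. In a typical deployment the frontend redirects a Stream Load to a backend node, so a real client has to follow that redirect and resend the body, which `urllib`'s default opener does not do for `PUT` requests.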
kop
- Kafka-on-Pulsar Got Archived
- Interview question
  I process hospitality data somewhat similarly, but use Pulsar and can individually acknowledge messages, have DLQs built in, and if needed can stream just like Kafka. And if I need Kafka compatibility, I can use something like StreamNative's KOP and get Kafka compatibility over my existing Pulsar queues.
- Improving Developer Productivity at Disney with Serverless and Open Source
  While Pulsar has its own protocol that handles streaming, queues, and a lot of the other features (distributed transactions, functions, etc.), it can also speak other messaging protocols via plug-ins. Looking around, the ones that appear to be actively developed are MQTT, Kafka (so your existing applications that use Kafka can also use Pulsar), AMQP, and JMS.
- How to import data from Apache Pulsar into Apache Doris quickly and seamlessly
  git clone https://github.com/streamnative/kop.git && cd kop
What are some alternatives?
starrocks - StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. InfoWorld’s 2023 BOSSIE Award for best open source software.
mop - MQTT on Pulsar implemented using Pulsar Protocol Handler
tools
debezium - Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
Trino - Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
eventmesh - EventMesh is a new generation serverless event middleware for building distributed event-driven applications.
Boost-Pretty-Printer - GDB Pretty Printers for Boost
pulsar-jms - DataStax Starlight for JMS, a JMS API for Apache Pulsar ®
esphome-yeelight-ceiling-light - ESPHome custom firmware for some Yeelight Ceiling Lights
Oryx 2 - Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
dorisw
kafdrop - Kafka Web UI