Metric | kop | doris
---|---|---
Mentions | 4 | 48
Stars | 439 | 13,362
Growth | - | 2.0%
Activity | 0.0 | 10.0
Last commit | about 1 year ago | 5 days ago
Language | Java | Java
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kop
- Kafka-on-Pulsar Got Archived
- Interview question
  I process hospitality data somewhat similarly, but use Pulsar and can individually acknowledge messages, have DLQs built in, and if needed can stream just like Kafka. And if I need Kafka compatibility, I can use something like StreamNative's KOP and get Kafka compatibility over my existing Pulsar queues.
- Improving Developer Productivity at Disney with Serverless and Open Source
  While Pulsar has its own protocol that handles streaming, queues, and a lot of the other features (distributed transactions, functions, etc.), it can also speak other messaging protocols via plug-ins. Looking around, the ones that appear to be actively developed are MQTT, Kafka (so your existing applications that use Kafka can also use Pulsar), AMQP, and JMS.
- How to import data from Apache Pulsar into Apache Doris quickly and seamlessly
  git clone https://github.com/streamnative/kop.git
  cd kop
doris
- Apache Doris: open-source data warehouse for real time data analytics
- Evolution of Data Sharding Towards Automation and Flexibility
  Like in many databases, Apache Doris shards data into partitions, and then a partition is further divided into buckets. Partitions are typically defined by time or other continuous values. This allows query engines to quickly locate the target data during queries by pruning irrelevant data ranges.
- Steps to industry-leading query speed: evolution of the Apache Doris execution engine
  What makes a modern database system? The three key modules are the query optimizer, the execution engine, and the storage engine. Among them, the execution engine is to the DBMS what the chef is to a restaurant. This article focuses on the execution engine of the Apache Doris data warehouse, explaining the secret to its high performance.
- Apache Doris for log and time series data analysis in NetEase, why not Elasticsearch and InfluxDB?
  For most people looking for a log management and analytics solution, Elasticsearch is the go-to choice. The same applies to InfluxDB for time series data analysis. These were exactly the choices of NetEase, one of the world's highest-yielding game companies, whose business extends well beyond games. As NetEase expands its business horizons, the log and time series data it receives has exploded, bringing problems such as surging storage costs and declining stability. As NetEase's pick among all big data components for platform upgrades, Apache Doris fits into both scenarios and brings much faster query performance.
- Multi-tenant workload isolation in Apache Doris: a better balance between isolation and utilization
  This is an in-depth introduction to the workload isolation capabilities of Apache Doris. But first of all, why and when do you need workload isolation? If you relate to any of the following situations, read on and you will end up with a solution:
- SQL Convertor for Easy Migration from Presto, Trino, ClickHouse, and Hive to Apache Doris
  Apache Doris is an all-in-one data platform that is capable of real-time reporting, ad-hoc queries, data lakehousing, log management and analysis, and batch data processing. As more and more companies have been replacing their component-heavy data architecture with Apache Doris, there is an increasing need for a more convenient data migration solution. That's why the Doris SQL Convertor was built.
- Variant in Apache Doris 2.1.0: a new data type 8 times faster than JSON for semi-structured data analysis
  As an open-source real-time data warehouse, Apache Doris provides semi-structured data processing capabilities, and the newly released version 2.1.0 makes a stride in this direction. Before V2.1, Apache Doris stored semi-structured data as JSON files. However, during query execution, the real-time parsing of JSON data leads to high CPU and I/O consumption in addition to high query latency, especially when the dataset is huge and complicated. Moreover, the lack of a pre-defined schema means there is no handle for query optimization.
- Five Apache projects you probably didn't know about
  Apache Doris is a real-time data warehouse.
- Log Analysis: Elasticsearch VS Apache Doris
  Learn more about Apache Doris or find the Doris makers on Slack.
- Replacing Apache Hive, Elasticsearch, and PostgreSQL With Apache Doris
  As you can imagine, a long and complicated data pipeline is high-maintenance and detrimental to development efficiency. Moreover, such a pipeline is not capable of ad-hoc queries. So as an upgrade to our data warehouse, we replaced most of these components with Apache Doris, a unified analytic database.
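
The partition-and-bucket sharding described in the "Evolution of Data Sharding" excerpt above can be illustrated with a small, self-contained sketch. This is not Doris code and does not use any Doris API: the monthly partition boundaries, the bucket count, the CRC32 hash, and all function names are assumptions chosen only to show how range partitions let a query skip irrelevant data while a hash spreads rows across buckets inside a partition.

```python
from bisect import bisect_right
from datetime import date
from zlib import crc32

# Hypothetical layout: three monthly range partitions, each with 4 hash buckets.
PARTITION_STARTS = [date(2024, 1, 1), date(2024, 2, 1), date(2024, 3, 1)]
PARTITION_END = date(2024, 4, 1)  # exclusive upper bound of the last partition
BUCKETS_PER_PARTITION = 4


def partition_index(day):
    """Return the index of the range partition that holds `day`."""
    if not PARTITION_STARTS[0] <= day < PARTITION_END:
        raise ValueError(f"{day} falls outside every partition")
    # bisect_right counts the partition starts that are <= day.
    return bisect_right(PARTITION_STARTS, day) - 1


def bucket_index(key):
    """Hash a bucketing key (for example a user id) to a bucket number."""
    return crc32(key.encode("utf-8")) % BUCKETS_PER_PARTITION


def prune_partitions(start, end):
    """Return the partitions overlapping the closed date range [start, end]."""
    bounds = PARTITION_STARTS + [PARTITION_END]
    return [
        i
        for i in range(len(PARTITION_STARTS))
        if bounds[i] <= end and start < bounds[i + 1]
    ]


if __name__ == "__main__":
    # A query filtered to mid-February only needs to read one partition.
    print(prune_partitions(date(2024, 2, 10), date(2024, 2, 20)))
    # A row is routed first by date (partition), then by key hash (bucket).
    print(partition_index(date(2024, 2, 15)), bucket_index("user-42"))
```

In this sketch a date filter narrows the scan to the overlapping partitions before any rows are read, which is the pruning step the excerpt refers to. The bucket hash only decides where a row lives within a partition.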
What are some alternatives?
mop - MQTT on Pulsar implemented using Pulsar Protocol Handler
starrocks - The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
aop - AMQP on Pulsar protocol handler
Trino - Official repository of Trino, the distributed SQL query engine for big data, former
pulsar-jms - DataStax Starlight for JMS, a JMS API for Apache Pulsar ®
incubator-xtable - Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.