kafka-delta-ingest
rust-rdkafka
Metric | kafka-delta-ingest | rust-rdkafka
---|---|---
Mentions | 6 | 8
Stars | 318 | 1,469
Growth | 4.1% | -
Activity | 7.5 | 8.4
Last commit | 7 days ago | 6 days ago
Language | Rust | Rust
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kafka-delta-ingest
-
Using rust for DE activities?
Rust can offer incredible cost savings when you use it in place of Spark to interact with your Delta Lake. One such project is kafka-delta-ingest, whose developers were able to reduce the cost of running their pipeline by over 90%. Most of this is still very experimental and not ready for production, but you will definitely see more projects like it, just based on how much money can be saved.
-
Which lakehouse table format do you expect your organization will be using by the end of 2023?
This independence from a catalog allows for path based reads and writes. This is handy when writing from Kafka directly to Delta Lake for the first layer of ingestion. You don’t need a catalog (or even Spark). https://github.com/delta-io/kafka-delta-ingest/tree/main/src
-
Streaming Data and Postgres
As far as I know, no. You certainly could put events on a streaming ledger like Kafka or Redpanda, store them to Delta with https://github.com/delta-io/kafka-delta-ingest, and process them with all the GIS goodness of Spark. However, this is fairly complicated and quite different from a simple PostGIS drop-in replacement. There are specialized (meaning faster and more efficient) systems out there for specialized tasks such as real-time geofencing.
-
Rust is showing a lot of promise in the DataFrame / tabular data space
kafka-delta-ingest is a good project to get streaming data into a Delta Lake. Here's a great talk on the topic.
-
process millions of events per sec
What about https://github.com/delta-io/kafka-delta-ingest?
- Exactly once delivery from Kafka to Delta Lake with Rust
rust-rdkafka
-
Rust Cpp Interop via Cxx, Autocxx / any best practices out there
I use this library a lot and it's got some nice touches for how to handle wrapping a C library: https://github.com/fede1024/rust-rdkafka
-
Trace Through a Kafka Cluster with Rust and OpenTelemetry
For this example, we're using rdkafka to build producers and consumers, because it allows us to specify custom headers for each record.
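To make the role of per-record headers concrete, here is a minimal sketch of carrying trace context in a record header. It assumes the context is encoded as a W3C Trace Context `traceparent` value, which the post above does not state. Headers are modelled as plain `(String, Vec<u8>)` pairs rather than rdkafka's own header types, and `traceparent`, `inject`, and `extract` are hypothetical helper names.

```rust
/// A Kafka-style record header: a string key plus raw bytes.
/// This is a plain model for illustration, not an rdkafka type.
type Header = (String, Vec<u8>);

/// Format a W3C Trace Context `traceparent` value:
/// version "00", 32 hex digits of trace id, 16 hex digits of span id, 2 hex digits of flags.
pub fn traceparent(trace_id: u128, span_id: u64, sampled: bool) -> String {
    format!("00-{:032x}-{:016x}-{:02x}", trace_id, span_id, sampled as u8)
}

/// Producer side: attach (or replace) the `traceparent` header on an outgoing record.
pub fn inject(headers: &mut Vec<Header>, trace_id: u128, span_id: u64, sampled: bool) {
    headers.retain(|(k, _)| k != "traceparent");
    headers.push((
        "traceparent".to_string(),
        traceparent(trace_id, span_id, sampled).into_bytes(),
    ));
}

/// Consumer side: recover (trace id, parent span id, sampled flag) from a record's headers.
/// Returns None if the header is missing or does not match the version-00 layout.
pub fn extract(headers: &[Header]) -> Option<(u128, u64, bool)> {
    let (_, value) = headers.iter().find(|(k, _)| k == "traceparent")?;
    let text = std::str::from_utf8(value).ok()?;
    let parts: Vec<&str> = text.split('-').collect();
    if parts.len() != 4
        || parts[0] != "00"
        || parts[1].len() != 32
        || parts[2].len() != 16
        || parts[3].len() != 2
    {
        return None;
    }
    let trace_id = u128::from_str_radix(parts[1], 16).ok()?;
    let span_id = u64::from_str_radix(parts[2], 16).ok()?;
    let flags = u8::from_str_radix(parts[3], 16).ok()?;
    Some((trace_id, span_id, flags & 1 == 1))
}
```

In a real pipeline the producer would call something like `inject` before sending, and the consumer would call `extract` to start a child span under the same trace. With rdkafka, the pairs would be copied into and out of the client's own header structures.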
-
A Rust client library for interacting with Microsoft Airsim https://github.com/Sollimann/airsim-client
kafka
-
is there any other alternative for hadoop ecosystem that runs on rust?
You might find https://crates.io/crates/rdkafka helpful
-
Hey Rustaceans! Got an easy question? Ask here (46/2021)!
I am playing with tokio and rust-rdkafka library, following the examples like this one: https://github.com/fede1024/rust-rdkafka/blob/6fb2c37/examples/asynchronous_processing.rs
-
confluent Schema Registry and Rust
The source for the current version of the library can be found on GitHub. I had to increase the major version because I needed to break the API in order to support all formats supported by the current Schema Registry version. I also added the possibility to set an API key, so it can be used with Confluent Cloud, the cloud offering from Confluent. As of the latest major refactoring it also supports async. This might improve the performance of your app, and async is also the default for the major Kafka client; more information about why you would want to use async can be found in the async book. The schemas retrieved from the Schema Registry are cached. This way the schema is only retrieved once for each id, and reused for other messages with the same id.
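The caching behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's actual implementation: `SchemaCache` is a hypothetical type, schema ids are assumed to fit in a `u32`, and the `fetch` closure stands in for the HTTP request to the Schema Registry.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory schema cache keyed by schema id.
/// `fetch` stands in for the network call to the registry.
pub struct SchemaCache<F>
where
    F: FnMut(u32) -> Result<String, String>,
{
    fetch: F,
    schemas: HashMap<u32, String>, // raw schema text by id
}

impl<F> SchemaCache<F>
where
    F: FnMut(u32) -> Result<String, String>,
{
    pub fn new(fetch: F) -> Self {
        SchemaCache {
            fetch,
            schemas: HashMap::new(),
        }
    }

    /// Return the schema for `id`, calling `fetch` only on a cache miss.
    /// Failed fetches are not cached, so a later call will retry.
    pub fn get(&mut self, id: u32) -> Result<&str, String> {
        if !self.schemas.contains_key(&id) {
            let schema = (self.fetch)(id)?;
            self.schemas.insert(id, schema);
        }
        Ok(self.schemas[&id].as_str())
    }
}
```

Calling `get` twice with the same id invokes `fetch` once. Every later message carrying that id is decoded against the cached copy, which is the saving the post describes.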
-
Is there an alternative to Kafka that has better support in Rust?
What's wrong with rust-rdkafka?
-
Getting started with Kafka and Rust: Part 2
This is a two-part series to help you get started with Rust and Kafka. We will be using the rust-rdkafka crate, which is itself based on librdkafka (a C library).
What are some alternatives?
delta-rs - A native Rust library for Delta Lake, with bindings into Python
Kafka Rust Client - Rust client for Apache Kafka [Moved to: https://github.com/kafka-rust/kafka-rust]
dipa - dipa makes it easy to efficiently delta encode large Rust data structures.
schema-registry - Confluent Schema Registry for Kafka
kafka-rust - Rust client for Apache Kafka
kafka-go - Kafka library in Go
flowgger - A fast data collector in Rust
franz-go - franz-go contains a feature complete, pure Go library for interacting with Kafka from 0.8.0 through 3.6+. Producing, consuming, transacting, administrating, etc.
arrow2 - Transmute-free Rust library to work with the Arrow format
arewegameyet - The repository for https://arewegameyet.rs
delta - A syntax-highlighting pager for git, diff, and grep output