kafka-delta-ingest
A highly efficient daemon for streaming data from Kafka into Delta Lake (by delta-io)
dipa
dipa makes it easy to efficiently delta encode large Rust data structures. (by chinedufn)
Metric | kafka-delta-ingest | dipa
---|---|---
Mentions | 6 | 10
Stars | 319 | 253
Growth | 4.4% | -
Activity | 7.5 | 0.0
Last commit | 8 days ago | about 2 years ago
Language | Rust | Rust
License | Apache License 2.0 | Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kafka-delta-ingest
Posts with mentions or reviews of kafka-delta-ingest.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-06-26.
-
Using rust for DE activities?
Rust can offer incredible cost savings when you can use it in place of Spark to interact with your Delta Lake. One such project was kafka-delta-ingest: the developers were able to reduce the cost of running the pipeline by over 90%. Most of this tooling is still very experimental and not ready for production, but you will definitely be seeing more projects like this, given how much money can be saved.
-
Which lakehouse table format do you expect your organization will be using by the end of 2023?
This independence from a catalog allows for path based reads and writes. This is handy when writing from Kafka directly to Delta Lake for the first layer of ingestion. You don’t need a catalog (or even Spark). https://github.com/delta-io/kafka-delta-ingest/tree/main/src
-
Streaming Data and Postgres
As far as I know, no. You certainly could put events on a streaming ledger like Kafka or Redpanda, store them to Delta with https://github.com/delta-io/kafka-delta-ingest, and process them with all the GIS goodness of Spark. However, this is fairly complicated and very different from a simple drop-in PostGIS replacement. There are specialized (meaning faster and more efficient) systems out there for specialized tasks such as real-time geofencing.
-
Rust is showing a lot of promise in the DataFrame / tabular data space
kafka-delta-ingest is a good project to get streaming data into a Delta Lake. Here's a great talk on the topic.
-
process millions of events per sec
What about https://github.com/delta-io/kafka-delta-ingest?
- Exactly once delivery from Kafka to Delta Lake with Rust
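The "exactly once" idea in the last post title can be illustrated with a general technique: record the last written offset together with the data, so that a replayed batch after a crash is skipped instead of written twice. The sketch below models this with an in-memory table. It is not taken from kafka-delta-ingest, and `Table`, `Message`, and `commit` are hypothetical names used only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a table that stores rows and, alongside them,
/// the highest offset already written for each Kafka partition.
#[derive(Debug, Default)]
struct Table {
    rows: Vec<String>,
    committed_offsets: HashMap<i32, i64>,
}

/// Hypothetical stand-in for a consumed Kafka message.
struct Message {
    partition: i32,
    offset: i64,
    payload: String,
}

impl Table {
    /// Write a batch and advance the stored offsets in the same step.
    /// Messages at or below the stored offset were already written, so they
    /// are skipped. Returns the number of rows actually written.
    ///
    /// In this sketch the update is trivially atomic because it happens in
    /// memory behind `&mut self`; a real sink would need a transactional commit.
    fn commit(&mut self, batch: &[Message]) -> usize {
        let mut written = 0;
        for msg in batch {
            let last = self
                .committed_offsets
                .get(&msg.partition)
                .copied()
                .unwrap_or(-1);
            if msg.offset <= last {
                continue; // duplicate delivery, already in the table
            }
            self.rows.push(msg.payload.clone());
            self.committed_offsets.insert(msg.partition, msg.offset);
            written += 1;
        }
        written
    }
}
```

Calling `commit` twice with the same batch writes the rows only once, which is the property a replay after a restart depends on.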
dipa
Posts with mentions or reviews of dipa.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-04-24.
-
What's everyone working on this week (17/2023)?
Have you seen https://github.com/chinedufn/dipa or https://docs.rs/serde-diff? I haven’t used either yet but they sound similar.
-
"git diff"-like rust lib to find and apply changes to files?
You could use something like this: https://github.com/chinedufn/dipa
- Dipa – space-optimized diffing of Rust data structures
-
Complex Rust Apps which Integrate An Undo/Redo System
For diffing, I used JSON Patch, which lets me keep a history of changes in an easy-to-serialize form. There are other, more efficient and space-saving diffing libraries, such as this new one: https://github.com/chinedufn/dipa. However, I found that JSON Patch is supported in a number of languages and is easy to read and store in a database.
- Show HN: Dipa generates optimized code for diffing and patching Rust structs
- Dipa – reduce network traffic in Rust apps by only sending state diffs to users
-
dipa - a framework for efficiently delta encoding large Rust data structures
So I started working on dipa in 2019, stepped away from it for over a year and a half, and then came back and finished it over the last few weeks.
- Show HN: Dipa – a framework for efficiently delta encoding Rust data structures
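Several of the posts above describe delta encoding: sending or storing only what changed between two versions of a data structure. A minimal hand-written sketch of the idea follows. It does not use dipa's API, it makes no attempt at dipa's space optimizations, and `GameState`, `GameStateDelta`, `diff`, and `apply` are hypothetical names chosen for illustration.

```rust
/// Hypothetical application state that we want to synchronize.
#[derive(Debug, Clone, PartialEq)]
struct GameState {
    score: u32,
    player_name: String,
    positions: Vec<(i32, i32)>,
}

/// A delta holds only the fields that changed; `None` means "unchanged".
#[derive(Debug, Clone, PartialEq, Default)]
struct GameStateDelta {
    score: Option<u32>,
    player_name: Option<String>,
    positions: Option<Vec<(i32, i32)>>,
}

impl GameState {
    /// Compare `self` (old) with `new` and record only the differing fields.
    fn diff(&self, new: &GameState) -> GameStateDelta {
        GameStateDelta {
            score: (self.score != new.score).then(|| new.score),
            player_name: (self.player_name != new.player_name)
                .then(|| new.player_name.clone()),
            positions: (self.positions != new.positions)
                .then(|| new.positions.clone()),
        }
    }

    /// Apply a delta in place, overwriting only the fields it carries.
    fn apply(&mut self, delta: GameStateDelta) {
        if let Some(score) = delta.score {
            self.score = score;
        }
        if let Some(player_name) = delta.player_name {
            self.player_name = player_name;
        }
        if let Some(positions) = delta.positions {
            self.positions = positions;
        }
    }
}
```

If only `score` changes, the delta carries a single `Some` value and the unchanged string and vector are not copied. Keeping a list of such deltas is also one way to build the kind of change history mentioned in the undo/redo post.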
What are some alternatives?
When comparing kafka-delta-ingest and dipa you can also consider the following projects:
delta-rs - A native Rust library for Delta Lake, with bindings into Python
kafka-rust - Rust client for Apache Kafka
gdext - Rust bindings for Godot 4
rust-rdkafka - A fully asynchronous, futures-based Kafka client library for Rust based on librdkafka
socketioxide - A socket.io server implementation in Rust that integrates with the Tower ecosystem and the Tokio stack.
flowgger - A fast data collector in Rust
miniboosts - A collection of boosting algorithms written in Rust 🦀
arrow2 - Transmute-free Rust library to work with the Arrow format
mq
delta - A syntax-highlighting pager for git, diff, and grep output
freya-editor - Experimental code editor made with Freya 🦀