kafka-delta-ingest
influxdb_iox
Metric | kafka-delta-ingest | influxdb_iox
---|---|---
Mentions | 6 | 14
Stars | 325 | 1,803
Growth | 4.0% | -
Activity | 7.4 | 9.9
Last commit | 18 days ago | 8 months ago
Language | Rust | Rust
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
kafka-delta-ingest
-
Using rust for DE activities?
Rust can offer incredible cost savings when you can use it in place of Spark to interact with your Delta Lake. One such project was kafka-delta-ingest: the developers were able to reduce the cost of running the pipeline by over 90%. That said, most of this is still very experimental and not ready for production, but you will definitely be seeing more projects like this, just based on how much money can be saved.
-
Which lakehouse table format do you expect your organization will be using by the end of 2023?
This independence from a catalog allows for path-based reads and writes. This is handy when writing from Kafka directly to Delta Lake for the first layer of ingestion. You don't need a catalog (or even Spark). https://github.com/delta-io/kafka-delta-ingest/tree/main/src
-
Streaming Data and Postgres
As far as I know, no. You certainly could use events on a streaming ledger like Kafka or Redpanda, store them to Delta Lake with https://github.com/delta-io/kafka-delta-ingest, and process them with all the GIS goodness of Spark. However, this is fairly complicated and much different from a simple PostGIS drop-in replacement. There are specialized (meaning faster and more efficient) systems out there for specialized tasks such as real-time geofencing.
-
Rust is showing a lot of promise in the DataFrame / tabular data space
kafka-delta-ingest is a good project to get streaming data into a Delta Lake. Here's a great talk on the topic.
-
process millions of events per sec
What about https://github.com/delta-io/kafka-delta-ingest?
- Exactly once delivery from Kafka to Delta Lake with Rust
influxdb_iox
-
InfluxDB 3.0 Infinite Observability with qryn-iox
Watch out for the AGPL minio <https://github.com/metrico/iox-community/blob/155a14bb5e8e32...>, the almost certainly AGPL grafana <https://github.com/grafana/grafana/blob/v10.1.1/LICENSE>, and always eye anyone who uses :latest images with healthy suspicion.
That said, influxdb_iox itself appears to be Apache 2 (and/or MIT?) https://github.com/influxdata/influxdb_iox/blob/main/LICENSE...
-
InfluxDB 3 is out, OSS commits have dried up - is this the end?
Have you looked at https://github.com/influxdata/influxdb_iox? That's where the development for the new version is done.
-
InfluxData releases InfluxDB 3.0 product suite for time series analytics
As I understand it, InfluxDB 3 is just a rebranding of InfluxDB IOx. If so, its performance may not be very good compared to Prometheus-like systems.
-
Production grade databases in Rust
InfluxDB iox
- Anyone had a success story of replacing C++ with Go?
-
InfluxDB announces their new storage engine written in Rust
Don't know how much is open or closed, but they were doing some development in the open: https://github.com/influxdata/influxdb_iox
-
Welcome to InfluxDB IOx: InfluxData's New Storage Engine
Just want to say congratulations to the team!
2 years and 9,500+ commits is a hell of a feat.
https://github.com/influxdata/influxdb_iox
-
Rust is showing a lot of promise in the DataFrame / tabular data space
Already is: https://github.com/influxdata/influxdb_iox Just still a work in progress.
-
Anyone using RDS IAM authentication in their app?
It looks like this crate is the workaround for that. But there's a PR on SQLX opened a couple days ago that will fix the issue.
-
Rust and what it needs to gain space in computation-oriented applications
You should check out polars, datafusion, influxdb iox and databend, all written in native Rust and powered by the Apache Arrow format. Polars in particular is pretty damn fast and has bindings for Python.
What are some alternatives?
delta-rs - A native Rust library for Delta Lake, with bindings into Python
databend - Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
dipa - dipa makes it easy to efficiently delta encode large Rust data structures.
datafusion - Apache DataFusion SQL Query Engine
kafka-rust - Rust client for Apache Kafka
TimescaleDB - An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
rust-rdkafka - A fully asynchronous, futures-based Kafka client library for Rust based on librdkafka
polars - Dataframes powered by a multithreaded, vectorized query engine, written in Rust
flowgger - A fast data collector in Rust
db-benchmark - reproducible benchmark of database-like ops
arrow2 - Transmute-free Rust library to work with the Arrow format
orioledb - OrioleDB – building a modern cloud-native storage engine (... and solving some PostgreSQL wicked problems) 🇺🇦