[Polars](https://github.com/pola-rs/polars) is a blazing fast DataFrame library with a beautiful user interface and an awesome getting started guide. The impressive h2o benchmark results have gotten Polars a lot of users.
[arrow-datafusion](https://github.com/apache/arrow-datafusion) is another great DataFrame library, especially if you like running SQL queries. It's so easy to query a Parquet / CSV dataset with SQL using DataFusion. I've run local benchmarks and it's super fast. The DataFusion docs are a bit lacking, which is a shame for such a developed and amazing library. I hope to make them better and help spread the word about how truly amazing this lib is.
[arrow2](https://github.com/jorgecarleitao/arrow2) and [parquet2](https://github.com/jorgecarleitao/parquet2) are great foundational libraries for DataFrame libs in Rust.
I'm working on [delta-rs](https://github.com/delta-io/delta-rs), which brings the power of Delta Lake to the Rust community. CSV / Parquet lakes are limited, and Delta Lakes offer a ton of advantages (versioned data, time travel, ACID transactions, schema enforcement, etc). We're working to bring full Polars and DataFusion support to delta-rs; see the roadmap.
Already is: https://github.com/influxdata/influxdb_iox Just still a work in progress.
If you’re interested in Python bindings, take a look at https://pyo3.rs/
kafka-delta-ingest is a good project to get streaming data into a Delta Lake. Here's a great talk on the topic.