Rust is showing a lot of promise in the DataFrame / tabular data space

This page summarizes the projects mentioned and recommended in the original post on /r/rust.

  1. polars

    Dataframes powered by a multithreaded, vectorized query engine, written in Rust

    [Polars](https://github.com/pola-rs/polars) is a blazing fast DataFrame library with a beautiful user interface and an awesome getting started guide. The impressive h2o benchmark results have gotten Polars a lot of users.

  3. db-benchmark

    reproducible benchmark of database-like ops

    The impressive h2o benchmark results have gotten [Polars](https://github.com/pola-rs/polars) a lot of users.

  4. datafusion

    Apache DataFusion SQL Query Engine

    [arrow-datafusion](https://github.com/apache/arrow-datafusion) is another great DataFrame library, especially if you like running SQL queries. It's easy to query a Parquet / CSV dataset with SQL using DataFusion. I've run local benchmarks and it's super fast. The DataFusion docs are a bit lacking, which is a shame for such a developed and amazing library. I hope to make these better and help spread the word about how truly amazing this lib is.

  5. arrow2

    (Discontinued) Transmute-free Rust library to work with the Arrow format

    [arrow2](https://github.com/jorgecarleitao/arrow2) and [parquet2](https://github.com/jorgecarleitao/parquet2) are great foundational libraries for any DataFrame libs in Rust.

  6. parquet2

    Fastest and safest Rust implementation of parquet. `unsafe` free. Integration-tested against pyarrow

    [arrow2](https://github.com/jorgecarleitao/arrow2) and [parquet2](https://github.com/jorgecarleitao/parquet2) are great foundational libraries for any DataFrame libs in Rust.

  7. delta-rs

    A native Rust library for Delta Lake, with bindings into Python

    I'm working on [delta-rs](https://github.com/delta-io/delta-rs), which brings the power of Delta Lake to the Rust community. CSV / Parquet lakes are limited, and Delta Lakes offer a ton of advantages (versioned data, time travel, ACID transactions, schema enforcement, etc). We're working to bring full Polars and DataFusion support to delta-rs; see the roadmap.

  8. influxdb_iox

    (Discontinued) Pronounced "influxdb eye-ox", short for iron oxide. This is the new core of InfluxDB written in Rust on top of Apache Arrow.

    Already is: https://github.com/influxdata/influxdb_iox. It's just still a work in progress.

  10. PyO3

    Rust bindings for the Python interpreter

    If you’re interested in Python bindings, take a look at https://pyo3.rs/

  11. kafka-delta-ingest

    A highly efficient daemon for streaming data from Kafka into Delta Lake

    kafka-delta-ingest is a good project to get streaming data into a Delta Lake. Here's a great talk on the topic.
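
The delta-rs entry above lists versioned data and time travel among Delta Lake's advantages. As a rough illustration of the idea only, here is a toy, standard-library-only model of an append-only commit log in which each commit adds or removes data files, and the table state at any version is rebuilt by replaying the log up to that version. This is not the delta-rs API: `ToyLog`, `Action`, `files_at`, `example_log`, and the file names are all hypothetical names invented for this sketch, and the real format tracks much more (schema, statistics, checkpoints).

```rust
use std::collections::BTreeSet;

/// One action recorded in a commit: a data file is added to or removed from the table.
/// (Hypothetical type for illustration; not taken from delta-rs.)
#[derive(Debug, Clone)]
enum Action {
    Add(String),
    Remove(String),
}

/// Toy append-only transaction log; the index of each commit is its version.
#[derive(Debug, Default)]
struct ToyLog {
    commits: Vec<Vec<Action>>,
}

impl ToyLog {
    /// Append a commit and return the version number it created.
    fn commit(&mut self, actions: Vec<Action>) -> usize {
        self.commits.push(actions);
        self.commits.len() - 1
    }

    /// "Time travel": replay commits 0..=version to get the files live at that version.
    /// Returns None if the requested version does not exist.
    fn files_at(&self, version: usize) -> Option<BTreeSet<String>> {
        if version >= self.commits.len() {
            return None;
        }
        let mut live = BTreeSet::new();
        for commit in &self.commits[..=version] {
            for action in commit {
                match action {
                    Action::Add(path) => {
                        live.insert(path.clone());
                    }
                    Action::Remove(path) => {
                        live.remove(path);
                    }
                }
            }
        }
        Some(live)
    }
}

/// Build a small example history: two appends followed by a compaction.
fn example_log() -> ToyLog {
    let mut log = ToyLog::default();
    log.commit(vec![Action::Add("part-0.parquet".into())]);
    log.commit(vec![Action::Add("part-1.parquet".into())]);
    log.commit(vec![
        Action::Remove("part-0.parquet".into()),
        Action::Remove("part-1.parquet".into()),
        Action::Add("compacted-0.parquet".into()),
    ]);
    log
}

fn main() {
    let log = example_log();
    // Older versions stay readable because the log is never rewritten, only appended to.
    for version in 0..3 {
        println!("version {}: {:?}", version, log.files_at(version));
    }
}
```

Because earlier commits are never modified, reading an old version is just a shorter replay, which is the intuition behind both versioning and time travel; a plain directory of CSV or Parquet files has no such history to replay.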

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
