Rust is showing a lot of promise in the DataFrame / tabular data space

This page summarizes the projects mentioned and recommended in the original post on /r/rust.

  1. polars

    Extremely fast Query Engine for DataFrames, written in Rust

    [Polars](https://github.com/pola-rs/polars) is a blazing fast DataFrame library with a beautiful user interface and an awesome getting started guide. The impressive h2o benchmark results have gotten Polars a lot of users.

  3. db-benchmark

    reproducible benchmark of database-like ops

    The impressive h2o benchmark results have gotten [Polars](https://github.com/pola-rs/polars) a lot of users.

  4. datafusion

    Apache DataFusion SQL Query Engine

    [arrow-datafusion](https://github.com/apache/arrow-datafusion) is another great DataFrame library, especially if you like running SQL queries. It's easy to query a Parquet / CSV dataset with SQL using DataFusion. I've run local benchmarks and it's super fast. The DataFusion docs are a bit lacking, which is a shame for such a developed and amazing library. I hope to make these better and help spread the word about how truly amazing this lib is.

  5. arrow2

    Discontinued. Transmute-free Rust library to work with the Arrow format

    [arrow2](https://github.com/jorgecarleitao/arrow2) and [parquet2](https://github.com/jorgecarleitao/parquet2) are great foundational libraries for DataFrame libs in Rust.

  6. parquet2

    Fastest and safest Rust implementation of parquet. `unsafe` free. Integration-tested against pyarrow

    [arrow2](https://github.com/jorgecarleitao/arrow2) and [parquet2](https://github.com/jorgecarleitao/parquet2) are great foundational libraries for DataFrame libs in Rust.

  7. delta-rs

    A native Rust library for Delta Lake, with bindings into Python

    I'm working on [delta-rs](https://github.com/delta-io/delta-rs), which brings the power of Delta Lake to the Rust community. CSV / Parquet lakes are limited, and Delta Lakes offer a ton of advantages (versioned data, time travel, ACID transactions, schema enforcement, etc). We're working to bring full Polars and DataFusion support to delta-rs; see the roadmap.

  8. influxdb_iox

    Discontinued. Pronounced "influxdb eye-ox", short for iron oxide. This is the new core of InfluxDB written in Rust on top of Apache Arrow.

    It already is: https://github.com/influxdata/influxdb_iox. It's just still a work in progress.

  9. PyO3

    Rust bindings for the Python interpreter

    If you’re interested in Python bindings, take a look at https://pyo3.rs/

  10. kafka-delta-ingest

    A highly efficient daemon for streaming data from Kafka into Delta Lake

    kafka-delta-ingest is a good project to get streaming data into a Delta Lake. Here's a great talk on the topic.
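
The Delta Lake advantages listed under delta-rs above (versioned data, time travel, schema enforcement) are easier to picture with a toy model. The following is a minimal, standard-library-only sketch of the general idea of an append-only commit log; it is not the delta-rs API and not the actual Delta Lake protocol, and `ToyTable`, `Action`, `commit`, and `files_at` are hypothetical names invented for illustration.

```rust
use std::collections::BTreeSet;

/// One change recorded in a commit. Hypothetical type, not part of delta-rs.
#[derive(Debug, Clone, PartialEq)]
enum Action {
    /// A data file becomes part of the table.
    Add(String),
    /// A data file is logically removed from the table.
    Remove(String),
}

/// A toy table: an expected column list plus an append-only commit log.
/// The index of each entry in `log` is its version number.
#[derive(Debug)]
struct ToyTable {
    schema: Vec<String>,
    log: Vec<Vec<Action>>,
}

impl ToyTable {
    fn new(schema: &[&str]) -> Self {
        ToyTable {
            schema: schema.iter().map(|s| s.to_string()).collect(),
            log: Vec::new(),
        }
    }

    /// Schema enforcement: reject a write whose columns differ from the table's.
    /// On success the whole commit is appended as one unit and its version is returned.
    fn commit(&mut self, columns: &[&str], actions: Vec<Action>) -> Result<usize, String> {
        let matches = columns
            .iter()
            .copied()
            .eq(self.schema.iter().map(String::as_str));
        if !matches {
            return Err(format!(
                "schema mismatch: expected {:?}, got {:?}",
                self.schema, columns
            ));
        }
        self.log.push(actions);
        Ok(self.log.len() - 1)
    }

    /// Time travel: replay the log up to and including `version`
    /// to recover the set of files that made up the table at that point.
    fn files_at(&self, version: usize) -> Option<BTreeSet<String>> {
        if version >= self.log.len() {
            return None; // that version was never committed
        }
        let mut files = BTreeSet::new();
        for commit in &self.log[..=version] {
            for action in commit {
                match action {
                    Action::Add(f) => {
                        files.insert(f.clone());
                    }
                    Action::Remove(f) => {
                        files.remove(f);
                    }
                }
            }
        }
        Some(files)
    }
}
```

In this sketch, committing a rewrite as `Remove("part-0.parquet")` plus `Add("part-1.parquet")` creates a new version, while `files_at(0)` still returns the original file set. A plain directory of CSV / Parquet files does not keep that history.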

