Rust Arrow

Open-source Rust projects categorized as Arrow

Top 17 Rust Arrow Projects

  • polars

    Dataframes powered by a multithreaded, vectorized query engine, written in Rust

    Project mention: Using Polars in Rust for high-performance data analysis | dev.to | 2024-10-30

    If you want to get into Polars, the library is very well documented, and I’d recommend you check out their getting started tutorial, their API docs, and when you’re all set up, you can also check out their Cookbooks to learn about many of the standard operations within Polars.

  • datafusion

    Apache DataFusion SQL Query Engine

    Project mention: Apache DataFusion | news.ycombinator.com | 2025-01-12
  • roapi

    Create full-fledged APIs for slowly moving datasets without writing a single line of code.

    Project mention: Show HN: Turn CSS files into high performance APIs | news.ycombinator.com | 2025-01-11
  • datafusion-ballista

    Apache DataFusion Ballista Distributed Query Engine

    Project mention: DataFusion Comet: Apache Spark Accelerator | news.ycombinator.com | 2024-05-31

    But why. Just ditch Spark and use https://github.com/apache/datafusion-ballista directly.

  • vortex

    An extensible, state-of-the-art columnar file format (by spiraldb)

    Project mention: Show HN: Vortex – a high-performance columnar file format in Rust | news.ycombinator.com | 2024-10-14

    We have a TableProvider for use with DataFusion; check out this crate and its examples: https://github.com/spiraldb/vortex/tree/develop/vortex-dataf...

  • datafusion-comet

    Apache DataFusion Comet Spark Accelerator

    Project mention: Amazon's Exabyte-Scale Migration from Apache Spark to Ray on Amazon EC2 | news.ycombinator.com | 2024-07-29

    I wonder if similar performance can be achieved with a Spark accelerator like https://github.com/apache/datafusion-comet. Of course it didn’t exist before

  • tonbo

    A portable embedded database using Arrow.

    Project mention: Show HN: TonboLite – Scale SQLite with S3, Minimize ETL | news.ycombinator.com | 2025-01-07

    Hi! I am Tzu and the team from Tonbo here.

    TonboLite: https://github.com/tonbo-io/tonbolite is a SQLite extension based on Tonbo: https://github.com/tonbo-io/tonbo. It enables SQLite to create tables suitable for analytical processing on target platforms like WebAssembly in browser and efficiently write data. The data in the tables is organized as tiered Apache Parquet format files, stored on demand either on local disks (using OPFS as native I/O) or object storage services (such as S3). You can use it by creating virtual tables in regular SQLite.

    TonboLite started with the exploration of Tonbo application. The goal of Tonbo is to write data for analytical processing (like log processing, metrics monitoring, or text search) to unlimited remote storage in SQLite and PostgreSQL.

    We tried SQLite as it is the most popular transactional database for the edge. One of the most requested improvements for SQLite is better support for append-only writes (e.g., logs, time-series data), which are common in analytical data. Append-only writes present two main challenges for SQLite:

  • sail

    LakeSail's computation framework with a mission to unify stream processing, batch processing, and compute-intensive (AI) workloads. (by lakehq)

    Project mention: AI and All Data Weekly - 02 December 2024 | dev.to | 2024-12-02
  • duckdb-rs

    Ergonomic bindings to duckdb for Rust

  • parquet-wasm

    Rust-based WebAssembly bindings to read and write Apache Parquet data

    Project mention: FLaNK AI Weekly for 29 April 2024 | dev.to | 2024-04-29
  • pqrs

    Command line tool for inspecting Parquet files

  • biobear

    Work with bioinformatic files using Arrow, Polars, and/or DuckDB

  • iceberg-rust

    Rust implementation of Apache Iceberg with integration for Datafusion (by JanKaul)

  • fastexcel

    A Python wrapper around calamine (by ToucanToco)

  • datafusion-dft

    An opinionated and batteries included DataFusion implementation.

    Project mention: An Opinionated DataFusion CLI and TUI | news.ycombinator.com | 2024-12-06
  • s2protocol-rs

    Starcraft 2 Protocol Replay Reader

  • myval

    Lightweight Apache Arrow data frame for Rust

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Index

What are some of the best open-source Arrow projects in Rust? This list will help you:

Rank Project Stars
1 polars 31,328
2 datafusion 6,559
3 roapi 3,249
4 datafusion-ballista 1,608
5 vortex 1,065
6 datafusion-comet 865
7 tonbo 854
8 sail 598
9 duckdb-rs 539
10 parquet-wasm 533
11 pqrs 301
12 biobear 166
13 iceberg-rust 132
14 fastexcel 129
15 datafusion-dft 126
16 s2protocol-rs 103
17 myval 63

Did you know that Rust is the 5th most popular programming language based on number of mentions?