gsir-te vs rust-dataframe
Metric | gsir-te | rust-dataframe
---|---|---
Mentions | 1 | 1
Stars | 230 | 287
Growth | - | -
Activity | 0.0 | 0.8
Last commit | about 5 years ago | over 3 years ago
Language | R | Rust
License | GNU General Public License v3.0 only | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
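The page does not give the exact formula behind the activity number, so the following is only a minimal sketch of one way such a score could work, under two assumptions: recent commits are weighted with an exponential decay, and the final 0 to 10 value is a percentile rank across all tracked projects. The function names, the `half_life_days` parameter, and the decay scheme are hypothetical illustrations, not the site's actual method.

```python
from datetime import date


def recency_weighted_score(commit_dates, today, half_life_days=90.0):
    """Sum one weight per commit; a commit's weight halves every half_life_days.

    Assumption: exponential decay is used here only for illustration.
    """
    return sum(0.5 ** ((today - d).days / half_life_days) for d in commit_dates)


def activity_from_rank(score, all_scores):
    """Map a raw score to a 0-10 scale by percentile rank among tracked projects.

    Assumption: the scale is ten times the fraction of projects with a
    strictly lower raw score, rounded to one decimal place.
    """
    below = sum(1 for s in all_scores if s < score)
    return round(10.0 * below / len(all_scores), 1)
```

Under these assumptions, a project whose raw score beats 90 out of 100 tracked projects gets an activity of 9.0, which matches the "top 10%" reading above, and a project with no recent commits drifts toward 0.0.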
gsir-te
I wrote one of the fastest DataFrame libraries
I dropped dplyr in favor of data.table and never looked back.
https://github.com/eddelbuettel/gsir-te
rust-dataframe
I wrote one of the fastest DataFrame libraries
> Rust DataFrame implementation, built on Apache Arrow
https://github.com/nevi-me/rust-dataframe
A bit less mature/feature-complete than polars last time I looked. It does not seem to do anything with on-disk spillover from what I can see. But if you wanted to use Arrow to do that, nevi-me's crate may be a good place to start.
What are some alternatives?
data.table - R's data.table package extends data.frame:
vaex - Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
polars - Dataframes powered by a multithreaded, vectorized query engine, written in Rust
TypedTables.jl - Simple, fast, column-based storage for data analysis in Julia
lance - Modern columnar data format for ML and LLMs implemented in Rust. Convert from parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, with more integrations coming.
ballista - Distributed compute platform implemented in Rust, and powered by Apache Arrow.