polars
evcxr
| | polars | evcxr |
|---|---|---|
| Mentions | 52 | 35 |
| Stars | 5,869 | 3,116 |
| Growth | 14.6% | 3.9% |
| Activity | 9.9 | 7.9 |
| Last commit | 3 days ago | 3 days ago |
| Language | Rust | Rust |
| License | MIT License | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
polars
- Is anyone around here playing with Rust (the language)?
- Will pandas eventually become more intuitive?
  polars might click better. Once you learn the expression API, you'll see that googling is much less needed as you can extrapolate that logic.
- Modern Python Performance Considerations
- Ask HN: Have we screwed ourselves as software engineers?
  The new data tools I've seen are complex under the hood, but offer elegant user experiences, giving the best of both worlds.
  You referenced a 500 line Python script being refactored with Rust and made me think of the Polars project: https://github.com/pola-rs/polars
  Polars uses Rust to make DataFrame operations lightning fast. But you don't need to use Rust to use Polars. Just use the Polars Python API and you have an elegant way to scale on a single machine and perform analyses way faster.
  I'm working on Dask and our end goal is the same. We want to provide users with syntax they're familiar with to scale their analyses locally & to clusters in the cloud. We also want to provide flexibility so users can provide highly custom analyses. Highly custom analyses are complex by nature, so these aren't "easy codebases" by any means, but Dask Futures / Dask Delayed makes the distributed cluster multiprocessing part a lot easier.
  Anyways, I've just seen the data industry moving towards better & better tools. Delta Lake abstracting away all the complications of maintaining plain vanilla Parquet lakes is another example of the amazing tooling. Now the analyses and models... those seem to be getting more complicated.
- Modern Pandas (Part 2): Method Chaining
  I'd recommend checking out polars as an alternative to pandas - https://github.com/pola-rs/polars
  It has a rather different API, and is significantly faster. Highly recommend it.
- Hi! We are Dr. Amanda Martin and JJ Brosnan, Developer and Python data scientist at Deephaven. Ask us anything about getting started in the data science industry, working with large data sets, and working with streaming data in Python.
  Have you looked at Polars? It's a new dataframe library that has an api that makes a lot more sense than pandas, and on top of that is much, much faster.
- Robyn - A Python web framework with a Rust runtime - crossed 200k installs on PyPi
  Polars is almost at 500k downloads; it's great to see the Rust ecosystem connecting with Python.
- Polars - Fast multi-threaded DataFrame library in Rust | Python | Node.js
- Polars 0.20.0 release
- How was the polars python API designed and written?
  Are there any articles about how the authors of polars used pyo3 to expose a python API?
evcxr
- Creating an Easy Mode for Rust
  REPL... is something that was at least tried; I don't know how good it is because I haven't tested it yet, but it sounds interesting and potentially helpful. https://github.com/google/evcxr
- How to speed up the Rust compiler in April 2022
  Tytytyty!!
  I know this is not realistic, but the only thing rust is missing is faster compilation times. To be fair, it’s already super fast.
  I was adding a repl[1] to create-react-app[2] (trying to achieve something like RoR’s console) and it took way too long for it to boot simply because the dependencies for something like a full scale web app take long to compile. Even with incremental compilation, that first boot time was too much.
- Kids don't need tools for kids
  There's also some ongoing work on a Rust REPL, see https://github.com/google/evcxr - though it's still a bit of a hack. Might become even easier than Turbo Pascal itself, and comparable to home computer BASICs.
- evcxr: a Jupyter-based REPL for Rust
- Rustc as a library?
  For a couple examples, evcxr, which provides a rust repl and jupyter kernel, uses rust-analyzer's crates to provide features such as completions in its repl.
- #[you_can::turn_off_the_borrow_checker]
  I disagree. Rust is statically typed, but we build good enough libraries to make working with matrices easy enough; data scientists will rarely have to type them out themselves, as they can be inferred. I was working with glam just recently and it's a very nice matrix library in which I didn't have to mess with the type system at all. There is also a REPL for Rust, https://github.com/google/evcxr, that could be improved.
- Options for Rust REPL that's easy to install
  evcxr https://github.com/google/evcxr provides a Jupyter kernel, which is handy. But it's quite complex to install and set up if you don't already have Python installed.
- Tokio Console
- My Ideal Rust Workflow
  You do need custom support to do this in Rust, but it's in the works - see https://github.com/google/evcxr for what seems to be the most current effort.
- Rust can be good for less experienced programmers
  evcxr is the best one I've found. It's pretty good, but it is extremely slow, I assume because it has to compile each line as you input it. Also, I can't get it to load crates from my current project, so you have to install them within the REPL.
What are some alternatives?
vaex - Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
arrow-datafusion - Apache Arrow DataFusion and Ballista query engines
Apache Arrow - Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
DataFrames.jl - In-memory tabular data in Julia
db-benchmark - reproducible benchmark of database-like ops
vscode-jupyter - VS Code Jupyter extension
rust-csv - A CSV parser for Rust, with Serde support.
arrow2 - Unofficial transmute-free Rust library to work with the Arrow format
arrow-rs - Official Rust implementation of Apache Arrow
tidypolars - Tidy interface to polars
wgpu - Safe and portable GPU abstraction in Rust, implementing WebGPU API.
bincode - A binary encoder / decoder implementation in Rust.