data.table
siuba
| | data.table | siuba |
|---|---|---|
| Mentions | 16 | 25 |
| Stars | 3,478 | 1,100 |
| Growth | 0.8% | - |
| Activity | 9.6 | 7.5 |
| Last commit | 2 days ago | 7 months ago |
| Language | R | Python |
| License | Mozilla Public License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
data.table
- Data.table: R's data.table package extends data.frame
- Discovering Copy-on-Write in R
  The data.table package may also make a huge difference in performance, and often simplifies the code as well: https://github.com/Rdatatable/data.table
- new governance being proposed for data.table
- Local development environment for the data.table R project
  After the partial success with the development environment for R-yaml we tried another R package called data.table as part of the Open Source Development Course. Eventually we managed to run the tests of this too.
- Alternative to Pandas
  There's datatable. I haven't used it much, but the R version (data.table) is phenomenal.
- Do python packages have long form documentation? If so can someone provide me a sample?
  data.table README.md
- How to move "time" to a new column
  That's an old bug in data.table v1.12.2. It's been fixed for a while now. If you update your data.table version (e.g., install.packages("data.table")) and retry then it should work fine.
- Hiring an R coder to improve efficiency of code?
  Some suggestions:
  (1) https://github.com/Rdatatable/data.table Code based on data.table will probably be fastest. There are a number of reasons for this. More here: https://cran.r-project.org/web/packages/data.table/vignettes/ and here: https://rdatatable.gitlab.io/data.table/library/data.table/html/datatable-optimize.html The GForce set of optimizations is well explained here: https://www.brodieg.com/2019/02/24/a-strategy-for-faster-group-statisitics/
  (2) setDTthreads() is your friend in data.table.
  (3) I have found (on Windows at least) Microsoft R Open's use of parallel MKL faster than CRAN's latest release. See https://mran.microsoft.com/documents/rro/multithread Microsoft recommends using setMKLthreads() if it will help.
  (4) I think Rfast ( https://github.com/RfastOfficial/Rfast ) is a library worth considering, although I don't know if it will help you with brms and Stan operations.
- Piping in R is like baking!
  Take a look at the 22nd new feature of v1.14.3 on development here.
- memory leak after data.table::fread()?
siuba
- The Design Philosophy of Great Tables (Software Package)
- Best alternative to Pandas 2023?
  I don't know what's best for you, but I can recommend Siuba, a tidy interface in Python for sending queries to pandas and SQL databases.
- Method Chaining in Pandas: Bad Form or a Recipe for Success?
- Happy Halloween, Pandas!
  You mean siuba?
- Explorer (Elixir and Polars)
  For further inspiration, this is a pretty good-looking "dplyr for Python": https://github.com/machow/siuba
- Unpopular opinion: Matplotlib is a bad library
- A trick to have arbitrary infix operators in Python
- Going from R to Pandas: dplython vs dfply vs plydata
  You should follow /u/the75th's advice. However, if you decide to buck that take, I'd look into siuba. I've never heard of those packages you've listed, and have doubts they'd be maintained.
- Tidyverse equivalent in Python?
- R / Tidyverse User -> Python | How to Make it Hurt Less
  Check out siuba
What are some alternatives?
vaex - Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
rust-dataframe - A Rust DataFrame implementation, built on Apache Arrow
dtale - Visualizer for pandas data structures
TypedTables.jl - Simple, fast, column-based storage for data analysis in Julia
Altair - Declarative statistical visualization library for Python
gsir-te - Getting Started in R -- Tinyverse Edition
q - Run SQL directly on delimited files and multi-file sqlite databases
ballista - Distributed compute platform implemented in Rust, and powered by Apache Arrow.
vinum - Vinum is a SQL processor for Python, designed for data analysis workflows and in-memory analytics.
db-benchmark - reproducible benchmark of database-like ops
DataFramesMeta.jl - Metaprogramming tools for DataFrames