Hiring an R coder to improve efficiency of code?

db-benchmark

91 319 0.0 R

reproducible benchmark of database-like ops

Base R is not particularly fast. Use data.table and its fast assignment/grouping/aggregation.
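As a rough illustration of the assignment and grouped aggregation mentioned above, here is a minimal sketch. The table and its column names (`group`, `value`, `value_sq`) are made up for the example and do not come from the original post.

```r
library(data.table)

# Hypothetical toy data; column names are illustrative only
dt <- data.table(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 4, 5)
)

# Assignment by reference: := adds a column without copying the table
dt[, value_sq := value^2]

# Grouped aggregation: one result row per group
agg <- dt[, .(total = sum(value), mean_sq = mean(value_sq)), by = group]
print(agg)
```

The `:=` operator modifies `dt` in place, which is one reason data.table code can avoid the copying overhead that equivalent base R data.frame code often incurs.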

data.table

16 3,478 9.4 R

R's data.table package extends data.frame:

Some suggestions:
(1) https://github.com/Rdatatable/data.table Code based on data.table will probably be fastest, for a number of reasons. More here: https://cran.r-project.org/web/packages/data.table/vignettes/ and here: https://rdatatable.gitlab.io/data.table/library/data.table/html/datatable-optimize.html The GForce set of optimizations is well explained here: https://www.brodieg.com/2019/02/24/a-strategy-for-faster-group-statisitics/
(2) setDTthreads() is your friend in data.table.
(3) I have found (on Windows at least) that Microsoft R Open, with its parallel MKL, is faster than CRAN's latest release. See https://mran.microsoft.com/documents/rro/multithread Microsoft recommends using setMKLthreads() if it will help.
(4) I think Rfast ( https://github.com/RfastOfficial/Rfast ) is a library worth considering, although I don't know if it will help you with brms and Stan operations.
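Suggestions (1) and (2) could look roughly like the sketch below. The data, the thread count of 2, and the column names `g` and `v` are assumptions chosen for illustration. Whether more threads actually help depends on the machine and the workload.

```r
library(data.table)

old_threads <- getDTthreads()   # remember the current setting
setDTthreads(2)                 # request 2 threads; data.table caps this at what the machine allows

# Hypothetical data: 1e5 rows spread over three groups
set.seed(1)
dt <- data.table(
  g = sample(c("x", "y", "z"), 1e5, replace = TRUE),
  v = rnorm(1e5)
)

# Simple grouped calls such as mean() and sum() are the kind of expression
# GForce is designed to optimize; verbose = TRUE makes data.table report what it did
res <- dt[, .(mean_v = mean(v), sum_v = sum(v)), by = g, verbose = TRUE]

setDTthreads(old_threads)       # restore the previous setting
```

Reading the verbose output is a cheap way to check whether a given `j` expression was optimized before investing time in rewriting it.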

Rfast

1 134 3.6 C++

A collection of Rfast functions for data analysis. Note 1: The vast majority of the functions accept matrices only, not data.frames. Note 2: Do not pass matrices or vectors that have missing data (i.e., NAs). We do not check for them, and C++ internally transforms them into zeros (0), so you may get wrong results. Note 3: In general, make sure you give the correct input in order to get the correct output. We do no checks, and this is one of the many reasons we are fast.
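Because Rfast skips input checks, one option is to do them yourself before the call. The wrapper below is a hypothetical helper (`safe_col_means` is not part of Rfast) that enforces Notes 1 and 2 and then calls `Rfast::colmeans()` if the package is installed, falling back to base `colMeans()` otherwise.

```r
# Hypothetical helper, not part of Rfast: validate input before the fast call
safe_col_means <- function(x) {
  # Note 1: Rfast functions generally expect matrices, not data.frames
  if (!is.matrix(x) || !is.numeric(x)) {
    stop("x must be a numeric matrix")
  }
  # Note 2: missing values are not checked by Rfast, so reject them here
  if (anyNA(x)) {
    stop("x contains missing values; handle them before calling Rfast")
  }
  if (requireNamespace("Rfast", quietly = TRUE)) {
    Rfast::colmeans(x)
  } else {
    unname(colMeans(x))  # base R fallback when Rfast is not installed
  }
}

m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)
print(safe_col_means(m))
```

The checks cost some of the speed Rfast gains by omitting them, so in a hot loop it may be better to validate once up front and call the Rfast function directly afterwards.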

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

How to generate a great website and reference manual for your R package
1 project | dev.to | 10 Apr 2024
Array Languages: R vs. APL
1 project | news.ycombinator.com | 21 Mar 2024
Data.table: R's data.table package extends data.frame
1 project | news.ycombinator.com | 15 Mar 2024
Database-Like Ops Benchmark
1 project | news.ycombinator.com | 9 Mar 2024
Fable: Forecasting Models for Tidy Time Series
1 project | news.ycombinator.com | 3 Mar 2024

This page summarizes the projects mentioned and recommended in the original post on /r/rstats. Post date: 14 Sep 2022
