From a stats coder's perspective, performance and capacity (i.e. speed and RAM) depend less on the language than on the packages. As /u/Farther_father mentioned, tidytable is identical to dplyr from a coding perspective, but its efficiency and capacity are far better. So what you said about R's design, S4, Python, Julia, etc. reflects a fundamental misunderstanding of what is going on in the back-end -- Julia is widely assumed to be performant, yet in my experience it was the worst of the three (pandas runs out of memory while polars/tidypolars does not; dplyr runs out of memory while data.table/tidytable does not; and so on -- same language, different packages, different performance).
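To make the "same syntax, different back-end" point concrete, here is a minimal sketch. The data and grouping column are made up for illustration; the claim is only that tidytable accepts the same tidyverse-style verbs as dplyr while delegating the work to data.table:

```r
library(dplyr)      # data-frame back-end
library(tidytable)  # same verbs, data.table back-end

# toy data (hypothetical example, not from the thread)
df <- data.frame(g = rep(letters, each = 1e5), x = rnorm(26e5))

# dplyr version
dplyr_result <- df |>
  dplyr::group_by(g) |>
  dplyr::summarise(m = mean(x))

# tidytable version: identical verbs, evaluated by data.table under the hood
tidytable_result <- tidytable::as_tidytable(df) |>
  tidytable::summarise(m = mean(x), .by = g)
```

The code you write barely changes; only the engine executing it does, which is why memory behavior differs so much between packages within the same language.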
I'll chime in with others to say that using targets can help with the memory load as well. If you partition your data adequately (e.g. grouping by subject), you can take advantage of how targets maps over data so it only loads what it needs. Moreover, if you use the memory = "transient" option, it will unload objects between steps -- adding a little time overhead but saving memory. targets and tidytable together have enabled me to work on pretty sizeable datasets while rarely running into memory issues. In fact, the only time I hit a memory bottleneck was because I hadn't adequately partitioned the data across worker nodes.
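A rough sketch of the pattern described above, using the real targets API (tar_option_set(memory = "transient"), tar_group() with iteration = "group", and pattern = map() for dynamic branching); the data-loading and analysis functions are hypothetical placeholders:

```r
# _targets.R (sketch)
library(targets)

# unload each target's return value from memory between pipeline steps
tar_option_set(memory = "transient")

list(
  # split the data into one branch per subject so downstream targets
  # only ever load one partition at a time
  tar_target(
    by_subject,
    load_my_data() |>            # hypothetical loader
      dplyr::group_by(subject) |>
      tar_group(),
    iteration = "group"
  ),
  # map over the subject partitions; each branch loads only its slice
  tar_target(
    fits,
    fit_model(by_subject),       # hypothetical per-subject analysis
    pattern = map(by_subject)
  )
)
```

With this layout, targets materializes one subject's partition per branch and, with transient memory, drops it again once the step finishes, which is what keeps peak RAM low.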