Is there a way to load a large SAS7BDAT dataset into R efficiently with fair speed?

This page summarizes the projects mentioned and recommended in the original post on /r/rstats

  • vroom

    Fast reading of delimited files (by tidyverse)

  • You can convert it to an `rds` file: read the SAS file in once, then write it back out. If you care about speed, use `readr::write_rds`, which is similar to base `saveRDS` but leaves compression off by default; loading is faster, though the file on disk will be much larger. You can also use a random-access format such as `fst` (https://www.fstpackage.org/), though again you need to write the file out first. In a quick benchmark, `haven` read SAS files much faster than the `sas7bdat` package did. If the data is in a plain-text delimited file, also look into `vroom`: https://github.com/r-lib/vroom

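The read-once, convert-once workflow described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical input file `data.sas7bdat`; the output paths `data.rds` and `data.fst` are likewise placeholders.

```r
library(haven)  # fast SAS7BDAT reader
library(readr)  # write_rds / read_rds, uncompressed by default
library(fst)    # random-access binary storage

# One-time conversion: read the SAS file with haven.
dat <- haven::read_sas("data.sas7bdat")

# Write as uncompressed rds; larger on disk, but quick to reload.
readr::write_rds(dat, "data.rds")
dat2 <- readr::read_rds("data.rds")

# Alternatively, fst supports reading subsets without loading the
# whole table, e.g. only rows 1-1000 or selected columns.
fst::write_fst(dat, "data.fst")
head_rows <- fst::read_fst("data.fst", from = 1, to = 1000)
```

After the one-off conversion cost, subsequent sessions can skip the slow SAS parse entirely and load the `rds` or `fst` copy instead.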
