R Data Science

Open-source R projects categorized as Data Science

Top 20 R Data Science Projects

  • awesome-R

    A curated list of awesome R packages, frameworks and software.

    Project mention: Python vs Matlab vs R | reddit.com/r/GradSchool | 2022-02-12
  • r4ds

    R for data science: a book

    Project mention: Need 8 million records in excel, ways to get around it | reddit.com/r/datascience | 2022-08-10

    Consult the free online version of the R for data science book for concise examples of how to work with the data table stored in the R object assigned to "df": https://r4ds.had.co.nz/

  • DataScienceR

    a curated list of R tutorials for Data Science, NLP and Machine Learning

    Project mention: Python vs Matlab vs R | reddit.com/r/GradSchool | 2022-02-12
  • drake

    An R-focused pipeline toolkit for reproducibility and high-performance computing (by ropensci)

  • tidyverse

    Easily install and load packages from the tidyverse

  • janitor

    simple tools for data cleaning in R

    Project mention: R Libraries Every Data Scientist Should Know - Pyoflife | reddit.com/r/rprogramming | 2022-07-13

    I just stumbled across janitor, which can help you clean column names easily.

  • engsoccerdata

    English and European soccer results 1871-2020

    Project mention: Updated 538 predictions after today's results | reddit.com/r/coys | 2022-04-16

    "The forecasts are based on a substantially revised version of ESPN’s Soccer Power Index (SPI), a rating system originally devised by FiveThirtyEight editor-in-chief Nate Silver in 2009 for rating international soccer teams. We have updated and adapted SPI to incorporate club soccer data going back to 1888 (from more than 550,000 matches in all) that we’ve collected from ESPN’s database and the Engsoccerdata GitHub repository, as well as from play-by-play data produced by Opta that has been available since 2010."

  • targets

    Function-oriented Make-like declarative workflows for R

    Project mention: What are your favorite R Libraries? | reddit.com/r/rstats | 2022-08-01


  • disk.frame

    Fast Disk-Based Parallelized Data Manipulation Framework for Larger-than-RAM Data

    Project mention: Do you code from memory? Or do you reference things? | reddit.com/r/rstats | 2022-03-31

    Say hello to disk.frame.

  • collapse

    Advanced and Fast Data Transformation in R (by SebKrantz)

    Project mention: Benchmarking for loops vs apply and others | reddit.com/r/rstats | 2022-05-01

    If you are looking for performance, I would recommend checking out the collapse package. The line "collapse" = collapse::fsum(df_datatable$x, g=df_datatable$g) is around 2x faster than base::rowsum, and the dplyr-style syntax doesn't add much overhead: "collapse dplyr" = df_datatable |> fgroup_by(g) |> fsum(x)

  • voice-gender

    Gender recognition by voice and speech analysis

  • R-Fundamentals

    D-Lab's 12 hour introduction to R Fundamentals. Learn how to create variables and functions, manipulate data frames, make visualizations, use control flow structures, and more, using R in RStudio.

    Project mention: R-Fundamentals: NEW Data - star count:112.0 | reddit.com/r/algoprojects | 2022-05-07
  • fastverse

    An Extensible Suite of High-Performance and Low-Dependency Packages for Statistical Computing and Data Manipulation in R

    Project mention: Vectorized function VS Loops | reddit.com/r/rstats | 2022-07-24

    I understand the sentiment and I'm not trying to convince you to start writing optimised code to save ~2ms. There's a ton of optimised tools that I don't use myself because the time benefit is immaterial for what I do.

  • tweetbotornot2

    🔍🐦🤖 Detect Twitter Bots!

  • targets-tutorial

    Short course on the targets R package

    Project mention: The new Drake ropensci targets: Function-oriented Make-like declarative workflows for R {R} | reddit.com/r/Sciatro | 2021-11-15
  • targets-minimal

    A minimal example data analysis project with the targets R package

    Project mention: Should you always rerun your entire script when reopening an R project? | reddit.com/r/rstats | 2022-05-03

    Will has a GitHub repo in his own profile that is pretty minimal: https://github.com/wlandau/targets-minimal

  • causalglm

    Interpretable and model-robust causal inference for heterogeneous treatment effects using generalized linear working models with targeted machine-learning

    Project mention: [Q] Sensitivity of (Causal) Inference to Nonlinear Functional Form | reddit.com/r/statistics | 2021-09-28

    Why not both? https://tlverse.org/causalglm/ (Will replace this with a more informative comment when I have free time later today)

  • R-Guide

    R Guide

    Project mention: Useful Tools and Programs list for R and RStudio | reddit.com/r/RStudio | 2022-03-21
  • COVID19Algeria

    This repository contains datasets about Coronavirus COVID-19 in Algeria, with daily updates and the evolution of the virus in the country by province, date, and other criteria that are lacking in official resources. It may help researchers or doctors analyse the disease and keep track of how it changes.

  • c3plot

    R Package for Interactive Plotting via C3.js with a Base Graphics-like Interface

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2022-08-10.

What are some of the best open-source Data Science projects in R? This list will help you:

Rank  Project           Stars
1     awesome-R         5,120
2     r4ds              3,591
3     DataScienceR      1,812
4     drake             1,321
5     tidyverse         1,258
6     janitor           1,156
7     engsoccerdata     687
8     targets           615
9     disk.frame        577
10    collapse          384
11    voice-gender      275
12    R-Fundamentals    121
13    fastverse         114
14    tweetbotornot2    83
15    targets-tutorial  77
16    targets-minimal   50
17    causalglm         12
18    R-Guide           1
19    COVID19Algeria    1
20    c3plot            0