  • GitHub repo disk.frame

    Fast Disk-Based Parallelized Data Manipulation Framework for Larger-than-RAM Data

    Project mention: Data cleaning/ analysis 100-200 million rows of data. Is this doable in R, or is there another program I should try instead? | reddit.com/r/rstats | 2021-10-12

    It depends on your hardware, but it should not be a problem. You might look into disk.frame (https://diskframe.com) or similar packages.

  • GitHub repo police-settlements

    A FiveThirtyEight/The Marshall Project effort to collect comprehensive data on police misconduct settlements from 2010-19.

    Project mention: Police misconduct settlements | reddit.com/r/Ph03niX | 2021-02-28

  • GitHub repo opentripplanner

    An R package to set up and use OpenTripPlanner (OTP) as a local or remote multimodal trip planner. (by ropensci)

    Project mention: R packages for transit planning? | reddit.com/r/rstats | 2021-01-03

    Transportation planner / data scientist here: The R opentripplanner package (https://github.com/ropensci/opentripplanner) (also a Robin Lovelace-related package!) is a particular favorite, just wanted to call that out! Also, the Open Transit Data Toolkit (https://transitdatatoolkit.com/) might give some ideas on topics to cover. I think the methods are a bit dated at this point (i.e. not a lot of tidyverse, sf) but in general it's a great resource.

  • GitHub repo covid19-nor-data

    Cleaned public data about Covid-19 in Norway

    Project mention: [OC] Timeline Of Relative Spread Of Covid19 With | reddit.com/r/dataisbeautiful | 2021-02-26

    Norway: covid19-nor-data/municipality_and_district_wide.csv at master · thohan88/covid19-nor-data (github.com). Also, I think the death figures are partly derived from https://www.dagsavisen.no/nyheter/innenriks/oversikt-disse-er-dode-etter-koronasmitte-i-norge-1.1691241, at least in the beginning.

