Our great sponsors
-
InfluxDB
Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
-
WorkOS
The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.
The Debian Linux based rocker/r-base Docker image from the Rocker project is considered bleeding edge when it comes to system dependencies, i.e. latest development versions are usually available sooner than on other Linux distributions.
The two Ubuntu Linux based images, rocker/r-ubuntu and rstudio/r-base from the Rocker project and from RStudio are for long-term support Ubuntu versions and use the RSPM CRAN binaries.
The base image is so bare-bones that it needs to install time zones, fonts and the Cairo device for ggplot2 to work (read the limitations here). Instead of apt you have apk and might have to work a bit harder to find all the Alpine-specific dependencies.
Docker versions 18.09 or higher come with a new opt-in builder backend called BuildKit. BuildKit prints out a nice summary of each layer including timing for the layers and the overall build. This is the general build command that I used to compare the four parent images:
There are at least two databases listing package requirements: one maintained by RStudio (this supports RSPM), another one by R-hub. Both of these list system packages for various Linux distributions, macOS, and Windows. But even with these databases, the build- vs. run-time dependencies can be sometimes hard to distinguish. Build-time system libraries are always named with a -dev or -devel postfix. Read the vignette of the maketools R package by Jeroen Ooms for a nice explanation and a suggested workflow for determining run-time dependencies of packages.
There are at least two databases listing package requirements: one maintained by RStudio (this supports RSPM), another one by R-hub. Both of these list system packages for various Linux distributions, macOS, and Windows. But even with these databases, the build- vs. run-time dependencies can be sometimes hard to distinguish. Build-time system libraries are always named with a -dev or -devel postfix. Read the vignette of the maketools R package by Jeroen Ooms for a nice explanation and a suggested workflow for determining run-time dependencies of packages.
Caching can be useful is when installing R package dependencies. In a previous post, we looked at how to use the renv package to install dependencies. Here is a simplified snippet from that Dockerfile:
Best practices for writing Dockerfiles are being followed more and more often according to this paper after mining more than 10 million Dockerfiles on Docker Hub and GitHub. However, there is still room for improvement. This is where linters come in as useful tools for static code analysis. Hadolint lists lots of rules for Dockerfiles and is available as a VS Code extension.