5 Reproducibility of Results

Full reproducibility of all results generated at a given point in time is critical for any type of analysis work, and especially so within the pharma industry. IQR Tools was designed so that a user can be sure an analysis can be repeated years later and still give the same results.

5.1 The CRAN Nightmare

CRAN is a fantastic place and it brings R to us!

However, it essentially forces a user to always use the newest versions of packages. Packages on CRAN change constantly, and no one ensures that new package versions work with the packages that are already installed on your computer. System administrators in companies who have to handle R installations know that pain. The normal user is seldom confronted with it, except when he or she has to reproduce a specific result that was generated on an old R installation, or simply on a colleague's computer.

The use of CRAN should be avoided at all costs in a corporate environment and should be limited to personal or experimental installations.

As an anecdote: R and a set of R packages were installed on one computer. Tests were run and all passed. The same procedure was then repeated on the next computer, 6 meters away. Tests were run, and this time they failed. Digging into why, it turned out that two packages on CRAN had changed during the walk to the other computer. A real nightmare for reproducibility!
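The kind of digging described in the anecdote can be sketched in a few lines of base R. This is only an illustration, not part of IQR Tools: the helper names snapshot_versions and diff_versions are hypothetical, and the package versions in the example data are made up.

```r
# Record package names and versions installed on the current machine.
# (Hypothetical helper, not part of IQR Tools.)
snapshot_versions <- function() {
  ip <- installed.packages()
  data.frame(Package = unname(ip[, "Package"]),
             Version = unname(ip[, "Version"]),
             stringsAsFactors = FALSE)
}

# Return the packages whose version differs between two snapshots,
# or which are present in only one of them.
diff_versions <- function(a, b) {
  m <- merge(a, b, by = "Package", all = TRUE, suffixes = c(".a", ".b"))
  differs <- is.na(m$Version.a) | is.na(m$Version.b) |
    m$Version.a != m$Version.b
  m[differs, , drop = FALSE]
}

# Illustrative data only; these are not real installation records.
machine_a <- data.frame(Package = c("ggplot2", "dplyr", "rlang"),
                        Version = c("2.2.1", "0.7.2", "0.1.2"),
                        stringsAsFactors = FALSE)
machine_b <- data.frame(Package = c("ggplot2", "dplyr", "rlang"),
                        Version = c("2.2.1", "0.7.3", "0.1.4"),
                        stringsAsFactors = FALSE)

changed <- diff_versions(machine_a, machine_b)
print(changed)
```

In practice, the result of snapshot_versions() would be saved on each machine (for example with write.csv) and the two files compared with diff_versions.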

5.2 MRAN Time Machine

An alternative to CRAN is MRAN, the Microsoft R Application Network. MRAN takes snapshots of CRAN at given moments and, about 2 months after a new R version becomes available, freezes the CRAN snapshot for that particular R version. This means that for a given R version a snapshot of CRAN R packages is available that:

  • Will never change
  • Is accessible via convenient dated links

An example of such a dated link for R version 3.4.1 is https://cran.microsoft.com/snapshot/2017-09-01/, which holds the CRAN content as frozen on the 1st of September 2017. Packages from this dated repository can be installed in R in the same manner as from CRAN, but the repos argument needs to be provided:

install.packages("ggplot2", repos = "https://cran.microsoft.com/snapshot/2017-09-01/")
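To avoid passing repos on every call, the dated repository can also be set once per session through the standard repos option. The following is a minimal sketch, assuming the snapshot URL is reachable from your network; the variable names snapshot_date and snapshot_url are only illustrative.

```r
# Build the dated snapshot URL from a date string (assumed date: 2017-09-01,
# the snapshot quoted in the text for R 3.4.1).
snapshot_date <- "2017-09-01"
snapshot_url <- sprintf("https://cran.microsoft.com/snapshot/%s/", snapshot_date)

# Make the snapshot the default repository for this R session.
# Subsequent install.packages() calls without a repos argument will use it.
options(repos = c(CRAN = snapshot_url))

# Show the repository that is now active.
print(getOption("repos"))
```

Placing the options() call in a project-level .Rprofile would apply the same setting each time R is started in that project.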

5.3 IQR Tools Installer

The installation utility of IQR Tools will install all dependencies (and their dependencies, and so on) from the MRAN Time Machine snapshot that matches the R version on which IQR Tools is being installed.

For this reason it is recommended that, at least for the initial installation of IQR Tools, the forceDependencies argument is set to TRUE. Ideally, IQR Tools is installed on a clean R installation so that all installed R packages come from the same dated MRAN repository.
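One rough way to check whether an existing R installation is consistent with a dated snapshot is to look for packages that were published after the snapshot date. The sketch below is not part of the IQR Tools installer. It assumes that CRAN packages carry a Date/Publication field in their DESCRIPTION file; the helper names and the example data are hypothetical.

```r
# List installed packages together with their CRAN publication date.
# Packages without that field (e.g. base packages) get NA.
installed_with_dates <- function() {
  ip <- installed.packages(fields = "Date/Publication")
  data.frame(Package = unname(ip[, "Package"]),
             Published = unname(ip[, "Date/Publication"]),
             stringsAsFactors = FALSE)
}

# Return the packages published after the snapshot date. Such packages
# cannot have come from that snapshot.
published_after <- function(pkgs, snapshot_date) {
  published <- as.Date(substr(pkgs$Published, 1, 10))
  too_new <- !is.na(published) & published > as.Date(snapshot_date)
  pkgs[too_new, , drop = FALSE]
}

# Illustrative data only; these are not real publication records.
example_pkgs <- data.frame(
  Package = c("ggplot2", "dplyr", "stats"),
  Published = c("2016-12-30 22:45:17 UTC", "2017-09-28 20:43:29 UTC", NA),
  stringsAsFactors = FALSE
)

too_new <- published_after(example_pkgs, "2017-09-01")
print(too_new)
```

On a real installation, published_after(installed_with_dates(), "2017-09-01") would list the packages to reinstall. An empty result does not prove that everything came from the snapshot, only that nothing is newer than it.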