10 Using Packages

R's functionality is distributed as a collection of modules called packages, a few of which were included when you first installed R.

Packages can contain all sorts of objects, but generally they are sources of new functions, datasets, example scripts, and documentation.

Anyone can develop and submit a package to CRAN, the central repository. CRAN packages must meet certain benchmarks to be accepted and distributed.

CRAN packages vary considerably in style and the quality of their documentation, even after meeting the CRAN benchmarks.

There are two main steps to using a package:

  • installing the package on your computer (with install.packages())
  • telling R to make that package's objects (functions, data) available in your session (with library())

While you only need to install a package once, you need to tell R to use that package any time you start a new R session.

In the SSCC, you will find that many packages are already installed for you. You can also install or update packages yourself; these are automatically installed in a folder on your U:/ drive.
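If you are curious where packages end up on your own machine, the short sketch below asks R for its library folders. It assumes nothing beyond base R; the variable names lib_dirs and default_lib are only illustrative, and the folders printed will differ from one computer to the next.

```r
# .libPaths() returns the folders R searches for installed packages.
lib_dirs <- .libPaths()
print(lib_dirs)

# By default, install.packages() installs into the first folder listed.
default_lib <- lib_dirs[1]
print(default_lib)
```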

10.1 What packages are already installed?

If you are working in RStudio you can see the installed packages in the Packages pane, tabbed in the lower right of RStudio with Files, Plots, and Help.

You can scroll through the list, or use the search box in the upper right of the pane. The search box works much like it does in help.

You can click on a package name to see a help page listing all of the functions and other objects in that package.
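The same information is available from the console, which is handy outside RStudio. The sketch below uses parallel as the example because it ships with R; the variable names are illustrative only.

```r
# installed.packages() returns a matrix with one row per installed package;
# the row names are the package names.
pkgs <- installed.packages()
print(head(pkgs[, c("Package", "Version")]))

# Check whether a particular package is installed.
has_parallel <- "parallel" %in% rownames(pkgs)
print(has_parallel)

# List the objects a package exports, similar to the list on its help index.
exports <- getNamespaceExports("parallel")
print("makeCluster" %in% exports)
```

Running help(package = "parallel") in an interactive session shows the same index page you reach by clicking the package name.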

For example, suppose we are looking for documentation on a function to specify the number of cores we want to use for running R in parallel. If we already know it is in the parallel package, we could:

  • Open the Packages pane
  • Search for or scroll down to parallel
  • Scroll through the list of functions to find makeCluster()
  • Click on the function name to read its help page

Alternatively, if we already know the function name, we can search for its help page from the console with the pattern ?package::function. We can omit the package name if the package is already loaded. By default, parallel is not loaded, so we can either load it first (library(parallel)) or type this into the console: ?parallel::makeCluster

Try it!
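The ? shortcut can also be written as an ordinary function call, which may be easier to use inside a script. This is a minimal sketch; the names h and found are illustrative.

```r
# Function-call form of ?parallel::makeCluster.
# Assigning the result looks up the help page without displaying it.
h <- help("makeCluster", package = "parallel")

# The result holds the path(s) of matching help files;
# a length of zero would mean no page was found.
found <- length(h) > 0
print(found)

# print(h) would display the help page itself.
```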


10.2 Installing Additional Packages

You can install a package with the Install icon on the Packages toolbar. By default this installs packages from CRAN. If you have a package from another source in the form of a downloaded archive file, you can also install from that.

You can also install packages by using code. The following code installs the faraway package from CRAN:

install.packages("faraway")

The downloaded binary packages are in
    /var/folders/9_/7w6r9nvs0tsbslhtm7g_rhqm0000gq/T//RtmpHpn5FJ/downloaded_packages
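Because a package only needs to be installed once, scripts often check for it before calling install.packages(). The sketch below is one way to do that, not a standard function: install_if_missing() is a hypothetical helper, and the repos default is an assumption (the CRAN cloud mirror) that you may want to change.

```r
# Hypothetical helper: install only the packages that are not yet installed.
# The 'repos' default is an assumption; any CRAN mirror URL would work.
install_if_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
  if (length(missing_pkgs) > 0) {
    install.packages(missing_pkgs, repos = repos)
  }
  invisible(missing_pkgs)  # names of the packages that had to be installed
}

# Packages that ship with R are never missing, so nothing is downloaded here.
skipped <- install_if_missing(c("stats", "parallel"))
print(length(skipped))

# To install the packages used in this chapter, you could run:
# install_if_missing(c("faraway", "stargazer"))
```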

10.3 Using a Package

To actually use the material in the package you must load it using the library() function. hsb is a dataset in the faraway package. Notice the difference in the output of summary(hsb) before and after loading faraway.

summary(hsb)
Error in eval(expr, envir, enclos): object 'hsb' not found
library(faraway)
summary(hsb)
       id            gender              race         ses         schtyp          prog          read           write      
 Min.   :  1.00   female:109   african-amer: 20   high  :58   private: 32   academic:105   Min.   :28.00   Min.   :31.00  
 1st Qu.: 50.75   male  : 91   asian       : 11   low   :47   public :168   general : 45   1st Qu.:44.00   1st Qu.:45.75  
 Median :100.50                hispanic    : 24   middle:95                 vocation: 50   Median :50.00   Median :54.00  
 Mean   :100.50                white       :145                                            Mean   :52.23   Mean   :52.77  
 3rd Qu.:150.25                                                                            3rd Qu.:60.00   3rd Qu.:60.00  
 Max.   :200.00                                                                            Max.   :76.00   Max.   :67.00  
      math          science          socst     
 Min.   :33.00   Min.   :26.00   Min.   :26.0  
 1st Qu.:45.00   1st Qu.:44.00   1st Qu.:46.0  
 Median :52.00   Median :53.00   Median :52.0  
 Mean   :52.65   Mean   :51.85   Mean   :52.4  
 3rd Qu.:59.00   3rd Qu.:58.00   3rd Qu.:61.0  
 Max.   :75.00   Max.   :74.00   Max.   :71.0  
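If you only need one object from an installed package, the package::object pattern reaches it without calling library(). The sketch below uses parallel, which ships with R, so it runs without installing anything; with faraway installed, faraway::hsb would work the same way. The variable names are illustrative.

```r
# Call one function from an installed package without library().
n_cores <- parallel::detectCores()
print(n_cores)

# The package is used but not attached to the search path,
# so a bare detectCores() would still fail in a fresh session.
attached <- "package:parallel" %in% search()
print(attached)
```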

10.4 Undoing things

You will rarely, if ever, need to unload or uninstall packages, but we can do these operations with the detach() and remove.packages() functions.

detach() is the opposite of library(). It disassociates the package from your current session. After detaching a package, you can no longer reference its functions and datasets directly, as we did with hsb, until you load it again.

detach(package:faraway, unload = TRUE)
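To see the effect of detach(), you can inspect the search path, which lists the packages attached to the current session. This sketch uses parallel instead of faraway so that it runs without installing anything; before and after are illustrative names.

```r
# Attach a package and confirm it is on the search path.
library(parallel)
before <- "package:parallel" %in% search()
print(before)

# Detach it again; the name can be given as a character string.
detach("package:parallel", unload = TRUE)
after <- "package:parallel" %in% search()
print(after)
```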

remove.packages() reverses install.packages(), and it removes a package from your computer. To use it again, you will have to reinstall it.

remove.packages("faraway")
Removing package from '/Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library'
(as 'lib' is unspecified)

10.5 Exercises

  1. Load the parallel package, which has functions for parallel computation. Then run detectCores(), which will tell you how many cores your computer has. (For more on running tasks in parallel, see Functions and Iteration in R, in particular the section on Parallelization in the chapter on Iteration.)

  2. Install the package stargazer. This package contains a function of the same name, stargazer(), which can write tables of model results and summary statistics to a file that Word can open. After installing and loading stargazer, run this code, and take a look at the file it produces.

mod <- lm(mpg ~ am * wt, data = mtcars)

stargazer(mod, type = "html", out = "mod.doc")