What are R Packages and how to install them?

A Japanese Landscape. Image Source: https://www.jstor.org/stable/community.24904880

R packages are extensions to the R programming language. They contain code, data, and documentation in a standardized format that R users can install, typically from a centralized software repository such as CRAN (the Comprehensive R Archive Network). In this article, we will look at the R package system, how to install and update packages from CRAN, and how to use them. Let’s get started.

R Package System

The R system is divided into two conceptual parts:

  • The “base” R system that you download from CRAN
  • R Packages

R packages extend the functionality of R by providing additional functions, data, and documentation. They are written by a worldwide community of experienced R users and can be downloaded for free from the internet.

The base R system itself ships with about 30 packages, which are general enough to solve a wide range of common problems. However, if you want to extend R further to meet a specific requirement, you will have to download additional packages from the internet.
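To see which packages came with your own R installation, you can ask R directly. The sketch below assumes a standard installation; the exact list and count depend on your R version and on how R was installed, so no output is shown.

```r
# List the packages that ship with R itself.
# priority = "high" selects packages whose priority is "base" or "recommended".
shipped <- installed.packages(priority = "high")

# Show the name, version, and priority of each shipped package.
print(shipped[, c("Package", "Version", "Priority")])

# Number of shipped packages found on this machine.
print(nrow(shipped))
```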

A package contains additional functions, which are designed to solve a specific problem. Using a well-designed package, we don’t have to reinvent the wheel again and again, which allows us to focus more on the problem we are trying to solve.

At present, there are more than 15,000 packages listed on CRAN. This number is large, but you only have to learn a small fraction of them. If you work in a specific field, it is very likely that you won’t need more than 10 packages related to that field. Therefore, there is no need to know all the packages, only the handful that are most useful and relevant to your work.

I recommend that you visit CRAN Task Views and get started by learning about the packages that are most commonly used in, or most closely related to, your field.

Every package distributed on CRAN must meet stringent standards.

“The package structure and repositories of contributed packages have played a major role in the
usefulness and popularity of R. They put some extra burden on providers of the extended software,
particularly if the package is to be accepted by one of the central repositories, notably CRAN, which
is by far the largest and most used site and is associated with the R project itself. CRAN enforces
standards for the documentation, portability and usability of contributed packages. This is more than
compensated by benefits to users in terms of software documentation and testing, and usually benefits
the authors also in the long-term evolution of their software.”

Page 473, S, R, and Data Science, The R Journal, 2020

Getting to know the package

Let’s take the ggplot2 package, a powerful graphics package, as an example. Several sources of information are available if you want to find out more about a package.

  • Package description page: The package description page contains the basic information about the package. If you scroll down the page to the Documentation section, you will find the package documentation and code vignettes. Code vignettes are demo code chunks that show how to use various functions in a package.
  • Package website: The package website hosts related resources for the package, such as blogs, tutorials, and books. Not every package has a website.
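Much of the same information is also available from inside R once a package is installed. The following is a minimal sketch; describe_package() is a hypothetical helper written for this article, not part of R. It uses "stats" because that package ships with R, and you can swap in "ggplot2" after installing it.

```r
# Hypothetical helper: gather basic facts about an installed package.
describe_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is not installed.")
  }
  desc <- utils::packageDescription(pkg)        # contents of the DESCRIPTION file
  vigs <- utils::vignette(package = pkg)$results # one row per vignette (may be empty)
  list(
    title     = desc$Title,
    version   = desc$Version,
    vignettes = unname(vigs[, "Item"])
  )
}

# "stats" ships with R, so this call works on any installation.
info <- describe_package("stats")
print(info$title)
print(info$version)
print(info$vignettes)
```

In an interactive session, help(package = "ggplot2") opens the package's help index, and vignette(package = "ggplot2") lists its vignettes.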

Installing packages from CRAN

RStudio provides an easy way to install packages. Go to the Packages pane and click Install. (Please make sure you are connected to the internet.)

In the Packages field, enter the package name. Make sure that you tick the Install dependencies check box. A package may depend on other packages: when you call a function in the package, that function may call functions in other packages, so those packages must be installed as well. The Install dependencies check box takes care of this requirement.

You can also install a package from the RStudio console using the install.packages() function. (Again, make sure you are connected to the internet.) For example, to install the ggplot2 package, enter the following code in the console:

install.packages("ggplot2", dependencies = TRUE)

You may be asked to select a CRAN mirror; just select ‘0-Cloud’ or a mirror near your location. The dependencies = TRUE argument ensures that the additional packages ggplot2 depends on are installed too, along with the optional packages it suggests.
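In scripts, it is often convenient to install a package only when it is missing, so that the script does not download anything unnecessarily. Here is one possible sketch; install_if_missing() is a hypothetical helper, and the repository URL is an assumption (the CRAN cloud mirror).

```r
# Hypothetical helper: install only the packages that are not yet available.
install_if_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  # requireNamespace() returns FALSE when a package cannot be found.
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!available]
  if (length(missing_pkgs) > 0) {
    install.packages(missing_pkgs, repos = repos)  # needs internet
  }
  invisible(missing_pkgs)  # names of the packages that had to be installed
}

# "stats" and "utils" ship with R, so nothing is downloaded here.
needed <- install_if_missing(c("stats", "utils"))
print(length(needed))

# install_if_missing("ggplot2")  # uncomment to install ggplot2 if it is missing
```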

Updating packages from CRAN

Packages are frequently updated to add new features or to fix bugs. Once a package is installed, its version stays fixed until you update it. Hence it is good practice to check once in a while whether a newer version is available on CRAN.

RStudio provides an Update button next to Install in the Packages pane. (Please make sure you are connected to the internet.) Select the package name and click Update to update the package.

You can update all the installed packages at once by using the update.packages() function in the RStudio console.

update.packages(ask=FALSE)

The ask=FALSE argument avoids having to confirm every package download, which can be tedious if you have many packages installed.
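Before updating, you may want to check which version you have and which packages are out of date. The sketch below uses "stats" so that it runs on any installation (replace it with "ggplot2" once that is installed). list_outdated() is a hypothetical helper, the repository URL is an assumption, and the version threshold is only illustrative.

```r
# Version of a package installed on this machine (no internet needed).
installed_version <- packageVersion("stats")
print(installed_version)

# Version objects can be compared with version strings;
# "3.0.0" is only an illustrative threshold.
is_recent <- installed_version >= "3.0.0"
print(is_recent)

# Hypothetical helper: names of installed packages that have a newer
# version in the repository. Calling it requires an internet connection.
list_outdated <- function(repos = "https://cloud.r-project.org") {
  outdated <- old.packages(repos = repos)  # NULL when everything is current
  if (is.null(outdated)) character(0) else rownames(outdated)
}

# list_outdated()  # uncomment to run when connected to the internet
```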

How to use the package

We use the library() function to load a package into the current R session so that its functions can be used. Packages are not loaded by default when you start RStudio; you need to load each package you want to use every time you start a new session. Let’s load the ggplot2 package. I assume that you have already installed ggplot2 on your machine.

To load the ggplot2 package, run the following command in the console pane.

library(ggplot2)

If, after running this code, a blinking cursor returns next to the > prompt, you have successfully loaded the ggplot2 package. However, if you get an error message that reads:

Error in library(ggplot2) : there is no package called ‘ggplot2’

it means you are attempting to load a package that is not installed.
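If you want a script to handle this situation gracefully instead of stopping, you can catch the error. The following sketch uses a made-up package name, notARealPackage123, purely to trigger the failure; the exact wording of the message can vary with your R version and language settings.

```r
# "notARealPackage123" is a made-up name used only to trigger the error.
pkg <- "notARealPackage123"

result <- tryCatch(
  {
    # character.only = TRUE lets us pass the package name as a string.
    library(pkg, character.only = TRUE)
    "loaded"
  },
  error = function(e) {
    # Return the error text instead of stopping the script.
    conditionMessage(e)
  }
)

print(result)
```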

After you load the package, you can try the following code, which displays a scatter plot of engine displacement (mapped to the x axis) against highway fuel economy (mapped to the y axis). Copy the code, paste it into the console, and press Enter.

ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

Sometimes it is useful to call a function without first using library(). If, for example, you will only use one or two functions in your script and don’t want to attach the whole package, you can access a function directly by writing the package name, two colons, and then the function name. Every object that comes from the package, including the mpg dataset, needs the prefix.

ggplot2::ggplot(ggplot2::mpg, ggplot2::aes(x = displ, y = hwy)) +
  ggplot2::geom_point()

Here the package is not attached with library(); each function is looked up in ggplot2 at the point where it is called.

One very common mistake new R users make is forgetting to load a package with the library() command we just saw. Remember: you have to load each package you want to use every time you start RStudio.
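If you are unsure whether a package is currently loaded, you can check the search path. The sketch below uses the tools package, which ships with R, so that it runs without ggplot2; is_attached() is a hypothetical helper. It also shows that calling a function with two colons makes the package's code available without attaching the package.

```r
# Hypothetical helper: TRUE if a package has been attached with library().
is_attached <- function(pkg) {
  paste0("package:", pkg) %in% search()
}

# "stats" is attached by default when R starts.
print(is_attached("stats"))

# Calling a function with :: works without library(tools).
ext <- tools::file_ext("plot.png")
print(ext)

# The namespace is now loaded, but whether "tools" is attached
# depends only on whether library(tools) was called in this session.
print(isNamespaceLoaded("tools"))
print(is_attached("tools"))
```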

Summary

In this article you learned how to extend the functionality of R using packages. You learned about the R package system:

  • How packages extend the functionality of base R
  • How packages can be installed on your machine
  • How to update packages to get new features and bug fixes
  • How to load packages in the current R session
