Graphics

R Foundations Course

Ella Kaye | Department of Statistics | University of Warwick

December 5, 2023

Overview

Plots in base R
ggplot2
Tables

Packages

library(ggplot2) # for later in the session
library(palmerpenguins)

Plots in base R

No frills

Base R graphics are useful for quick, exploratory “no-frills” plots.

(For anything better-looking or more complex, or where you want more control, use ggplot2.)

Boxplots

boxplot(penguins$body_mass_g)

with(penguins, boxplot(body_mass_g ~ species))

Histogram/Density

Histogram

hist(penguins$body_mass_g)

Density

plot(density(penguins$body_mass_g, na.rm = TRUE))
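The two views can also be combined. A minimal sketch, assuming you want the density curve drawn over the histogram; the variable name mass and the axis labels are illustrative choices, not part of the course material:

```r
library(palmerpenguins)

# Drop missing values once so hist() and density() see the same data
mass <- penguins$body_mass_g[!is.na(penguins$body_mass_g)]

# freq = FALSE puts the histogram on the density scale (bar areas sum to 1)
h <- hist(mass, freq = FALSE,
          main = "Penguin body mass", xlab = "Body mass (g)")

# lines() adds to the current plot instead of starting a new one
lines(density(mass), lwd = 2)
```

Without freq = FALSE the bars show counts, and the density curve would sit almost flat along the bottom of the plot.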

Scatterplots

Vectors

plot(1:10, 1:10)

Data frame

plot(bill_length_mm ~ bill_depth_mm, 
     data = penguins)

Plot methods

Many different objects in R have defined plot methods:

methods(plot)

 [1] plot,ANY-method     plot,color-method   plot.acf*          
 [4] plot.data.frame*    plot.decomposed.ts* plot.default       
 [7] plot.dendrogram*    plot.density*       plot.ecdf          
[10] plot.factor*        plot.formula*       plot.function      
[13] plot.ggplot*        plot.gtable*        plot.hcl_palettes* 
[16] plot.hclust*        plot.histogram*     plot.HoltWinters*  
[19] plot.isoreg*        plot.lm*            plot.medpolish*    
[22] plot.mlm*           plot.ppr*           plot.prcomp*       
[25] plot.princomp*      plot.profile.nls*   plot.R6*           
[28] plot.raster*        plot.spec*          plot.stepfun       
[31] plot.stl*           plot.table*         plot.trans*        
[34] plot.ts             plot.tskernel*      plot.TukeyHSD*     
see '?methods' for accessing help and source code

For example, if you call plot() on an object of class lm, R dispatches to plot.lm().
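To see the dispatch in action, here is a small sketch that defines a plot method for a made-up class. The class name daily_temps and its data are hypothetical and exist only for this illustration:

```r
# "daily_temps" is a made-up class used only for this illustration
temps <- structure(
  list(day = 1:7, temp = c(3, 5, 4, 6, 8, 7, 5)),
  class = "daily_temps"
)

# A plot method is a function named plot.<class>
plot.daily_temps <- function(x, ...) {
  # x$day is a plain integer vector, so this inner call uses plot.default()
  plot(x$day, x$temp, type = "b",
       xlab = "Day", ylab = "Temperature", ...)
  invisible(x)
}

# plot() looks at class(temps) and dispatches to plot.daily_temps()
plot(temps)
```

The same mechanism is what sends an lm object to plot.lm.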

Linear model diagnostic fits

fit <- lm(bill_length_mm ~ bill_depth_mm, 
          data = penguins, subset = species == "Gentoo")

par(mfrow=c(2,2)) # see all four plots together
plot(fit)
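par() settings stay in force for later plots until they are changed back. A sketch of one way to handle that, assuming you only want the first two diagnostic plots; old_par is an illustrative name:

```r
library(palmerpenguins)

fit <- lm(bill_length_mm ~ bill_depth_mm,
          data = penguins, subset = species == "Gentoo")

# par() returns the previous settings, so they can be restored afterwards
old_par <- par(mfrow = c(1, 2))

# which = 1:2 requests residuals vs fitted and the normal Q-Q plot only
plot(fit, which = 1:2)

par(old_par) # back to one plot per page
```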

Your turn!

From the starting point of plot(1:10, 1:10), experiment with the arguments type and pch. See ?plot for type and ?points for pch.

Can you create a plot with triangular points linked by lines?

Can you do the same with the lines() function? What are the similarities and differences?

ggplot2

Intro to ggplot2

From https://ggplot2.tidyverse.org:

R has several systems for making graphs, but ggplot2 is one of the most elegant and most versatile. ggplot2 implements the grammar of graphics, a coherent system for describing and building graphs. With ggplot2, you can do more faster by learning one system and applying it in many places.

From https://r4ds.had.co.nz/data-visualisation.html:

You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

ggplot2 usage

ggplot2 is part of the tidyverse.
It has been around for over 10 years and is used by hundreds of thousands of people.
It can take some getting used to, but is worth the investment to learn properly.

ggplot2 key components

Every ggplot2 plot has three key components:

Data (typically in a data frame),
A set of aesthetic mappings between variables in the data and visual properties, and
At least one layer which describes how to render each observation. Layers are usually created with a geom_ function.

Cake!

Image credit: Tanya Shapiro

Initiate with data

Package is ggplot2 but function is ggplot()

ggplot(penguins)

Add aesthetics

ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm))

Add points

Layers are added with + (not %>% or |>)

ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point()
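Because + returns a plot object, a plot can be stored and built up in steps. A minimal sketch; the names p and p_points are illustrative:

```r
library(ggplot2)
library(palmerpenguins)

# Data and aesthetic mappings only: no layers yet, so no points are drawn
p <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm))

# + returns a new plot object; p itself is unchanged
p_points <- p + geom_point()

p_points # printing the object draws the plot
```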

Careful what goes in `aes()`

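# Mapped: inside aes(), "blue" is treated as a data value, not a colour.
# The points get a default scale colour and a legend labelled "blue".
ggplot(penguins, 
       aes(x = bill_length_mm, 
           y = bill_depth_mm)) +
  geom_point(aes(color = "blue"))

# Set: outside aes(), color = "blue" is a fixed colour for every point.
ggplot(penguins, 
       aes(x = bill_length_mm, 
           y = bill_depth_mm)) +
  geom_point(color = "blue")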

When you do want to map a colour to data

ggplot(data = penguins,
       aes(x = bill_length_mm,
           y = bill_depth_mm,
           group = species)) +
  geom_point(aes(color = species, 
                 shape = species),
             size = 3,
             alpha = 0.8)

Note that color and shape are inside aes() but size and alpha are outside.

Add additional geoms

ggplot(data = penguins,
       aes(x = bill_length_mm,
           y = bill_depth_mm,
           group = species)) +
  geom_point(aes(color = species, 
                 shape = species),
             size = 3,
             alpha = 0.8) +
  geom_smooth(method = "lm", aes(color = species))

Scales in ggplot2

Scales in ggplot2 control the mapping from data to aesthetics. They take your data and turn it into something that you can see, like size, colour, position or shape. They also provide the tools that let you interpret the plot: the axes and legends.

Three groups of scales:

position scales and axes
colour scales and legends
scales for other aesthetics
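As a rough illustration of the first two groups, the sketch below adjusts one position scale and one colour scale. The axis title and break positions are arbitrary choices made for this example:

```r
library(ggplot2)
library(palmerpenguins)

p_scales <- ggplot(penguins,
                   aes(x = bill_length_mm, y = bill_depth_mm,
                       color = species)) +
  geom_point() +
  # position scale: choose the axis title and where the breaks fall
  scale_x_continuous(name = "Bill length (mm)",
                     breaks = seq(30, 60, by = 10)) +
  # colour scale: a named vector ties each species to a colour explicitly
  scale_color_manual(values = c(Adelie = "darkorange",
                                Chinstrap = "purple",
                                Gentoo = "cyan4"))

p_scales
```

Naming the values means the colours do not depend on the order of the factor levels.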

Add a colour scale

ggplot(data = penguins,
       aes(x = bill_length_mm,
           y = bill_depth_mm,
           group = species)) +
  geom_point(aes(color = species, 
                 shape = species),
             size = 3,
             alpha = 0.8) +
  geom_smooth(method = "lm", aes(color = species)) +
  scale_color_manual(values = c("darkorange","purple","cyan4"))

Facets

ggplot(data = penguins,
       aes(x = bill_length_mm,
           y = bill_depth_mm,
           group = species)) +
  geom_point(aes(color = species, 
                 shape = species),
             size = 3,
             alpha = 0.8) +
  geom_smooth(method = "lm", aes(color = species)) +
  scale_color_manual(values = c("darkorange","purple","cyan4")) +
  facet_wrap(~species)

ggplot(data = penguins,
       aes(x = bill_length_mm,
           y = bill_depth_mm,
           group = species)) +
  geom_point(aes(color = species, 
                 shape = species),
             size = 3,
             alpha = 0.8) +
  geom_smooth(method = "lm", aes(color = species)) +
  scale_color_manual(values = c("darkorange","purple","cyan4")) +
  facet_wrap(~species, scales = "free_x")
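facet_wrap() splits by one variable and wraps the panels. For two variables, facet_grid() is an alternative. A sketch, assuming you want islands in rows and species in columns:

```r
library(ggplot2)
library(palmerpenguins)

p_grid <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(alpha = 0.8) +
  # rows by island, columns by species; a panel is empty where
  # that species was not recorded on that island
  facet_grid(rows = vars(island), cols = vars(species))

p_grid
```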

Add theme elements

ggplot(data = penguins,
       aes(x = bill_length_mm,
           y = bill_depth_mm,
           group = species)) +
  geom_point(aes(color = species, 
                 shape = species),
             size = 3,
             alpha = 0.8) +
  geom_smooth(method = "lm", aes(color = species)) +
  scale_color_manual(values = c("darkorange","purple","cyan4")) +
  labs(title = "Penguin bill dimensions",
       x = "bill length (mm)",
       y = "bill depth (mm)") +
  theme_minimal() + 
  theme(plot.title.position = "plot",
        text = element_text(size = 20))

Your turn!

Recreate the base R plots from the first part of this session in ggplot2.

You may find the list of available geoms (and their help pages) useful:

https://ggplot2.tidyverse.org/reference/index.html#layers

Boxplot 1

ggplot(penguins) +
  geom_boxplot(aes(y = body_mass_g))

Notes

aes() can be defined for the whole plot or in the geom
first arguments to aes() are x and y (don’t need to name them if using them in that order)
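A short sketch of the first note: both versions below should draw the same boxplot. The difference only matters once more layers are added, since layers inherit the plot-level mapping but not each other's. The names p_global and p_local are illustrative:

```r
library(ggplot2)
library(palmerpenguins)

# Mapping set once for the whole plot: every layer inherits it
p_global <- ggplot(penguins, aes(species, body_mass_g)) +
  geom_boxplot()

# Mapping set in the layer: only this geom uses it
p_local <- ggplot(penguins) +
  geom_boxplot(aes(species, body_mass_g))
```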

Boxplot 2

ggplot(penguins) +
  geom_boxplot(aes(species, body_mass_g))

Histogram

ggplot(penguins) +
  geom_histogram(aes(body_mass_g), 
                 binwidth = 500)

Density

ggplot(penguins) +
  geom_density(aes(body_mass_g))

Scatterplot with vectors

ggplot(data = NULL, aes(x = 1:10, y = 1:10)) +
  geom_point()

Extensions

See extensions at https://exts.ggplot2.tidyverse.org/gallery/

Inspiration

R can be used to make incredible data visualisations.

Check out the galleries of these data viz practitioners working with ggplot2:

Also, #TidyTuesday on Mastodon is a great source for further inspiration

ggplot2 resources

R for Data Science book: Chapter 3 (Data Visualisation) and Chapter 28 (Graphics for Communication), to get up and running quickly
ggplot2 book, for an in-depth understanding
Plotting anything with ggplot2 webinar with Thomas Lin Pedersen (one of the main ggplot2 authors)
R graphics cookbook, a practical guide that provides more than 150 recipes to help you generate high-quality graphs quickly
Cedric Scherer’s tutorial
Cedric Scherer’s ‘Engaging and Beautiful Data Visualizations with ggplot2’ workshop
ggplot2 reference

Data visualisation resources

Books about creating good data viz:

Plots in RStudio

Viewing and saving plots in RStudio

In RStudio, graphs are displayed in the Plots window. The plot is sized to fit the window and will be rescaled if the size of the window is changed.

Back and forward arrows allow you to navigate through graphs that have been plotted.

Plots can be saved in various formats using the Export drop down menu, which also has an option to copy to the clipboard.
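Plots can also be saved from code, which is easier to reproduce than the Export menu. A sketch under the assumption that a PDF in a temporary directory is acceptable; the file names and the 6 x 4 inch size are arbitrary:

```r
library(ggplot2)
library(palmerpenguins)

p <- ggplot(penguins, aes(bill_length_mm, bill_depth_mm)) +
  geom_point()

# ggsave() picks the device from the file extension; sizes are in inches by default
gg_file <- file.path(tempdir(), "penguins-ggplot.pdf")
ggsave(gg_file, plot = p, width = 6, height = 4)

# Base R: open a file device, draw, then close it with dev.off()
base_file <- file.path(tempdir(), "penguins-base.pdf")
pdf(base_file, width = 6, height = 4)
plot(bill_length_mm ~ bill_depth_mm, data = penguins)
dev.off()
```

Unlike the Plots window, the saved size here is fixed by width and height rather than by the size of the pane.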

DEMO

Tables

Getting started with tables

We’re just going to scratch the surface of this today.

We’ll be using the gt and gtsummary packages, but there are many others.

Here’s a good overview of many different packages.

gt

gt is an R package to create tables. It provides a grammar of tables.

The gt philosophy: we can construct a wide variety of useful tables with a cohesive set of table parts. It all begins with table data (be it a tibble or a data frame). You then decide how to compose your gt table with the elements and formatting you need for the task at hand. Finally, the table is rendered by printing it at the console, including it in an R Markdown document, or exporting to a file using gtsave()
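A minimal sketch of that workflow, using the penguins data from earlier in the session. The title, subtitle and column labels are made up for this example:

```r
library(gt)
library(palmerpenguins)

# Start from table data, then compose the table part by part
penguin_tbl <- head(penguins[, c("species", "island", "body_mass_g")]) |>
  gt() |>
  tab_header(title = "Palmer penguins", subtitle = "First six rows") |>
  cols_label(species = "Species", island = "Island",
             body_mass_g = "Body mass (g)") |>
  fmt_number(columns = body_mass_g, decimals = 0)

penguin_tbl # printing renders the table
```

To export instead of printing, something like gtsave(penguin_tbl, "penguins.html") should write the table to a file.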

Parts of a gt table

From https://gt.rstudio.com

An example

From Albert Rapp’s gt book

Resources

See the article Case Study: gtcars for a thorough example of gt’s capabilities.

Guidelines for better tables

Having the technical know-how to code tables is one thing; making them look good and easy for the reader to take in is another!

Highly recommend this Tom Mock guide, based on Jon Schwabish’s original. It covers guidelines for making better tables, and shows how to implement them in gt. It demonstrates even more of what gt can do than the article on the previous slide.

gtsummary

gtsummary extends the gt package and is used for creating summary tables of data and of statistical models.

gtsummary example 1: data

library(gtsummary)
# make dataset with a few variables to summarize
trial2 <- trial |> select(age, grade, response, trt)

head(trial2)

# A tibble: 6 × 4
    age grade response trt   
  <dbl> <fct>    <int> <chr> 
1    23 II           0 Drug A
2     9 I            1 Drug B
3    31 II           0 Drug A
4    NA III          1 Drug A
5    51 III          1 Drug A
6    39 I            0 Drug B

gtsummary example 1: code

# summarize and augment the data
summary_table <- 
  tbl_summary(
    trial2,
    by = trt, # split table by group
    missing = "no" # don't list missing data separately
  )  |> 
  add_n() |> # add column with total number of non-missing observations
  add_p() |> # test for a difference between groups
  modify_header(label = "**Variable**") |> # update the column header
  bold_labels()
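Since gtsummary builds on gt, a summary table can be converted to a gt table and then styled with gt functions. A hedged sketch with a smaller table than the one above; the title text is invented for this example:

```r
library(gtsummary)

summary_gt <- tbl_summary(trial, include = c(age, grade), by = trt) |>
  as_gt() |> # convert the gtsummary object to a gt table
  gt::tab_header(title = "Patient characteristics by treatment")

summary_gt
```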

gtsummary example 1: output

Variable	N	Drug A, N = 98¹	Drug B, N = 102¹	p-value²
Age	189	46 (37, 59)	48 (39, 56)	0.7
Grade	200			0.9
I		35 (36%)	33 (32%)
II		32 (33%)	36 (35%)
III		31 (32%)	33 (32%)
Tumor Response	193	28 (29%)	33 (34%)	0.5
¹ Median (IQR); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

gtsummary example 2: code

mod1 <- glm(response ~ trt + age + grade, trial, family = binomial)

regression_tab <- tbl_regression(mod1, exponentiate = TRUE)
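exponentiate = TRUE reports the coefficients as odds ratios rather than log-odds. The sketch below does the same transformation by hand in base R. The Wald intervals from confint.default() are an assumption made for illustration and need not match the intervals tbl_regression() reports:

```r
library(gtsummary) # provides the trial data set

mod1 <- glm(response ~ trt + age + grade, trial, family = binomial)

# Coefficients are on the log-odds scale; exponentiating gives odds ratios
odds_ratios <- exp(coef(mod1))

# Wald intervals, exponentiated the same way
or_ci <- exp(confint.default(mod1))

round(cbind(OR = odds_ratios, or_ci), 2)
```

An odds ratio above 1 means higher odds of response relative to the reference level (or per unit increase, for age).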

gtsummary example 2: output

Characteristic	OR¹	95% CI¹	p-value
Chemotherapy Treatment
Drug A	—	—
Drug B	1.13	0.60, 2.13	0.7
Age	1.02	1.00, 1.04	0.10
Grade
I	—	—
II	0.85	0.39, 1.85	0.7
III	1.01	0.47, 2.15	>0.9
¹ OR = Odds Ratio, CI = Confidence Interval

Table inspiration

The winners of the RStudio Table Contest

2022
2021
2020, also has links to tutorials

End matter

License

Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
