R Programming 2025
March 25, 2025
Sometimes you reach the limits of R:
In this case it can make sense to code parts in C++.
There are some typical scenarios where C++ is likely to be a good idea
To use C++, you need a working C++ compiler.
On MacOS/Windows there is some setup to do, but it will also set you up to
On Linux, you will usually have a C++ compiler installed, but you might as well get set up to develop R packages too.
Note
If you’ve followed the development tools section of the course prerequisites, you should be fully set-up for this already.
Debian/Ubuntu:
Fedora/RedHat: should be set up already.
Option 1
Then, in the terminal:
Option 2
Rtools.exe, keeping the default settings.Consider an R function add_r() to add two numbers
Here’s how we might write an equivalent add_cpp() function in C++
return;To use add_cpp() in R we need to compile the C++ code and construct an R function that connects to the compiled C++ function.
The R package Rcpp takes care of these steps for us.
One way is to use the cppFunction(), e.g.
After defining add_cpp() via cppFunction(), add_cpp() is available to use as a R function
It is better to define functions in C++ files (extension .cpp). These files will be recognised by RStudio and other IDEs, with the usual benefits.
The C++ file should have these lines at the top:
Rcpp::.Above each function we want to use in R, add // [[Rcpp::export]]
The following is in the file add_cpp2.cpp
sourceCpp()Now we can use sourceCpp() to make the C++ functions available in R
There are a number of benefits to writing C++ code in separate .cpp files, compared to cppFunction()
{ } must be terminated by a ;.= for assignment (<- is not valid).+, -, *, /).==, !=, >, etc)//./* */ delimitersThe basic C++ data types are scalars. Rcpp provides vector versions
| R | C++ (scalar) | Rcpp (vector) |
|---|---|---|
| numeric | double | NumericVector |
| integer | int | IntegerVector |
| character | char | CharacterVector |
| logical | bool | LogicalVector |
Rcpp also provides String as an alternative to char
A C++ for loop has the form
for (initialisation; condition; increment)
i at zeroi is less than ni by 1 after each iteration
++i is equivalent to i = i + 1total += x[i] is equivalent to total = total + x[i]for loop (vector input, scalar output).size() method to find the length of a vectorwhile loop (vector input, scalar output)break to break from a while or for loopcontinue to skip to the next iteration (vs next in R)The following computes the Euclidean distances
\[d_i = \sqrt{(x - y_i)^2}\]
dist(n) is used to create a numeric vector named dist of length n.v(n) would create a vector named v.pow is a standard C++ function for computing a value raised to a power.
Both pow and sqrt are functions from the <cmath> library, see e.g. w3schools C++ math.
To use <cmath> functions in C++ code, we would normally need to include the <cmath> header in our .cpp file. However, Rcpp defines its own version of these functions, so we can use them with just the Rcpp header.
From the menu bar:
//[Rcpp::export]You can also create a C++ file from the new file drop-down in the Files pane, but this will be blank.
sourceCpp() to source in your function.bench::mark() to compare wmean_r(), wmean_cpp() and the stats function weighted.mean().C++ data types do not handle NAs in input well
int: use a length 1 IntegerVector insteaddouble: NAs okay (converted to NAN)char: use String insteadbool: NAs converted to true; use int insteadRcpp vectors handle NAs in the corresponding type
| Rcpp (vector) | NA |
|---|---|
NumericVector |
NA_REAL |
IntegerVector |
NA_INTEGER |
CharacterVector |
NA_STRING |
LogicalVector |
NA_LOGICAL |
Each vector type has a corresponding matrix equivalent: NumericMatrix, IntegerMatrix, CharacterMatrix and LogicalMatrix.
Create a matrix called named m1
(), e.g. m1(3, 2) for the value in row 3, column 2.m1(0, 0)..nrow and .ncol methods to get the number of rows and columnsAs in R, Rcpp provides d/p/q/r functions for the density, distribution function, quantile function and random generation.
Rcpp:: namespace return a vectorR:: namespace (also provided by the Rcpp R package) return a scalarFor example we can sample a single value from a Gamma distribution by extracting the first element from a vector
Or use the R::rgamma() function to sample a single value directly
In a new C++ file, convert the following Gibbs sampler to C++
Create a wrapper function to set the seed as follows:
Benchmark your gibbs_r() and gibbs_c() functions with N = 100 and thin = 10, using your wrapper function to set the seed.
Rcpp provides some “syntactic sugar” to allow us to write C++ code that is more like R code.
One example is operating on rows or columns of matrices. So far we have seen how to update individual elements of a NumericMatrix. Rcpp lets us extract and update whole rows/columns, e.g. in a matrix with three columns, we can set the \(i\)th row with
A whole column would be extracted with mat(_, j).
The vectorized random generation functions are another example of Rcpp sugar.
Rcpp provide many more vectorized functions, for example:
+, -, *, \)==, !, =<)sqrt, pow, …)mean, median, )In addition, Rcpp provides many R-like functions, such as which_max or rowSums, see Unofficial API documentation for a full list.
Recall our distance function from earlier:
With Rcpp vectorization, we can simplify this to:
This example combines row-indexing and a vectorized function, max().
The following R function can be used to simulate the value of \(\pi\):
Convert this to C++, taking advantage of the vectorized Rcpp functions.
What happens if you use 4 instead of 4.0 here?
Similar scope to this module: Gillespie, C and Lovelace, R, Efficient R programming, Rcpp section, https://csgillespie.github.io/efficientR/performance.html#rcpp
Going a bit further: Wickham, H, Advanced R (2nd edn), Rewriting R code in C++ chapter, https://adv-r.hadley.nz/rcpp.html
Not very polished, but broader coverage of Rcpp functionality: Tsuda, M.E., Rcpp for everyone, https://teuder.github.io/rcpp4everyone_en/300_Rmath.html
The Rcpp vignettes, accessed via browseVignettes("Rcpp") or CRAN
The unofficial Rcpp API documentation
Case studies (optimising by improving R code and/or using C++)
Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).