Advanced R
June 19, 2023
Generic functions provide a unified interface to methods for objects of a particular class, e.g.
Adelie Chinstrap Gentoo
152 68 124
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
172 190 197 201 213 231 2
Here, we use the same function, summary(), on objects of classes factor and integer and get different output for each.
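A minimal sketch of the same idea, using a small made-up factor and integer vector (the values are illustrative and are not the data behind the output above):

```r
# Illustrative data (hypothetical, not the dataset summarised above)
species <- factor(c("Adelie", "Gentoo", "Adelie", "Chinstrap"))
flipper <- c(181L, 217L, NA, 195L)

class(species)  # "factor"
class(flipper)  # "integer"

# The same generic gives counts per level for a factor ...
summary(species)
# ... and quartiles, mean and an NA count for an integer vector
summary(flipper)
```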
summary() could contain several if-else statements, but
There are 3 main OOP systems in use:
A new OOP system, S7, is in development as a successor to S3 and S4.
An S3 object has a "class" attribute:
With unclass() we obtain the underlying object, here an integer vector
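As a small sketch of these two points, assuming a made-up factor `f` (not an object from the original slides):

```r
f <- factor(c("b", "a", "b"))

attributes(f)     # a levels attribute and a class attribute
attr(f, "class")  # "factor"

# unclass() drops the class attribute, leaving the integer codes
# (the levels attribute is kept)
u <- unclass(f)
u
is.integer(u)     # TRUE
```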
structure()
You can use structure() to define an S3 object with a class attribute:
$pi
[1] 3.14
$dp
[1] 2
attr(,"class")
[1] "pi_trunc"
Potentially further attributes can be added at the same time, but typically we would use a list to return all the required values.
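Output like that shown above could come from a constructor along the following lines. This is only a sketch: the function name `pi_trunc()` and its `dp` argument are assumptions chosen to match the class and component names in the output.

```r
# Hypothetical constructor: truncate (not round) pi to dp decimal places
pi_trunc <- function(dp = 2) {
  value <- trunc(pi * 10^dp) / 10^dp
  # return a list carrying all required values, tagged with an S3 class
  structure(list(pi = value, dp = dp), class = "pi_trunc")
}

x <- pi_trunc(2)
x  # no print method for "pi_trunc" yet, so the default printout is used
```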
class()
Alternatively, we can add a class attribute using the class() helper function:
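For instance, a sketch that builds the list first and assigns the class afterwards (the object `x` is illustrative):

```r
x <- list(pi = 3.14, dp = 2)
class(x) <- "pi_trunc"   # replacement form of class()

class(x)                 # "pi_trunc"
inherits(x, "pi_trunc")  # TRUE
```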
S3 generic functions are simple wrappers to UseMethod()
UseMethod()
The UseMethod() function takes care of method dispatch: selecting the S3 method according to the class of the object passed as the first argument.
[1] "factor"
[1] Adelie Adelie Adelie
Levels: Adelie Chinstrap Gentoo
Here print() dispatches to the method print.factor().
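To see the mechanism in isolation, here is a sketch with a hypothetical generic `describe()` (not part of base R) and two methods:

```r
# A generic is just a thin wrapper around UseMethod()
describe <- function(x, ...) UseMethod("describe")

# Methods are ordinary functions named generic.class
describe.factor <- function(x, ...) {
  paste("factor with", nlevels(x), "levels")
}
describe.default <- function(x, ...) {
  paste("object of class", class(x)[1])
}

describe(factor(c("a", "b")))  # dispatches to describe.factor()
describe(1:3)                  # no integer method, so describe.default()
```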
The class of an S3 object can be a vector of classes
We say fit is a "glm" object that inherits from class "lm".
The inherits() function can be used to test if an object inherits from a given class
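A sketch using the built-in mtcars data; the model itself is an arbitrary choice for illustration and is not the fit used in the original slides:

```r
# Gaussian GLM on a built-in dataset (illustrative model only)
fit <- glm(mpg ~ wt, data = mtcars)

class(fit)            # "glm" "lm"
inherits(fit, "lm")   # TRUE
inherits(fit, "glm")  # TRUE
inherits(fit, "nls")  # FALSE

# which = TRUE reports the position of each class in class(fit), 0 if absent
inherits(fit, c("nls", "lm"), which = TRUE)
```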
An S3 object can have more than one class e.g.
UseMethod() works along the vector of classes (from the first class to the last), looks for a method for each class and dispatches to the first method it finds.
If no method is defined for any of the classes, the default method is used, e.g. print.default().
If there is no default, an error is thrown.
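The dispatch order can be sketched with a hypothetical generic `speak()` and made-up classes "dog", "cat" and "animal":

```r
speak <- function(x, ...) UseMethod("speak")
speak.animal <- function(x, ...) "generic animal noise"
speak.dog <- function(x, ...) "woof"

rex <- structure(list(), class = c("dog", "animal"))
tom <- structure(list(), class = c("cat", "animal"))

speak(rex)  # speak.dog() is found first
speak(tom)  # no speak.cat(), so dispatch moves on to speak.animal()

# No method for the implicit class of 1L and no speak.default(): an error
res <- tryCatch(speak(1L), error = function(e) conditionMessage(e))
res
```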
See the methods for a given S3 class
[1] anova coef confint deviance df.residual fitted
[7] formula logLik nobs predict print profile
[13] residuals summary vcov weights
see '?methods' for accessing help and source code
See the methods for a given generic function
[1] coef.aov* coef.Arima* coef.default* coef.listof* coef.maov*
[6] coef.nls*
see '?methods' for accessing help and source code
Asterisked methods are not exported.
S3 methods need not be in the same package as the generic.
Find an unexported method with getS3method()
function (object, complete = TRUE, ...)
{
cf <- object$coefficients
if (complete)
cf
else cf[!is.na(cf)]
}
<bytecode: 0x1157fcfc8>
<environment: namespace:stats>
In code, call the generic, rather than calling the method directly.
The arguments of a new method should be a superset of the arguments of the generic
New methods have the name format generic.class:
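As a sketch, a print method for the "pi_trunc" class defined earlier might look like this (the wording of the printed message is an assumption):

```r
x <- structure(list(pi = 3.14, dp = 2), class = "pi_trunc")

# The print() generic has arguments (x, ...), so the method keeps both
print.pi_trunc <- function(x, ...) {
  cat("pi truncated to", x$dp, "decimal places:", x$pi, "\n")
  invisible(x)  # print methods conventionally return their input invisibly
}

# Call the generic, not print.pi_trunc() directly
print(x)
```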
NextMethod()
We can explicitly call the next method that would be called by UseMethod() to reuse code whilst customising as required
It is possible to call NextMethod() with arguments, but it is safer to call the generic again with new arguments in this case.
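A minimal sketch of reusing a parent method via NextMethod(), again with a hypothetical `speak()` generic and made-up classes:

```r
speak <- function(x, ...) UseMethod("speak")
speak.animal <- function(x, ...) "makes a noise"

speak.dog <- function(x, ...) {
  base <- NextMethod()  # runs the method for the next class, speak.animal()
  paste(base, "(specifically: woof)")
}

rex <- structure(list(), class = c("dog", "animal"))
speak(rex)
```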
is.object() can be used to find out if an object has a class (S3/S4/R6)
An object that does not have an explicit class has an implicit class that will be used for S3 method dispatch. The implicit class can be found with .class2()
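For example, a sketch with a plain integer matrix (this assumes R >= 4.0.0, where .class2() is available and matrices also report class "array"):

```r
m <- matrix(1:4, nrow = 2)

is.object(m)            # FALSE: no class attribute is set
is.object(factor("a"))  # TRUE
attr(m, "class")        # NULL

class(m)    # implicit class as reported by class()
.class2(m)  # full implicit class vector used for S3 dispatch
```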
We can take advantage of existing S3 methods by returning an object of an existing S3 class or an implicit class, using attributes to add custom information
[1] "matrix" "array"
V1 V2
Min. :-2 Min. :-4
1st Qu.:-1 1st Qu.:-2
Median : 0 Median : 0
Mean : 0 Mean : 0
3rd Qu.: 1 3rd Qu.: 2
Max. : 2 Max. : 4
This can avoid the need to define new classes and methods, in simple cases.
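One way to get this behaviour is sketched below. The function `centre_cols()` and the attribute name "col_means" are hypothetical and are not taken from the original slides.

```r
# Hypothetical helper: centre the columns of a matrix and keep the
# original column means as an attribute
centre_cols <- function(x) {
  x <- as.matrix(x)
  means <- colMeans(x)
  out <- sweep(x, 2, means)        # subtract each column's mean
  attr(out, "col_means") <- means  # custom information
  out
}

z <- centre_cols(cbind(V1 = 1:5, V2 = c(2, 4, 6, 8, 10)))

class(z)              # still an ordinary matrix
summary(z)            # existing summary method for matrices is used
attr(z, "col_means")  # the extra information travels with the object
```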
y and an explanatory variable x, that returns an object of a new class "ols", that inherits from "lm".
Note: I have set options(digits = 4) to limit the number of digits printed by default throughout this presentation (default is 7).
ols class that uses NextMethod() to compute the usual lm summary, but return an object of class "summary.ols".
"summary.ols" which works as follows:
S4 methods
The methods package provides the functions required to use S4 classes and methods, so always load this package when using S4.
An S4 class can be defined with setClass(), with at least two arguments
S4 class names use UpperCamelCase by convention.
ANY allows a slot to accept any type of object.
A new instance of the S4 object can be created using new()
florence <- new("Person",
                name = "Florence Nightingale",
                date_of_birth = as.Date("1820-05-12"),
                date_of_death = as.Date("1910-08-13"),
                age_at_death = 90)
str(florence)
Formal class 'Person' [package ".GlobalEnv"] with 4 slots
..@ name : chr "Florence Nightingale"
..@ date_of_birth: Date[1:1], format: "1820-05-12"
..@ date_of_death: Date[1:1], format: "1910-08-13"
..@ age_at_death : num 90
Note that the argument names in new(), from the second onwards, are the names in the vector passed to the slots argument when defining the class.
Find the type of S4 class
Extract the value of a slot (use @)
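A class definition consistent with the str() output above might look like the following sketch. The slot types are read off that output; the exact definition used in the original slides is not shown, so treat this as an assumption.

```r
library(methods)

# Assumed definition: slot names and types inferred from str(florence)
setClass("Person",
         slots = c(name = "character",
                   date_of_birth = "Date",
                   date_of_death = "Date",
                   age_at_death = "numeric"))

florence <- new("Person",
                name = "Florence Nightingale",
                date_of_birth = as.Date("1820-05-12"),
                date_of_death = as.Date("1910-08-13"),
                age_at_death = 90)

is(florence)                     # class (and any superclasses) of the object
isS4(florence)                   # TRUE
florence@name                    # extract a slot with @
slot(florence, "date_of_birth")  # functional equivalent of @
```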
The prototype argument can be used to specify default values, enabling partial specification
Be sure to use list(), not c(), for prototype. This is an easy mistake to make!
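A sketch of a prototype for the Person class, assuming NA defaults for every slot (the specific defaults are an assumption). list() matters here because c() would coerce the mixed character, Date and numeric values to a single atomic type.

```r
library(methods)

setClass("Person",
         slots = c(name = "character",
                   date_of_birth = "Date",
                   date_of_death = "Date",
                   age_at_death = "numeric"),
         # list(), not c(): each default keeps its own type
         prototype = list(name = NA_character_,
                          date_of_birth = as.Date(NA),
                          date_of_death = as.Date(NA),
                          age_at_death = NA_real_))

# Partial specification: unspecified slots take the prototype values
p <- new("Person", name = "Florence Nightingale")
str(p)
```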
initialize()
An initialize() method can be used for more control over initialization
setMethod("initialize", "Person",
  function(.Object, ...) {
    # initialize with default method
    # (named arguments override defaults)
    .Object <- callNextMethod(.Object, ...)
    # compute age at death if not specified
    year <- function(x) as.numeric(format(x, "%Y"))
    m_day <- function(x) as.numeric(format(x, "%m%d"))
    if (is.na(.Object@age_at_death)) {
      n_year <- year(.Object@date_of_death) - year(.Object@date_of_birth)
      birthday <- m_day(.Object@date_of_death) >= m_day(.Object@date_of_birth)
      .Object@age_at_death <- n_year - !birthday
    }
    .Object
  })
florence <- new("Person",
                name = "Florence Nightingale",
                date_of_birth = as.Date("1820-05-12"))
str(florence)
Formal class 'Person' [package ".GlobalEnv"] with 4 slots
..@ name : chr "Florence Nightingale"
..@ date_of_birth: Date[1:1], format: "1820-05-12"
..@ date_of_death: Date[1:1], format: NA
..@ age_at_death : num NA
Formal class 'Person' [package ".GlobalEnv"] with 4 slots
..@ name : chr "Florence Nightingale"
..@ date_of_birth: Date[1:1], format: "1820-05-12"
..@ date_of_death: Date[1:1], format: "1910-08-13"
..@ age_at_death : num 90
The contains argument to setClass() specifies a class or classes to inherit slots and behaviour from
Creating a new instance of the subclass will fill in the slots of the superclass
seriesD_10GBP <- new("BanknoteCharacter",
                     name = "Florence Nightingale",
                     date_of_birth = as.Date("1820-05-12"),
                     date_of_death = as.Date("1910-08-13"))
str(seriesD_10GBP)
Formal class 'BanknoteCharacter' [package ".GlobalEnv"] with 7 slots
..@ denomination : num NA
..@ first_issue : Date[1:1], format: NA
..@ last_legal : Date[1:1], format: NA
..@ name : chr "Florence Nightingale"
..@ date_of_birth: Date[1:1], format: "1820-05-12"
..@ date_of_death: Date[1:1], format: "1910-08-13"
..@ age_at_death : num 90
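A subclass definition consistent with the str() output above could be sketched as follows. The slot types and NA prototypes are inferred from that output and are assumptions; the initialize() method that computes the age is left out for brevity.

```r
library(methods)

setClass("Person",
         slots = c(name = "character",
                   date_of_birth = "Date",
                   date_of_death = "Date",
                   age_at_death = "numeric"),
         prototype = list(name = NA_character_,
                          date_of_birth = as.Date(NA),
                          date_of_death = as.Date(NA),
                          age_at_death = NA_real_))

# Assumed subclass: three extra slots on top of those inherited from Person
setClass("BanknoteCharacter",
         contains = "Person",
         slots = c(denomination = "numeric",
                   first_issue = "Date",
                   last_legal = "Date"),
         prototype = list(denomination = NA_real_,
                          first_issue = as.Date(NA),
                          last_legal = as.Date(NA)))

note <- new("BanknoteCharacter",
            name = "Florence Nightingale",
            date_of_birth = as.Date("1820-05-12"))

is(note)            # the class followed by its superclass
is(note, "Person")  # TRUE
slotNames(note)     # own slots first, then inherited slots
```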
Use showClass() to show (print) an S4 Class
If a user is to create these objects, define a helper function named by the class
Use setValidity() to check constraints beyond data type, e.g. that all slots have the same length
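The helper and validity ideas can be sketched together on a cut-down Person class with two slots. The class definition, the helper's arguments and the error message are all assumptions made for illustration.

```r
library(methods)

# Cut-down Person class (two slots only, for brevity)
setClass("Person",
         slots = c(name = "character",
                   date_of_birth = "Date"))

# Validity: a constraint that slot types alone cannot express
setValidity("Person", function(object) {
  if (length(object@name) != length(object@date_of_birth)) {
    "@name and @date_of_birth must have the same length"
  } else {
    TRUE
  }
})

# Helper named after the class: coerces user input, then calls new()
Person <- function(name, date_of_birth) {
  new("Person",
      name = as.character(name),
      date_of_birth = as.Date(date_of_birth))
}

florence <- Person("Florence Nightingale", "1820-05-12")
showClass("Person")

# Mismatched lengths fail the validity check inside new()
msg <- tryCatch(
  Person(c("A", "B"), "1820-05-12"),
  error = function(e) conditionMessage(e)
)
msg
```

Keeping coercion in the helper means new() only ever sees correctly typed values, while the validity function guards against inconsistent objects however they are created.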
S4 generic functions are (usually) a wrapper to standardGeneric(), e.g.
standardGeneric for "kronecker" defined from package "base"
function (X, Y, FUN = "*", make.dimnames = FALSE, ...)
standardGeneric("kronecker")
<bytecode: 0x125479598>
<environment: 0x12546c7e0>
Methods may be defined for arguments: X, Y, FUN, make.dimnames
Use showMethods(kronecker) for currently available ones.
By default, all arguments apart from ... are used for method dispatch.
Use setGeneric() to define a new generic, with the optional signature argument to specify the arguments to use for method dispatch
setGeneric("myGeneric",
  function(x, ..., verbose = TRUE) standardGeneric("myGeneric"),
  signature = "x"
)
[1] "myGeneric"
Do not use {} in the function definition here.
S4 generics use lowerCamelCase names by convention.
S4 methods for a generic function are defined with setMethod(), which takes three main arguments
setMethod("show", "Person", function(object) {
  cat(object@name, "\n",
      "Born: ", format(object@date_of_birth, "%d %B %Y"), "\n",
      "Died: ", format(object@date_of_death, "%d %B %Y"),
      " (aged ", object@age_at_death, ")\n",
      sep = "")
})
florence
Florence Nightingale
Born: 12 May 1820
Died: 13 August 1910 (aged 90)
It is good practice to define generics to get and set slots that the user should have access to.
For example, a generic to get and set the date of birth
Methods can then be defined for multiple classes using the same interface.
Access the date of birth from a Person object
Change the date of birth
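A sketch of such accessor generics on a cut-down Person class. The generic names `dateOfBirth()` and `dateOfBirth<-()` are hypothetical, chosen to follow the lowerCamelCase convention; the replacement date is arbitrary.

```r
library(methods)

# Cut-down Person class (two slots only, for brevity)
setClass("Person",
         slots = c(name = "character",
                   date_of_birth = "Date"))

# Getter and setter generics (hypothetical names)
setGeneric("dateOfBirth", function(x) standardGeneric("dateOfBirth"))
setGeneric("dateOfBirth<-", function(x, value) standardGeneric("dateOfBirth<-"))

setMethod("dateOfBirth", "Person", function(x) x@date_of_birth)
setMethod("dateOfBirth<-", "Person", function(x, value) {
  x@date_of_birth <- as.Date(value)
  validObject(x)  # re-check the object after modifying a slot
  x
})

florence <- new("Person",
                name = "Florence Nightingale",
                date_of_birth = as.Date("1820-05-12"))

dateOfBirth(florence)                  # access the date of birth
dateOfBirth(florence) <- "1820-05-11"  # change it (arbitrary value)
dateOfBirth(florence)
```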
Keep it simple: dispatch on one or two arguments is usually sufficient.
Avoid ambiguous cases by defining methods earlier in the path.
Methods can be defined for the ANY pseudo-class
The MISSING pseudo-class (written "missing" in a method signature) is useful for dispatch on two arguments: it allows different behaviour if only one argument is specified.
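A sketch of both pseudo-classes with a hypothetical two-argument generic `combine()`:

```r
library(methods)

# Hypothetical generic dispatching on both x and y
setGeneric("combine", function(x, y) standardGeneric("combine"))

# Both arguments supplied and numeric
setMethod("combine", signature("numeric", "numeric"),
          function(x, y) x + y)

# Second argument omitted: "missing" matches an argument not supplied
setMethod("combine", signature("numeric", "missing"),
          function(x, y) x)

# Fallback for any other pair of classes
setMethod("combine", signature("ANY", "ANY"),
          function(x, y) stop("no sensible way to combine these"))

combine(1, 2)  # numeric, numeric method
combine(1)     # numeric, missing method
res <- tryCatch(combine("a", "b"), error = function(e) conditionMessage(e))
res
```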
Diag to represent a diagonal matrix with two slots:
n: the number of rows/cols
x: the numeric values of the diagonal elements
Add a prototype to specify default values.
initialize method so that the n slot is computed automatically and does not have to be provided.
Diag() helper function to create a new Diag object, with the user only having to specify the diagonal elements.
show method to state the size of the matrix and print the diagonal elements.
Main reference for this session, goes a bit further (including R6): Wickham, H, Advanced R (2nd edn), Object-oriented programming section, https://adv-r.hadley.nz/oo.html
Fun example creating Turtle and TurtleWithPen classes to create simple graphics by moving the turtle: https://stuartlee.org/2019/07/09/s4-short-guide/
Licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).