Discuss the following lines of code with a neighbor. What do they do?
“To understand computations in R, two slogans are helpful: Everything that exists is an object. Everything that happens is a function call.”
(Chambers, 2014)
We have already seen some functions, including the sample function:
and the typeof function:
Functions provide code to execute some task given a set of inputs.
A function call is a command to execute the code of a function:
function_name(argument1, argument2, ...)
Arguments or parameters are expressions/values that are the inputs to the function.
The parentheses following the name of a function are still required even when there are no arguments:
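As a small illustration (the functions chosen here are ordinary base R functions, not necessarily the ones used in the original notes), a call with no arguments still needs its parentheses. Typing the name alone refers to the function object rather than running it:

```r
# Calling functions that take no arguments: the parentheses are still required
current_dir <- getwd()     # runs getwd() and returns the working directory
today <- Sys.Date()        # runs Sys.Date() and returns today's date

# Without parentheses, the name refers to the function itself, not its result
is.function(getwd)         # TRUE
```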
Whenever you are using a function for the first time, it is a good idea to access the documentation by typing ?function_name into the console.
A formal argument is a named argument that is used in the code of a function.
The function args displays the formal arguments:
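For example, applied to sample (one of the functions mentioned earlier), a sketch might look like this. The use of formals() is an addition here, shown only as another way to see the same names:

```r
# Print the formal arguments of sample()
args(sample)

# formals() returns the same information as a named list
formal_names <- names(formals(sample))
formal_names   # "x" "size" "replace" "prob"
```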
An actual argument is the value specified by the user during a function call:
The two most common ways to specify arguments are positional and exact:
[1] "H"
Error in sample.int(x, size, replace, prob): invalid 'replace' argument
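The original calls are not shown, so the following is only a sketch of how positional and exact matching behave with sample. The coin vector is an assumed example, and the failing call is one way to trigger an "invalid 'replace' argument" error like the one above:

```r
coin <- c("H", "T")   # hypothetical example vector

# Positional matching: values are assigned to the formal arguments in order
# (x, size, replace, prob)
flip_positional <- sample(coin, 1)

# Exact matching: arguments are named, so their order does not matter
flip_exact <- sample(size = 1, x = coin)

# Pitfall: the third position is `replace`, not `prob`, so these
# probabilities are matched to `replace` and the call fails
bad_call <- try(sample(coin, 1, c(0.5, 0.5)), silent = TRUE)
inherits(bad_call, "try-error")   # TRUE

# Naming the argument removes the ambiguity
flip_weighted <- sample(coin, 1, prob = c(0.5, 0.5))
```

Mixing the two styles is common: the first one or two arguments are usually given by position and the rest by name.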
How can we use functions to compute (feel free to look online):
Last class, we introduced atomic vectors, but we only considered vectors of length one.
Generally, atomic vectors are ordered collections of elements of the same type.
We create vectors using the function c()
We index vectors using [index] after the vector name:
If we use a negative index, we return the vector with that element removed
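A minimal sketch of creating and indexing a vector (the heights vector is made up for illustration):

```r
# Create a numeric vector with c()
heights <- c(150, 162, 171, 180)

heights[2]          # second element: 162
heights[c(1, 3)]    # first and third elements: 150 171
heights[-1]         # everything except the first element: 162 171 180
```

Note that R indexes from 1, so heights[1] is the first element.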
Note that atomic vectors can only have one type of data. So the following lines work:
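The original lines are not shown; as a stand-in, here is a sketch with three made-up vectors, each holding a single type of data:

```r
numbers <- c(1.5, 2, 3)          # all doubles
words <- c("a", "b", "c")        # all character strings
flags <- c(TRUE, FALSE, TRUE)    # all logicals

typeof(numbers)   # "double"
typeof(words)     # "character"
typeof(flags)     # "logical"
```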
What do you expect the output of the following chunk to be?
What do you expect the output of the following chunk to be?
- max(), min(), mean(), median(), sum(), sd(), var()
- length() returns the number of elements in the vector
- head() and tail() return the beginning and end of the vector
- sort() will sort the vector
- summary() returns a five-number summary (plus the mean for numeric vectors)
- any() and all() check conditions on Boolean vectors
- hist() will return a crude histogram (we’ll learn how to make this nicer later)

If you are unclear about what any of them do, use ? before the function name to read the documentation. You should get in the habit of checking function documentation a lot!
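To make a few of these concrete, here is a short sketch on a made-up vector called scores:

```r
scores <- c(4, 8, 15, 16, 23, 42)   # hypothetical data

max(scores)        # 42
min(scores)        # 4
mean(scores)       # 18
median(scores)     # 15.5
sum(scores)        # 108
length(scores)     # 6
head(scores, 2)    # 4 8
tail(scores, 2)    # 23 42
sort(scores, decreasing = TRUE)   # 42 23 16 15 8 4
any(scores > 40)   # TRUE: at least one element is above 40
all(scores > 5)    # FALSE: 4 is not above 5
```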
The notation a:b generates integers starting at a and ending at b.
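A quick sketch of the colon notation (variable names are illustrative):

```r
one_to_five <- 1:5    # 1 2 3 4 5
countdown <- 5:1      # 5 4 3 2 1 (counts down when a is larger than b)
typeof(one_to_five)   # "integer"
```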
The rep function repeats the values of its first argument.
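For instance, the times and each arguments of rep control how the repetition happens (the values below are arbitrary):

```r
zeros <- rep(0, times = 3)                  # 0 0 0
cycled <- rep(c("a", "b"), times = 2)       # "a" "b" "a" "b"
doubled <- rep(c("a", "b"), each = 2)       # "a" "a" "b" "b"
```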
The rnorm function draws n random values from a normal distribution with the specified mean and sd.
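A minimal sketch, with arbitrary parameter values. Calling set.seed first is an addition here so that the random draws can be reproduced:

```r
set.seed(1)   # make the random draws reproducible
draws <- rnorm(n = 5, mean = 10, sd = 2)
length(draws)   # 5
```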
matrix()
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    4    2
[2,]    2    4    5    3    1
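The call that produced this output is not shown. One call that reproduces the same matrix, offered as a guess rather than the original code, is:

```r
# matrix() fills column by column by default
m <- matrix(c(1:5, 5:1), nrow = 2)
dim(m)   # 2 5
```

Passing byrow = TRUE would fill the matrix row by row instead.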
Warning: be careful not to call your matrix matrix! Why not?
We can also generate matrices by column binding (cbind()) and row binding (rbind()) vectors.
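A short sketch with two made-up vectors:

```r
a <- 1:3
b <- 4:6

by_col <- cbind(a, b)   # a and b become the columns
by_row <- rbind(a, b)   # a and b become the rows

dim(by_col)   # 3 2
dim(by_row)   # 2 3
```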
Indexing a matrix is similar to indexing a vector, except we must index both the row and column, in that order.
What is the output of the following line?
[1] 3
What is the output of the following line?
[1] 5 3 1
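Assuming m is the 2 x 5 matrix printed earlier, the following indices are consistent with the outputs shown. They are illustrative and may not be the exact lines from the original notes:

```r
m <- matrix(c(1:5, 5:1), nrow = 2)

m[1, 2]     # row 1, column 2: 3
m[2, 3:5]   # row 2, columns 3 to 5: 5 3 1
m[1, ]      # leaving the column blank returns all of row 1: 1 3 5 4 2
```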
Also similarly to vectors, we can subset using a negative index.
What happened here? When subsetting a matrix reduces one dimension to length 1, R automatically coerces it into a vector. We can prevent this by including drop = FALSE.
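A sketch of both behaviors, again assuming the 2 x 5 matrix from earlier:

```r
m <- matrix(c(1:5, 5:1), nrow = 2)

no_first_col <- m[, -1]             # drops column 1: still a 2 x 4 matrix
row_vec <- m[-1, ]                  # only one row is left, so R returns a vector
row_mat <- m[-1, , drop = FALSE]    # drop = FALSE keeps a 1 x 5 matrix

is.matrix(row_vec)   # FALSE
dim(row_mat)         # 1 5
```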
We can also fill in an empty matrix using indices. In R, you should always start by initializing an empty matrix of the right size.
Then I can replace a single row (or column) using indices as follows.
We can also fill in multiple rows (or columns) at once. (Likewise, we can fill in subsets of rows/columns, or individual entries.) Note that recycling applies here.
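The steps above can be sketched as follows. The matrix size and the values are made up for illustration:

```r
# Start with an empty matrix of the right size
filled <- matrix(NA, nrow = 3, ncol = 2)

# Replace a single row
filled[1, ] <- c(10, 20)

# Fill rows 2 and 3 at once: c(1, 2) is recycled down each column,
# so row 2 becomes 1 1 and row 3 becomes 2 2
filled[2:3, ] <- c(1, 2)

# Replace a single entry
filled[3, 2] <- 99
```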
Matrices, like vectors, can only have entries of one type.
Let’s create 3 matrices for the purposes of demonstrating matrix functions.
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
- Matrix Sums: +
- Element-wise Matrix Multiplication: *
- Matrix Multiplication: %*%
- Column Bind Matrices: cbind()
- Transpose: t()
- Column Sums: colSums()
- Row Sums: rowSums()
- Column Means: colMeans()
- Row Means: rowMeans()
- Dimensions: dim()
- Determinant: det()
- Matrix Inverse: solve()
- Matrix Diagonal: diag()
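A sketch of these operations in use. A and B are built to match the two matrices printed above; the third matrix is not shown in the notes, so the square matrix S below is a hypothetical stand-in chosen because det() and solve() need a square, invertible matrix:

```r
A <- matrix(1:6, nrow = 2, byrow = TRUE)   # 2 x 3
B <- matrix(1:6, nrow = 3)                 # 3 x 2
S <- matrix(c(2, 1, 1, 3), nrow = 2)       # 2 x 2, hypothetical

A + A          # element-wise sum
A * A          # element-wise product
AB <- A %*% B  # matrix product: (2 x 3) times (3 x 2) gives 2 x 2
t(A)           # transpose: 3 x 2
cbind(A, S)    # column bind: both have 2 rows, result is 2 x 5
colSums(A)     # 5 7 9
rowSums(A)     # 6 15
colMeans(A)    # 2.5 3.5 4.5
rowMeans(A)    # 2 5
dim(A)         # 2 3
det(S)         # 5
S_inv <- solve(S)   # inverse of S
diag(S)        # 2 3
```

Note the difference between * and %*%: the first multiplies matching entries and needs equal dimensions, while the second is the usual matrix product.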
Commenting your code with # allows you to write notes that are meant only for readers of your code
Usually, that reader is you!
Coding without comments is ill-advised, bordering on impossible
Sneak peek at functions…
#' Wald-type t test
#' @param mod an object of class \code{bbdml}
#' @return Matrix with Wald test statistics and p-values. Univariate tests only.
waldt <- function(mod) {
  # Covariance matrix
  covMat <- try(chol2inv(chol(hessian(mod))), silent = TRUE)
  # inherits() works even when class() has length > 1 (e.g. for a matrix)
  if (inherits(covMat, "try-error")) {
    warning("Singular Hessian! Cannot calculate p-values in this setting.")
    np <- length(mod$param)
    se <- tvalue <- pvalue <- rep(NA, np)
  } else {
    # Standard errors
    se <- sqrt(diag(covMat))
    # Test statistic
    tvalue <- mod$param / se
    # P-value
    pvalue <- 2 * stats::pt(-abs(tvalue), mod$df.residual)
  }
  # Make table
  coef.table <- cbind(mod$param, se, tvalue, pvalue)
  dimnames(coef.table) <- list(names(mod$param),
                               c("Estimate", "Std. Error", "t value", "Pr(>|t|)"))
  return(coef.table)
}
There are exceptions to every rule! Usually, comments are to help you!
Example of breaking rules
Here’s a snippet of a long mathematical function (lots of code omitted with ellipses for space).
Code is divided into major steps marked by easily visible comments
objfun <- function(theta, W, M, X, X_star, np, npstar, link, phi.link) {
  ### STEP 1 - Negative Log-likelihood
  # extract matrix of betas (np x 1), first np entries
  b <- utils::head(theta, np)
  # extract matrix of beta stars (npstar x 1), last npstar entries
  b_star <- utils::tail(theta, npstar)
  ...
  ### STEP 2 - Gradient
  # define gam
  gam <- phi / (1 - phi)
Being a successful programmer requires commenting your code
Want to understand code you wrote >24 hours ago without comments?
We will be using a mix of the Tidyverse Style Guide by Hadley Wickham and the Google Style Guide. Please see the links for details, but I will summarize some main points here and throughout the class as we learn more functionality, such as functions and packages.
You may be graded on following good code style.
Use either underscores (_) or big camel case (BigCamelCase) to separate words within an object name. Do not use dots (.) to separate words in R functions!
Names should be concise, meaningful, and (generally) nouns.
It is very important that object names do not write over common functions!
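A small sketch of these naming rules. All object names are made up, and the overwritten mean() lives inside local() so that it does not leak into the rest of the session:

```r
# Good: concise, meaningful nouns with underscores
exam_scores <- c(88, 92, 79)
mean_score <- mean(exam_scores)

# Avoid: dots between words
# exam.scores <- c(88, 92, 79)

# What goes wrong when a name writes over a common function
masked_result <- local({
  mean <- function(x) "oops"   # replaces mean() inside this block only
  mean(exam_scores)
})
masked_result   # "oops" instead of the average
```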
Note: T and F are R shorthand for TRUE and FALSE, respectively. In general, spell them out to be as clear as possible.
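One further reason to spell them out, sketched below: T and F are ordinary variables that can be reassigned, whereas TRUE and FALSE are reserved words. The reassignment is wrapped in local() so it does not affect the rest of the session:

```r
shadowed <- local({
  T <- 0       # legal: T is just a variable name
  isTRUE(T)    # FALSE, because T no longer means TRUE
})
shadowed

# By contrast, `TRUE <- 0` is an error because TRUE is a reserved word
```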
Put a space after every comma, just like in English writing.
Do not put spaces inside or outside parentheses for regular function calls.
Most of the time when you are doing math, conditionals, logicals, or assignment, your operators should be surrounded by spaces (e.g. for ==, +, -, <-, etc.).
There are some exceptions we will learn more about later, such as the power symbol ^. See the Tidyverse Style Guide for more details!
Adding extra spaces is OK if it improves alignment of = or <-.
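The spacing rules above might look like this in practice (all names and values are illustrative):

```r
# Good: space after each comma, spaces around <- and +, none around ^
x <- c(1, 2, 3)
y <- x^2 + 1

# Good: no spaces inside or outside the parentheses of a function call
y_mean <- mean(y)

# Avoid:
# x<-c(1,2,3)
# y_mean <- mean ( y )

# Extra spaces are fine when they line up the assignments
n_rows    <- 10
n_columns <- 3
```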
Strive to limit your code to 80 characters per line. This fits comfortably on a printed page with a reasonably sized font.
If a function call is too long to fit on a single line, use one line each for the function name, each argument, and the closing ). This makes the code easier to read and to change later.
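As a sketch of the layout (this particular call is short enough to fit on one line; it is split here only to show the shape):

```r
set.seed(42)   # make the random draws reproducible

# One line for the function name, one per argument, one for the closing )
draws <- rnorm(
  n = 10,
  mean = 100,
  sd = 15
)
```

With this layout, adding, removing, or reordering an argument changes only one line.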
Tip! Try RStudio > Preferences > Code > Display > Show Margin with Margin column 80 to give yourself a visual cue!
We use <- instead of = for assignment. This is moderately controversial if you find yourself in the right (wrong?) communities.
In R, semi-colons (;) are used to execute multiple pieces of R code on a single line. In general, this is bad practice and should be avoided. Also, you never need to end lines of code with semi-colons!
Use ", not ', for quoting text. The only exception is when the text already contains double quotes and no single quotes.
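A brief sketch of the assignment, semi-colon, and quoting conventions together (names and strings are made up):

```r
# Use <- for assignment
total <- 1 + 2

# Avoid putting several statements on one line with semi-colons:
# a <- 1; b <- 2
a <- 1
b <- 2

# Double quotes by default
greeting <- "hello"

# Single quotes only when the text itself contains double quotes
quoted <- 'She said "hello" to me'
```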
Comment Style Guide
- Frequent use of comments should allow most comments to be restricted to one line for readability
- A comment should go above its corresponding line, be indented equally with the next line, and use a single # to mark a comment
- Use a string of - or = to break your code into easily noticeable chunks
- Example: # Data Manipulation -----------
- RStudio allows you to collapse chunks marked like this to help with clutter
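Put together, a short script following these comment guidelines might look like the sketch below. The data and the helper function summarize_values are hypothetical:

```r
# Data Manipulation -----------

# Keep only the positive values
values <- c(-2, 5, 0, 7)
positive_values <- values[values > 0]

# Summary ======================

summarize_values <- function(x) {
  # Average of the values (comment indented with the line it describes)
  mean(x)
}

# Average of the positive values
average <- summarize_values(positive_values)
```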