Lab 6
Remember, follow the instructions below and use R Markdown to create a PDF document with your code and answers to the following questions on Gradescope. You may find a template file by clicking “Code” in the top right corner of this page.
A. Basic functions
Use the following code to create a list of four matrices:
set.seed(100)
matrix_list <- list(
  A = diag(5),
  B = matrix(rnorm(9), nrow = 3, ncol = 3),
  C = matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2),
  D = diag(c(1:5))
)
Use the lapply function to create a list of length four containing the inverses of these four matrices.
Use the sapply function to create a vector of length four containing the determinants of these four matrices.
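As a reminder of how the two functions differ, here is a minimal sketch on a hypothetical toy list (toy_list and its contents are made up for illustration and are not part of the lab). lapply always returns a list, while sapply simplifies the result to a vector when every call returns a single value.

```r
# Hypothetical toy data, not part of the lab
toy_list <- list(a = 1:3, b = c(2, 4, 6, 8), c = 10)

# lapply() applies the function to each element and always returns a list
ranges <- lapply(toy_list, range)

# sapply() does the same, but simplifies to a named vector when each
# result has length one
lengths_vec <- sapply(toy_list, length)

str(ranges)
print(lengths_vec)
```

The names of the input list carry over to the output in both cases, which makes it easy to match each result to its source element.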
B. Skewness and Kurtosis
Skewness describes how asymmetrical the distribution of a numerical variable is around its mean. Given observations \(x_1,\ldots, x_n\), we can calculate the sample skewness \(s\) of a variable using the following formula:
\[s = \frac{\frac{1}{n}\sum\limits_{i=1}^n(x_i-\overline{x})^3}{\left[\frac{1}{n}\sum\limits_{i=1}^n(x_i-\overline{x})^2\right]^{3/2}}\]
Kurtosis is a measure of the “tailedness” of the distribution of a numerical variable. Higher values of kurtosis indicate more extreme outliers. Given observations \(x_1,\ldots, x_n\), we can calculate the sample kurtosis \(k\) of a variable using the following formula:
\[k = \frac{\frac{1}{n}\sum\limits_{i=1}^n(x_i-\overline{x})^4}{\left[\frac{1}{n}\sum\limits_{i=1}^n(x_i-\overline{x})^2\right]^{2}}-3\]
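As a quick check of both formulas, consider the illustrative sample \(x = (1, 2, 6)\) with \(n = 3\) (chosen here for easy arithmetic, not taken from the lab). Its mean is \(\overline{x} = 3\), so the deviations \(x_i - \overline{x}\) are \(-2, -1, 3\), and the three averaged sums are:
\[\frac{1}{n}\sum\limits_{i=1}^n(x_i-\overline{x})^2 = \frac{4+1+9}{3} = \frac{14}{3}, \qquad \frac{1}{n}\sum\limits_{i=1}^n(x_i-\overline{x})^3 = \frac{-8-1+27}{3} = 6, \qquad \frac{1}{n}\sum\limits_{i=1}^n(x_i-\overline{x})^4 = \frac{16+1+81}{3} = \frac{98}{3}\]
Substituting into the formulas above gives:
\[s = \frac{6}{\left(14/3\right)^{3/2}} \approx 0.595, \qquad k = \frac{98/3}{\left(14/3\right)^{2}} - 3 = 1.5 - 3 = -1.5\]
Values like these can serve as a small test case for an implementation of either formula.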
Write a function skewness() that takes as input a numeric vector x and returns the sample skewness. There are functions in R that compute skewness, but you cannot use any of them; write your own implementation. You may remove all NA values by default. Use your function to compute the sample skewness of the arr_delay variable in the flights dataset contained in the nycflights13 package.
Write a function kurtosis() that takes as input a numeric vector x and returns the sample kurtosis. There are functions in R that compute kurtosis, but you cannot use any of them; write your own implementation. You may remove all NA values by default. Use your function to compute the sample kurtosis of the arr_delay variable in the flights dataset contained in the nycflights13 package.
Write a function get_column_skewness() that takes as input a data frame and calculates the skewness of each numeric variable. The output should be a data frame with two variables: variable, containing the name of the variable, and skewness, containing the skewness. Your output data frame should only include the numeric variables. You may remove all NA values by default. Demonstrate your function on the penguins dataset.
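To illustrate the shape of the output described above, here is a minimal sketch that builds a two-column summary for the numeric columns of a data frame. It uses the column mean as a stand-in statistic rather than skewness, and toy_df and its columns are hypothetical, made up for this example.

```r
# Hypothetical toy data frame, not part of the lab
toy_df <- data.frame(
  id = c("a", "b", "c"),
  height = c(1.5, 2.0, NA),
  weight = c(10, 20, 30)
)

# flag which columns are numeric
is_num <- sapply(toy_df, is.numeric)

# compute one summary value per numeric column, dropping NA values
col_means <- sapply(toy_df[is_num], mean, na.rm = TRUE)

# assemble a data frame with the variable name and its summary value
result <- data.frame(
  variable = names(col_means),
  mean = unname(col_means)
)
print(result)
```

The non-numeric id column is excluded before any statistic is computed, so the summary function is never called on character data.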
C. Finding an error
Suppose you have two teams of runners participating in a 5k. We wish to write a function that takes as input two vectors representing the times of the runners in each team and returns a list of two vectors representing the ranks of each team’s runners.
For example, if the first team’s times are c(16.8, 21.2, 19.1) and the second team’s times are c(17.2, 18.1, 20.0), the function should return c(1, 6, 4) for the first team and c(2, 3, 5) for the second team.
Below is a draft version of the function get_runner_ranks(). However, there is an error somewhere. Use any method we discussed in class to identify the error.
get_runner_ranks <- function(x, y) {
  # combine all runner times
  combined_times <- c(x, y)

  # sort all runner times from fastest to slowest
  sort(combined_times, decreasing = T)

  # create ranks vectors
  ranks_x <- numeric(length(x))
  ranks_y <- numeric(length(y))

  for (i in seq_along(ranks_x)) {
    # look up rank of time i in x in combined_times
    ranks_x[i] <- match(x[i], combined_times)
  }

  for (i in seq_along(ranks_y)) {
    # look up rank of time i in y in combined_times
    ranks_y[i] <- match(y[i], combined_times)
  }

  # return a list of first team and second team ranks
  return(list(x = ranks_x, y = ranks_y))
}
Explain in your own words what the error was.
Below, write a corrected version of get_runner_ranks() and compute get_runner_ranks(c(16.8, 21.2, 19.1), c(17.2, 18.1, 20.0)).
get_runner_ranks <- function(x, y) {
  # YOUR CODE HERE
}