Matrices in R

A matrix is a vector that also carries a dimension attribute giving the number of rows and the number of columns. The reverse does not hold: an ordinary vector has no dimensions, so it is not a matrix.
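
As a small illustration of this distinction, the sketch below adds a dimension attribute to a plain vector and checks the result with is.matrix() and dim(). The names v and m are made up for this example.

```r
v <- c(1, 2, 3, 4)   # a plain vector
is.matrix(v)         # FALSE
dim(v)               # NULL: a vector has no dimension attribute

m <- v
dim(m) <- c(2, 2)    # adding dimensions turns the vector into a 2 x 2 matrix
is.matrix(m)         # TRUE
dim(m)               # 2 2
length(m)            # 4: the underlying data is still the same vector
```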

Creating Matrices

An important first step with matrices is to learn how to create them. One of the easiest ways to do this is with the matrix() function.

x <- c(1,2,3,4)
x.mat <- matrix(x, nrow=2, ncol=2, byrow=TRUE)
x.mat
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4

Note: byrow=TRUE means that we fill the matrix row by row. The result is not the same as when we fill it by column, which is the default:

x.mat2 <- matrix(x, nrow=2, ncol=2, byrow=FALSE)
x.mat2
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
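
One way to see how the two fill orders relate, at least for this square 2 x 2 case, is that filling by row gives the transpose of filling by column. The sketch below checks this; by.row and by.col are names invented for the example.

```r
x <- c(1, 2, 3, 4)
by.row <- matrix(x, nrow = 2, ncol = 2, byrow = TRUE)
by.col <- matrix(x, nrow = 2, ncol = 2)   # byrow defaults to FALSE

# Filling by row is the transpose of filling by column
identical(by.row, t(by.col))   # TRUE
```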

We can also create a matrix by specifying only the number of columns. With larger data sets we may not know the exact number of rows in advance, but we can usually choose the number of columns and let R work out the number of rows from the length of the data.

y <- c(1,2,3,4,5,6,7)
y.mat <- matrix(y, ncol=2)
## Warning in matrix(y, ncol = 2): data length [7] is not a sub-multiple or
## multiple of the number of rows [4]
y.mat
##      [,1] [,2]
## [1,]    1    5
## [2,]    2    6
## [3,]    3    7
## [4,]    4    1

Recycling

Notice in the example above that the vector did not have enough elements to fill the matrix completely, so R recycled back to the first element to fill the final cell and issued a warning.
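
If recycling is not what we want, one possible approach is to pad the vector with NA before building the matrix, so that the empty cell is marked as missing instead of reusing the first element. This is only a sketch; the names n.col, n.row, n.missing, y.padded and y.mat.na are invented for the example.

```r
y <- c(1, 2, 3, 4, 5, 6, 7)
n.col <- 2
n.row <- ceiling(length(y) / n.col)        # 4 rows are needed
n.missing <- n.row * n.col - length(y)     # 1 cell would be left over

# Pad with NA so the data length matches the matrix size exactly
y.padded <- c(y, rep(NA, n.missing))
y.mat.na <- matrix(y.padded, ncol = n.col) # no warning this time
y.mat.na[4, 2]                             # NA instead of a recycled 1
```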

Matrix Operations

R can be a great tool for working with matrices. Many operations we need to do with linear algebra can be done in R. A small selection of these follows:

We can perform elementwise multiplication just like in vectors:

x.mat * x.mat2
##      [,1] [,2]
## [1,]    1    6
## [2,]    6   16

R also performs true matrix multiplication, using the %*% operator:

x.mat %*% x.mat2
##      [,1] [,2]
## [1,]    5   11
## [2,]   11   25
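
Unlike elementwise multiplication, %*% requires the number of columns of the first matrix to equal the number of rows of the second, and the order of the factors matters. The sketch below uses two small matrices, a and b, invented for this example, to show how the dimensions of the result depend on the order.

```r
a <- matrix(1:6, nrow = 2)   # 2 x 3
b <- matrix(1:6, nrow = 3)   # 3 x 2

dim(a %*% b)   # 2 2: (2 x 3) times (3 x 2)
dim(b %*% a)   # 3 3: (3 x 2) times (2 x 3)
```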

We can also transpose a matrix with t() and extract its diagonal with diag():

t(x.mat)
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
diag(x.mat2)
## [1] 1 4

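Another common matrix calculation is the inverse, which solve() computes. Many algorithms and functions in statistics need to work with the inverse of matrices. Multiplying a matrix by its inverse returns the identity matrix, up to floating-point rounding error (note the 1.110223e-16 in place of an exact 0):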

solve(x.mat)
##      [,1] [,2]
## [1,] -2.0  1.0
## [2,]  1.5 -0.5
x.mat %*% solve(x.mat)
##      [,1]         [,2]
## [1,]    1 1.110223e-16
## [2,]    0 1.000000e+00
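
Because of this rounding error, an exact comparison with the identity matrix can fail. A minimal sketch of a tolerance-based check with all.equal() follows, together with the two-argument form of solve(), which solves a linear system directly without forming the inverse. The names check, b and sol are invented for this example.

```r
x.mat <- matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE)

# Compare with the 2 x 2 identity matrix, allowing for rounding error
check <- x.mat %*% solve(x.mat)
isTRUE(all.equal(check, diag(2)))   # TRUE

# Solve x.mat %*% sol = b without computing the inverse explicitly
b <- c(5, 11)
sol <- solve(x.mat, b)
sol                                 # 1 2, up to rounding
```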

The *apply() Function

Often we want to apply a function across the rows or columns of a matrix. The apply() function lets us use a built-in R function or a user-defined function on a matrix. Its general form is:

apply(m, dimcode, f, arguments)
Where:

m: the matrix you wish to use.

dimcode: 1 if you want to apply the function to rows, 2 if you want to apply it to columns.

f: the function you wish to use.

arguments: any additional arguments for the function being used.

An apply() Example

We begin with our matrix y.mat. We can use the apply function to get means of either the columns or the rows.

apply(y.mat, 1, mean)
## [1] 3.0 4.0 5.0 2.5
apply(y.mat,2,mean)
## [1] 2.50 4.75
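
The same pattern works with a user-defined function and with extra arguments. The sketch below rebuilds y.mat without relying on recycling, then applies a made-up helper called spread (the range width of a vector) and passes na.rm = TRUE through to mean(). The names spread and y.na are assumptions for this example only.

```r
y.mat <- matrix(c(1, 2, 3, 4, 5, 6, 7, 1), ncol = 2)

# A user-defined function: the largest value minus the smallest
spread <- function(v) max(v) - min(v)
apply(y.mat, 2, spread)              # 3 6

# Extra arguments after the function are passed on to it
y.na <- y.mat
y.na[4, 2] <- NA
apply(y.na, 2, mean, na.rm = TRUE)   # 2.5 6.0

# For means, the built-in rowMeans() gives the same result as apply()
all.equal(apply(y.mat, 1, mean), rowMeans(y.mat))   # TRUE
```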

Quick Check Practice


#You will find out more about the runif command in a few weeks.
set.seed(1234)
x = runif(5000, 1, 8)


# Do Not Print X as it is a long vector
# Create a matrix of x with 100 columns and fill it by row
# Label this matrix c
# 1. Find the row means of c.
# 2. Find the column means of c.
# 3. What is the value in the 3rd row and 98th column?


# Do Not Print X as it is a long vector
# Create a matrix of x with 100 columns and fill it by row
# Label this matrix c
c <- matrix(x, ncol=100, byrow=TRUE)
# 1. Find the row means of c.
apply(c, 1, mean)
# 2. Find the column means of c.
apply(c, 2, mean)
# 3. What is the value in the 3rd row and 98th column?
c[3,98]



Use your knowledge of matrices to answer these questions.

Naming Rows and Columns of Matrices

Just like in vectors, we may want to name elements in a matrix. Now we have more than one dimension, so we can name both the rows and the columns. Consider the following matrix, in which we have recorded both the weight (lbs) and height (inches) of subjects at time point 1.

time1 <- matrix( c(115, 63, 175, 69, 259, 57, 325, 70), ncol=2, byrow=TRUE)
time1
##      [,1] [,2]
## [1,]  115   63
## [2,]  175   69
## [3,]  259   57
## [4,]  325   70

We then have another measurement at time point 2.

time2 <- matrix( c(120, 63, 175, 69, 224, 57, 350, 70), ncol=2, byrow=TRUE)
time2
##      [,1] [,2]
## [1,]  120   63
## [2,]  175   69
## [3,]  224   57
## [4,]  350   70

Without the story behind these matrices, we do not know what kind of data we have or what is being measured. This is where naming both the columns and the rows becomes important.

#Names for Time 1
colnames(time1) <- c("weight1", "height1")
rownames(time1) <- c("Subject 1", "Subject 2", "Subject 3", "Subject 4")
time1
##           weight1 height1
## Subject 1     115      63
## Subject 2     175      69
## Subject 3     259      57
## Subject 4     325      70

We can see that time1 now makes it much clearer what the data contains.

#Names for Time 2
colnames(time2) <- c("weight2", "height2")
rownames(time2) <- c("Subject 1", "Subject 2", "Subject 3", "Subject 4")
time2
##           weight2 height2
## Subject 1     120      63
## Subject 2     175      69
## Subject 3     224      57
## Subject 4     350      70
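
Once rows and columns are named, we can index by name instead of by position. The sketch below rebuilds both matrices, sets the names in one step with dimnames(), and then uses them to look up a value, compute the change in weight, and combine the two time points. The names subjects, weight.change and both are invented for this example.

```r
time1 <- matrix(c(115, 63, 175, 69, 259, 57, 325, 70), ncol = 2, byrow = TRUE)
time2 <- matrix(c(120, 63, 175, 69, 224, 57, 350, 70), ncol = 2, byrow = TRUE)

# Set row and column names in a single step
subjects <- paste("Subject", 1:4)
dimnames(time1) <- list(subjects, c("weight1", "height1"))
dimnames(time2) <- list(subjects, c("weight2", "height2"))

# Look up one cell by row and column name
time1["Subject 3", "weight1"]    # 259

# Change in weight between the two time points, labelled by subject
weight.change <- time2[, "weight2"] - time1[, "weight1"]
weight.change                    # 5 0 -35 25

# Combine both time points side by side; the column names are kept
both <- cbind(time1, time2)
colnames(both)                   # "weight1" "height1" "weight2" "height2"
```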