Matrices in R Programming

A matrix is a rectangular table of numbers arranged in rows and columns. In mathematical text, it is conventional to denote matrices with bold letters. For example, consider a matrix D holding the prices of two securities, a bond and a stock, on the first three days of the week.

D = 
     64    31 
     65    28 
     66    35

This is a 3 × 2 matrix (pronounced "3 by 2"). The number of rows is given first, followed by the number of columns. Each row is a day, the first column is the bond price, and the second column is the stock price. Matrix D shows that on the first day the bond was worth $64 and the stock was worth $31. On the second day the bond was worth $65 and the stock $28. On the third day the bond was worth $66 and the stock $35.

With this data in place, we can answer many analytical questions about a portfolio that holds these assets. Let's learn how to use matrices in R and then how to perform statistical analysis on them.

Defining a Matrix in R

Let's say we have the above data in the form of a vector:

> price_data <- c(64,31,65,28,66,35)

We know that this data represents three rows with each row containing the bond price and the stock price. In R, we can use this vector to create a matrix using the matrix() function as shown below:

# Price data: each pair of numbers is the bond price and the stock price for one day
price_data <- c(64, 31, 65, 28, 66, 35)
# Create the matrix of prices, filling it row by row
price_matrix <- matrix(price_data, nrow = 3, byrow = TRUE)
# Print the matrix
price_matrix

Matrix function parameters:

  • The first argument is the data that the matrix function will convert into rows and columns.
  • The second argument, nrow, indicates that the matrix will have three rows. A similar argument, ncol, could be used to indicate the number of columns.
  • The third argument in this call, byrow, indicates how the data fills the matrix. byrow=TRUE fills by row: the incoming values fill the first row, then the second, and so on. To fill by column instead, set byrow to FALSE, which is the default.
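
To see what byrow changes, the short sketch below builds the matrix both ways from the same price_data vector. The names by_row and by_col are introduced here only for illustration.

```r
# Same six values as price_data in the text
price_data <- c(64, 31, 65, 28, 66, 35)

# Fill row by row: each consecutive pair becomes one day's row
by_row <- matrix(price_data, nrow = 3, byrow = TRUE)

# Fill column by column (the default, byrow = FALSE):
# the first three values become column 1, the last three column 2
by_col <- matrix(price_data, nrow = 3)

by_row[1, ]  # 64 31 (Monday's bond and stock price, as intended)
by_col[1, ]  # 64 28 (values from two different days end up in one row)
```

With data laid out as bond/stock pairs, byrow=TRUE is the setting that keeps each day's prices together.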

When this matrix is printed, it will look as follows:

> price_matrix
     [,1] [,2]
[1,]   64   31
[2,]   65   28
[3,]   66   35
>
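
The "3 by 2" shape can also be checked from R. As a small sketch, the base functions dim(), nrow() and ncol() report the size of the matrix, with rows first:

```r
price_matrix <- matrix(c(64, 31, 65, 28, 66, 35), nrow = 3, byrow = TRUE)

dim(price_matrix)   # 3 2 (rows, then columns)
nrow(price_matrix)  # 3
ncol(price_matrix)  # 2
```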

Naming Rows and Columns

Currently our matrix doesn't have any names for the rows and columns and it's difficult to understand the data. Our first column represents the prices of the bond and the second column represents the prices of stock. The three rows represent three days, namely, Mon, Tue and Wed.

In R, we can assign names to the matrix using the rownames() and colnames() functions as shown below:

#set row names and column names for price_matrix
rownames(price_matrix) <- c("Mon","Tue", "Wed")
colnames(price_matrix) <- c("Bond","Stock")
#print the matrix
price_matrix

We can now print the matrix:

> price_matrix
    Bond Stock
Mon   64    31
Tue   65    28
Wed   66    35
>

Another way to assign names is the dimnames() function, which sets the row names and column names in one step, as shown below. Note the use of the list() function, which we will discuss in upcoming lessons. The list has two elements: the first is the vector of row names and the second is the vector of column names.

# Use the dimnames() function to assign row and column names in one step
dimnames(price_matrix) <- list(c("Mon", "Tue", "Wed"), c("Bond", "Stock"))
#print the matrix
price_matrix
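
Names can also be supplied when the matrix is created. The sketch below passes the same list to the dimnames argument of matrix(); the variable name named_prices is used here only for illustration.

```r
# Build the matrix and name its rows and columns in a single call
named_prices <- matrix(
  c(64, 31, 65, 28, 66, 35),
  nrow = 3,
  byrow = TRUE,
  dimnames = list(c("Mon", "Tue", "Wed"), c("Bond", "Stock"))
)

named_prices
```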

Selecting the Matrix Elements

We can select specific elements of a matrix with the expression D[m, n], which refers to the element in the mth row and nth column of matrix D.

The R script below shows several ways of selecting matrix elements:

> #print the matrix
> price_matrix
    Bond Stock
Mon   64    31
Tue   65    28
Wed   66    35
> 
> #What is the bond price on 3rd day? (3rd row and 1st column)
> price_matrix[3,1]
[1] 66
> 
> #Get me just the stock prices for all days
> #In this case we want to access all rows, so we will just omit supplying the row
> #stock prices are in 2nd column
> price_matrix[,2]
Mon Tue Wed 
 31  28  35 
> 
> #Get stock and bond price on Tuesday
> #Tuesday is the 2nd row
> price_matrix[2,]
 Bond Stock 
   65    28 
> 
> #Get all prices for Monday and Wednesday
> #Monday and Wednesday data is in row 1 and 3
> price_matrix[c(1,3),]
    Bond Stock
Mon   64    31
Wed   66    35
> 
> #Get stock prices for Tuesday and Wednesday
> #We can use a colon to select a range of consecutive rows
> price_matrix[2:3,2]
Tue Wed 
 28  35 
>
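
Because the rows and columns are named, elements can also be selected by name instead of by position. The sketch below rebuilds the named matrix so that it runs on its own; drop = FALSE is a standard argument of the [ operator that keeps the result as a matrix rather than simplifying it to a vector.

```r
price_matrix <- matrix(c(64, 31, 65, 28, 66, 35), nrow = 3, byrow = TRUE,
                       dimnames = list(c("Mon", "Tue", "Wed"), c("Bond", "Stock")))

# Stock price on Tuesday, selected by row and column name
price_matrix["Tue", "Stock"]   # 28

# Bond prices for Monday and Wednesday
price_matrix[c("Mon", "Wed"), "Bond"]

# Selecting a single column normally returns a vector;
# drop = FALSE keeps it as a 3 x 1 matrix
stock_column <- price_matrix[, "Stock", drop = FALSE]
dim(stock_column)              # 3 1
```

Selecting by name tends to be easier to read and does not break if the order of rows or columns changes later.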

Arithmetic Operations on Matrices

Just like vectors, we can use the standard arithmetic operators (+, -, *, /) with matrices. Let's work through an example to understand matrix arithmetic and pick up a few other matrix operations along the way.

  1. Let's say that you hold 5 units each of this bond and stock. We can multiply price_matrix by 5 to get the dollar value of your holdings. We store this in the matrix portfolio_value.
  2. Now that we have the portfolio values, we can calculate the total portfolio value on each day by adding the value of stocks and bonds. This can be done using the rowSums() function. We store the results in a vector called days_total.
  3. We now have a new vector which contains the daily portfolio totals, but it is not part of the portfolio_value matrix. We can add the days_total vector to the matrix as a new column using the cbind() function. We store the result in portfolio_value_totals.
  4. Since we have the portfolio values over a three-day period, we can calculate the average value over this period. Each column holds three daily values, so we can use the colMeans() function to calculate the mean of each column. We store these values in a vector called days_average.
  5. Finally, we can add the days_average vector as a new row to our main matrix. Since we are adding a row, we will use the rbind() function.
  6. The resulting matrix contains the portfolio values along with the daily totals and the averages over the three days.

> #print the matrix
> price_matrix
    Bond Stock
Mon   64    31
Tue   65    28
Wed   66    35
> 
> #You hold 5 units each of bond and stock. What is the value?
> portfolio_value <- price_matrix * 5
> 
> #What is the total portfolio value on each day
> days_total <- rowSums(portfolio_value)
> 
> #print days total
> days_total
Mon Tue Wed 
475 465 505 
> 
> #We can add the days_total vector to the main matrix using cbind()
> portfolio_value_totals <- cbind(portfolio_value,days_total)
> 
> #Calculate the average of each column over the three days
> days_average <- colMeans(portfolio_value_totals)
> 
> #Add the averages row to the portfolio_value_totals matrix
> final_matrix <-rbind(portfolio_value_totals,days_average)
> 
> #print final_matrix
> final_matrix
             Bond    Stock days_total
Mon           320 155.0000   475.0000
Tue           325 140.0000   465.0000
Wed           330 175.0000   505.0000
days_average  325 156.6667   481.6667
>
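
The arithmetic operators also work between two matrices of the same shape, element by element. As a sketch (the name daily_change is introduced here for illustration), subtracting the first two rows from the last two gives the day-over-day change in each price:

```r
price_matrix <- matrix(c(64, 31, 65, 28, 66, 35), nrow = 3, byrow = TRUE,
                       dimnames = list(c("Mon", "Tue", "Wed"), c("Bond", "Stock")))

# Rows 2:3 (Tue, Wed) minus rows 1:2 (Mon, Tue), element by element
daily_change <- price_matrix[2:3, ] - price_matrix[1:2, ]

daily_change["Tue", "Bond"]   # 65 - 64 = 1
daily_change["Wed", "Stock"]  # 35 - 28 = 7
```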

Multiplying Matrix with a Vector

In the above example, we assumed the same quantity, 5, for both the stock and the bond and simply multiplied the matrix by it. R took care of multiplying each matrix element by 5 to give us the values. But what if we hold different quantities of the stock and the bond? In that case we can store the quantities in a new vector and use standard matrix multiplication to get our results. Note that the * operator we used earlier performs element-wise multiplication, not standard matrix multiplication, for which you should use %*% in R. The calculation and its nuances are demonstrated below:

#print the matrix
price_matrix
# Assign quantities: 5 bonds and 3 stocks
quantities <- c(5, 3)
# Multiply prices by quantities to get the values
portfolio_value <- price_matrix %*% diag(quantities)

  • %*% represents standard matrix multiplication.
  • diag() creates a diagonal matrix from the vector, with 5 and 3 on the diagonal and zeros elsewhere, so that we can multiply the two matrices.
  • Multiplying the price matrix by this diagonal matrix multiplies every price in the first column (bond) by 5 and every price in the second column (stock) by 3. For Monday, that is 64 × 5 = 320 and 31 × 3 = 93.
  • Note: If you did not understand this calculation, don't worry about it at this stage. We will learn a lot more about matrix multiplication in future courses. The objective of bringing it up here is just to familiarize you with the advanced calculations that are possible.

The resulting matrix will look as follows:

> portfolio_value
    [,1] [,2]
Mon  320   93
Tue  325   84
Wed  330  105
>
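
Two related calculations may help here, offered as a sketch rather than as part of the walkthrough above. Multiplying the price matrix directly by the quantity vector with %*% gives the total portfolio value per day, and the base function sweep() is one alternative way to scale each column while keeping the Bond and Stock column names. The names daily_total and value_by_asset are introduced for illustration.

```r
price_matrix <- matrix(c(64, 31, 65, 28, 66, 35), nrow = 3, byrow = TRUE,
                       dimnames = list(c("Mon", "Tue", "Wed"), c("Bond", "Stock")))
quantities <- c(5, 3)

# Matrix times vector: each row becomes 5 * bond price + 3 * stock price,
# giving a 3 x 1 matrix of daily totals
daily_total <- price_matrix %*% quantities
daily_total["Mon", 1]   # 64 * 5 + 31 * 3 = 413

# sweep() applies "*" down each column (MARGIN = 2) with the matching
# quantity, and keeps the row and column names of price_matrix
value_by_asset <- sweep(price_matrix, 2, quantities, "*")
value_by_asset["Wed", "Stock"]   # 35 * 3 = 105
```

The values in value_by_asset match the portfolio_value matrix shown above; the difference is only that the column labels are preserved.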

Workspace

When you create a new vector, matrix, or any other R object, it is saved in the workspace and is available for use in your calculations. These objects are listed in the Global Environment, shown in the Environment pane of RStudio (the top-right window by default). You can also list all objects in the workspace using the ls() command in the R console.

Using ls() command:

> ls()
[1] "days_average"           "days_total"             "final_matrix"
[4] "portfolio_value"        "portfolio_value_totals" "price_data"
[7] "price_matrix"          
>
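
As a small sketch of managing the workspace, exists() checks for a single object by name and rm() removes an object that is no longer needed. Both are base R functions that are not covered in the text above, and the object name scratch_value is hypothetical.

```r
scratch_value <- 42          # hypothetical temporary object
"scratch_value" %in% ls()    # TRUE: it is now listed in the workspace
rm(scratch_value)            # remove it from the workspace
exists("scratch_value")      # FALSE after removal
```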