Apply Functions in R

In earlier lessons, we learned how to use the for loop to iterate over various R objects. R also provides functions for implicit looping, such as apply(), lapply(), and sapply(), which are often easier to use. Together they are commonly referred to as the 'apply' family of functions.
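As a minimal sketch of what implicit looping means, the example below computes square roots first with an explicit for loop and then with sapply() and lapply(). The vector x and the result names are hypothetical and chosen only for illustration.

```r
# Hypothetical example data: a small numeric vector
x <- c(1, 4, 9, 16)

# Explicit loop: preallocate the result, then fill it element by element
roots_loop <- numeric(length(x))
for (i in seq_along(x)) {
  roots_loop[i] <- sqrt(x[i])
}

# Implicit loop: sapply() calls sqrt() on each element of x
# and simplifies the results into a vector
roots_sapply <- sapply(x, sqrt)

# lapply() performs the same calls but always returns a list
roots_lapply <- lapply(x, sqrt)

# Both approaches give the same values
print(identical(roots_loop, roots_sapply))
```

The loop needs a preallocated result and an index variable, while sapply() expresses the same computation in a single call.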

The apply functions are not just about compact code; they can also help with speed. When speed is an issue, such as when working with large data sets or long-running simulations, it is worth avoiding explicit loops ('for' or 'while' loops) where practical. With apply() and its variants, R handles the iteration for you, and the result is often at least as fast as a hand-written loop, although the actual gain depends on the task.

The apply Function

The apply() function applies a function over the margins (for example, the rows or the columns) of a data structure. It has the following structure:

> args(apply)
function (X, MARGIN, FUN, ...)
  • X is any R structure with dimensions (matrix, data frame, array; NOT plain lists or vectors)
  • MARGIN is the dimension number (1 = rows, 2 = columns)
  • FUN is the function to apply (example: mean())
  • ... represents additional arguments to the function
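The arguments above can be sketched on a small example. The matrices m and m_na and the result names are hypothetical; na.rm = TRUE is a standard argument of sum() that apply() forwards through the ... argument.

```r
# Hypothetical 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

# MARGIN = 1: apply sum() to each row
row_sums <- apply(m, 1, sum)     # 9 12

# MARGIN = 2: apply mean() to each column
col_means <- apply(m, 2, mean)   # 1.5 3.5 5.5

# Additional arguments after FUN are passed on to FUN.
# Here na.rm = TRUE tells sum() to skip the missing value.
m_na <- m
m_na[1, 2] <- NA
row_sums_na <- apply(m_na, 1, sum, na.rm = TRUE)   # 6 12

print(row_sums)
print(col_means)
print(row_sums_na)
```

The result has one element per row when MARGIN is 1 and one element per column when MARGIN is 2.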


Let's define a simple matrix, myMatrix, which we will use to understand the apply() function. The built-in rpois() function simulates N independent Poisson random variables. For example, we can generate 30 Poisson random numbers with parameter λ = 3 as follows:

> rpois(30, 3)

Below, these 30 values are arranged into a matrix with 5 rows, which gives 6 columns. Because the values are random, your output will differ from the one shown.

> myMatrix <- matrix(rpois(30,3),5)
> myMatrix
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    3    1    7    4    4    3
[2,]    2    3    5    5    2    3
[3,]    2    5    1    2    2    1
[4,]    3    5    1    4    5    3
[5,]    0    0    4    2    1    2
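As a sketch of how apply() could be used on this matrix, the example below re-enters the printed values by hand (rpois() would produce different numbers on each run) and computes row and column sums. The result names rowTotals and colTotals are hypothetical.

```r
# Rebuild the matrix printed above, row by row, so the example is reproducible
myMatrix <- matrix(c(3, 1, 7, 4, 4, 3,
                     2, 3, 5, 5, 2, 3,
                     2, 5, 1, 2, 2, 1,
                     3, 5, 1, 4, 5, 3,
                     0, 0, 4, 2, 1, 2),
                   nrow = 5, byrow = TRUE)

# MARGIN = 1: one sum per row (5 values)
rowTotals <- apply(myMatrix, 1, sum)   # 22 20 13 21 9

# MARGIN = 2: one sum per column (6 values)
colTotals <- apply(myMatrix, 2, sum)   # 10 14 18 17 14 12

print(rowTotals)
print(colTotals)
```

For plain sums and means, base R also offers rowSums(), colSums(), rowMeans(), and colMeans(), which return the same values as the corresponding apply() calls.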
