In the earlier lessons, we learned how to use the `for` loop to iterate over various R objects. R also provides functions for implicit looping, such as `apply`, `lapply`, and `sapply`, which are often easier to use. They belong to a whole family of functions commonly referred to as the ‘Apply’ family.

The `apply` functions are not just about compact code; they can also help with speed. If speed is an issue, such as when working with large data sets or long-running simulations, it is worth avoiding explicit loops (`for` or `while` loops) where you can, because `apply()` and its variants run the loop inside R's own code and can be faster than a hand-written loop, especially one that grows its result step by step.

### The `apply()` Function

The `apply()` function applies a function over one dimension (the rows or the columns) of a data structure. It has the following signature:

```
> args(apply)
function (X, MARGIN, FUN, ...)
```

- `X` is any R structure with dimensions (matrix, data frame, array; not lists or vectors)
- `MARGIN` is the dimension number (1 = rows, 2 = columns)
- `FUN` is the function to apply (for example, `mean()`)
- `...` represents additional arguments passed to `FUN`
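As a minimal sketch of how the additional arguments are forwarded, the example below passes `na.rm = TRUE` through `...` to `mean()`. The matrix `m` and its values are made up for illustration and do not appear elsewhere in this lesson.

```r
# Hypothetical 2 x 3 matrix with one missing value (filled column by column)
# Row 1 holds 1, 3, NA and row 2 holds 2, 4, 6
m <- matrix(c(1, 2, 3, 4, NA, 6), nrow = 2)

# Without na.rm, the mean of a row containing NA is NA
plain_means <- apply(m, 1, mean)

# na.rm = TRUE is not an argument of apply(); it is forwarded to mean()
row_means <- apply(m, 1, mean, na.rm = TRUE)  # row 1: (1 + 3) / 2, row 2: (2 + 4 + 6) / 3
col_means <- apply(m, 2, mean, na.rm = TRUE)  # one mean per column

print(plain_means)
print(row_means)
print(col_means)
```

Any argument that `apply()` does not recognise as its own is handed to `FUN` in the same way.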

### Example

Let’s define a simple matrix `myMatrix`, which we will use to understand the `apply()` function. The built-in `rpois()` function simulates N independent Poisson random variables. For example, `rpois(30, 3)` generates 30 Poisson random numbers with parameter λ = 3. Below, we arrange 30 such numbers into a matrix with 5 rows (and therefore 6 columns). Because the numbers are random, your values will differ from the ones shown here.

```
> myMatrix <- matrix(rpois(30, 3), 5)
> myMatrix
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    3    1    7    4    4    3
[2,]    2    3    5    5    2    3
[3,]    2    5    1    2    2    1
[4,]    3    5    1    4    5    3
[5,]    0    0    4    2    1    2
```

Now let’s take a few examples to understand how we can use the `apply()` function.

Calculate the `sum` of each row.

```
> apply(myMatrix,1,sum)
[1] 22 20 13 21 9
```

The `apply` function applied the function `sum` to each row (specified with `MARGIN = 1`) of the matrix `myMatrix`.

We can do the same column-wise by setting `MARGIN = 2`, as shown below:

```
> apply(myMatrix,2,sum)
[1] 10 14 18 17 14 12
```
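To make the "implicit loop" concrete, here is a rough sketch that computes the same row sums with an explicit `for` loop and compares them with `apply()`. The name `fixedMatrix` is introduced only for this illustration; it holds the values printed above, typed in by hand so that the result is reproducible.

```r
# The matrix printed above, entered column by column
fixedMatrix <- matrix(c(3, 2, 2, 3, 0,
                        1, 3, 5, 5, 0,
                        7, 5, 1, 1, 4,
                        4, 5, 2, 4, 2,
                        4, 2, 2, 5, 1,
                        3, 3, 1, 3, 2), nrow = 5)

# Explicit loop: pre-allocate the result, then fill in one row sum at a time
loop_sums <- numeric(nrow(fixedMatrix))
for (i in seq_len(nrow(fixedMatrix))) {
  loop_sums[i] <- sum(fixedMatrix[i, ])
}

# Implicit loop: apply() visits each row for us
apply_sums <- apply(fixedMatrix, 1, sum)

print(loop_sums)
print(apply_sums)
```

Both vectors hold the row sums shown earlier (22, 20, 13, 21, 9); the `apply()` version simply hides the bookkeeping.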

The following two examples show row minima and column maxima:

```
> apply(myMatrix,1,min) #row minima
[1] 1 2 1 1 0
> apply(myMatrix,2,max) #col maxima
[1] 3 5 7 5 5 3
>
```

The `apply` function can be used with matrices, arrays, and data frames. Other functions in the family work on lists and vectors.
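The examples so far used a matrix, so here is a small sketch with a data frame. The data frame `scores` and its columns `math` and `art` are hypothetical. Note that `apply()` converts a data frame to a matrix first, so this works best when all columns have the same type.

```r
# Hypothetical data frame with two numeric columns
scores <- data.frame(math = c(90, 75, 60),
                     art  = c(70, 85, 95))

# MARGIN = 2: one result per column, named after the columns
col_max <- apply(scores, 2, max)

# MARGIN = 1: one result per row
row_mean <- apply(scores, 1, mean)

print(col_max)
print(row_mean)
```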

### The `lapply()` Function

The analog of `apply()` for lists is `lapply()`. It applies the given function to every element of the specified list.

**Example**

```
> lapply(list(1:5,20:26),median)
[[1]]
[1] 3
[[2]]
[1] 23
>
```

As you can see, the `lapply()` function always returns a list. You can use the `unlist()` function to flatten the list and produce a simple vector.

```
> unlist(lapply(list(1:5,20:26),median))
[1] 3 23
```

We can also use the `apply` family of functions with our own custom functions, as shown below:

```
# The custom oilCost function: oil cost for two gallons
oilCost <- function(x = 0) {
  x * 2
}
# Sample vector containing oil prices for the past 5 days (per gallon)
oil_price <- c(10, 9, 8, 9, 11)
# Use lapply to calculate the two-gallon oil cost for each day
lapply(oil_price, oilCost)
```

The result is a list with the cost of two gallons for each day.

```
> lapply(oil_price,oilCost)
[[1]]
[1] 20
[[2]]
[1] 18
[[3]]
[1] 16
[[4]]
[1] 18
[[5]]
[1] 22
>
```

Let’s say we want our custom function to accept the quantity as an argument (instead of using a fixed quantity of 2). We can modify the function as follows:

```
# The custom oilCost function: oil cost for a specified quantity of gallons
oilCost <- function(x, qty) {
  x * qty
}
# Sample vector containing oil prices for the past 5 days (per gallon)
oil_price <- c(10, 9, 8, 9, 11)
# Use lapply to calculate the three-gallon oil cost for each day
unlist(lapply(oil_price, oilCost, qty = 3))
# Use lapply to calculate the four-gallon oil cost for each day
unlist(lapply(oil_price, oilCost, qty = 4))
```

The function `oilCost` is applied to each element of the vector `oil_price`, and `qty` is supplied as the third argument to `lapply()`, which passes it on to `oilCost`.

**Result:**

```
> # Use lapply to calculate the three-gallon oil cost for each day
> unlist(lapply(oil_price, oilCost, qty = 3))
[1] 30 27 24 27 33
>
> # Use lapply to calculate the four-gallon oil cost for each day
> unlist(lapply(oil_price, oilCost, qty = 4))
[1] 40 36 32 36 44
```
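One way to picture what happens to `qty` is to write the forwarding out by hand. The sketch below compares the call above with an equivalent one that wraps `oilCost` in an anonymous function; the names `via_dots` and `via_wrapper` are introduced here purely for illustration.

```r
oilCost <- function(x, qty) {
  x * qty
}
oil_price <- c(10, 9, 8, 9, 11)

# qty = 3 is forwarded by lapply() to every call of oilCost
via_dots <- unlist(lapply(oil_price, oilCost, qty = 3))

# The same computation with the forwarding spelled out in a wrapper function
via_wrapper <- unlist(lapply(oil_price, function(price) oilCost(price, qty = 3)))

print(via_dots)
print(via_wrapper)
```

The wrapper form is more verbose, but it can be handy when the element being iterated over is not the first argument of the function.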

### The `sapply()` Function

As we saw, the `lapply` function takes a list or a vector but always returns a list. This is because the elements of the input, and therefore of the output, can be of different classes, and only a list can accommodate them. However, there are many scenarios where all the elements of the output are of the same class (e.g. integer), and we would rather get the output as a vector. Earlier we achieved this using the `unlist` function. However, `sapply` is often a better choice, as it does this for us automatically. It takes a list or a vector as input and returns a vector or a matrix when possible. Internally, it calls `lapply` and simplifies the output. The following example shows the difference between `lapply` and `sapply`.

```
> myList <- list(Pois = rpois(10,3), Norm = rnorm(5), Unif = runif(5,1,10))
> myList
$Pois
[1] 7 5 3 3 2 2 0 1 4 4
$Norm
[1] 1.56840563 -0.08843082 0.19965929 0.20645959 0.85374249
$Unif
[1] 8.729636 8.109225 7.884665 9.558795 1.016197
> lapply(myList,mean)
$Pois
[1] 3.1
$Norm
[1] 0.5479672
$Unif
[1] 7.059704
> sapply(myList,mean)
Pois Norm Unif
3.1000000 0.5479672 7.0597036
>
```
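The phrase "a vector or a matrix when possible" can be illustrated with a short sketch. The list `numbers` and its contents are made up for this example; the point is that the shape of the result depends on what the applied function returns for each element.

```r
# Hypothetical input list with two named elements
numbers <- list(a = 1:5, b = c(2, 4, 6, 8))

# One value per element: the result is simplified to a named vector
means <- sapply(numbers, mean)

# Two values per element (min and max): the result is simplified to a matrix
# with one column per list element
ranges <- sapply(numbers, range)

# Results of different lengths cannot be simplified, so a list is returned
evens <- sapply(numbers, function(v) v[v %% 2 == 0])

print(means)
print(ranges)
print(evens)
```

Because the return type can change with the data, it is worth checking what `sapply()` actually gave back before relying on it in later code.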

In this lesson, we saw three functions in the `apply` family: `apply()`, `lapply()`, and `sapply()`. The entire family is made up of the `apply()`, `lapply()`, `sapply()`, `vapply()`, `mapply()`, `rapply()`, and `tapply()` functions.