# Measuring Overall ETFs Performance

We will now plot a graph to show the accumulated returns of the ETFs over a period of time. We can do so by following the following steps:

- Build a dataframe with the 4 ETFs prices and a date column. Calculate daily returns and cumulative returns. The
function returns the cumulative sums (i.e. the sum of all values up to a certain position of a vector).*cumsum()* - Change the format of the data from wide shape to long shape with the
*gather()*function frompackage.*dplyr* - Use
**ggplot2**to graph the ETFs performance in the whole period.

**Step 1: Build a dataframe with the 4 ETFs prices and a date column. Calculate daily returns and cumulative returns.**

```
# Make the cumRets dataframe with the cumulative returns for each of the ETFs using cumsum() function
# The dailyReturn is a built in function from quantmod to get the daily returns.
cumRets <- data.frame(date = index(SPY),
cumsum(dailyReturn(SPY) * 100),
cumsum(dailyReturn(IVV) * 100),
cumsum(dailyReturn(QQQ)* 100),
cumsum(dailyReturn(IWF))* 100)
# Add new names to the columns of cumRets dataframe
colnames(cumRets)[-1] <- etfs
head(cumRets,10)
date SPY IVV QQQ IWF
1 2014-02-03 -2.1352 -2.1233 -2.1136 -2.2921
2 2014-02-04 -1.4347 -1.4382 -1.3780 -1.3709
3 2014-02-05 -1.5602 -1.5686 -1.6371 -1.6752
4 2014-02-06 -0.2414 -0.2345 -0.3619 -0.2958
5 2014-02-07 0.9981 1.0709 1.4220 1.1491
6 2014-02-10 1.1818 1.2092 1.9947 1.4221
7 2014-02-11 2.2762 2.3137 3.1337 2.4401
8 2014-02-12 2.3256 2.3957 3.3251 2.6393
9 2014-02-13 2.8419 2.8814 4.0669 3.3410
10 2014-02-14 3.3938 3.4137 4.2677 3.5616
```

**Step 2: Change the format of the data from wide shape to long shape with the gather function from dplyr package.**

The data is currently wide-shaped because each date’s data is wide. For better analysis, We want the data to be long, where each date of data is in a separate observation.

```
# Make a tidy dataset called longCumRets which group the returns of each ETF in the same column.
# The second and third parameter of the gather function are the new columns names in the tidy dataset.
# Note: You need to install the tidyr package to use the gather function. Use install.packages('tidyr') for installing and library(tidyr) to load it.
longCumRets <- gather(cumRets,symbol,cumReturns,etfs)
head(longCumRets,10)
date symbol cumReturns
1 2014-02-03 SPY -2.1352
2 2014-02-04 SPY -1.4347
3 2014-02-05 SPY -1.5602
4 2014-02-06 SPY -0.2414
5 2014-02-07 SPY 0.9981
6 2014-02-10 SPY 1.1818
7 2014-02-11 SPY 2.2762
8 2014-02-12 SPY 2.3256
9 2014-02-13 SPY 2.8419
10 2014-02-14 SPY 3.3938
```

**Step 3: Use ggplot2 to graph the ETFs performance in the whole period.**

```
# Plot the performance of the ETF's indexes
ggplot(longCumRets, aes(x=date,y=cumReturns,color = symbol)) + geom_line()+ ggtitle("ETF’s Accumulated Returns")
```

As we can observe in the graph, all ETFs are correlated, which means that they are affected by similar drivers. QQQ has the greatest performance in the whole period while SPY and IVV have the lowest performance. At the end of 2018 a big drawdown has affected all ETFs performance.

**A note about reshaping data:**

There are times when our data is considered unstacked and a common attribute of concern is spread out across columns. To reformat the data such that these common attributes are gathered together as a single variable, the gather() function will take multiple columns and collapse them into key-value pairs, duplicating all other columns as needed.

### Measuring Overall ETFs Performance - Yearly Breakdown

To go deeper in analyzing ETFs performance during the period, we will make a breakdown about ETF performance by year. With this picture we can gain an insight into which ETF has the greater and poor performance by year. This can be achieved by following the following steps:

**Step 1:**Transform thedataframe from wide to long creating the*returns*dataframe. With this shape, we would have two new columns that are symbol and returns, which store the daily returns for each symbol in long format.*longrets***Step 2:**Make a new column inwith only the year from the index of returns*longrets**.***Step 3:**Calculate the mean of returns for each group, where the group is composed of a particular symbol-year.

```
# Transform the returns dataframe from wide to long creating the longrets dataframe.
longrets <- gather(returns,symbol,returns,etfs)
head(longrets)
symbol returns
1 SPY -2.1352
2 SPY 0.7005
3 SPY -0.1254
4 SPY 1.3187
5 SPY 1.2396
6 SPY 0.1837
# Take only the years from the index names of returns that have the dates
longrets$year <- substr(rownames(returns),1,4)
head(longrets)
symbol returns year
1 SPY -2.1352 2014
2 SPY 0.7005 2014
3 SPY -0.1254 2014
4 SPY 1.3187 2014
5 SPY 1.2396 2014
6 SPY 0.1837 2014
# The aggregate function can split a data frame columns by groups and then
# apply a function to these groups. In our case the column that we are
# interested to split is the returns column and the groups are symbol and year.
# Finally we will apply the mean function to each of these groups.
groupedReturns <- aggregate(longrets$returns, list(symbol=longrets$symbol,year=longrets$year), mean)
groupedReturns
symbol year x
1 IVV 2014 0.065175325
2 IWF 2014 0.062363203
3 QQQ 2014 0.082309524
4 SPY 2014 0.064810823
5 IVV 2015 0.001005556
6 IWF 2015 0.020585714
7 QQQ 2015 0.038149603
8 SPY 2015 0.001565079
9 IVV 2016 0.040590079
10 IWF 2016 0.024551190
11 QQQ 2016 0.028030159
12 SPY 2016 0.039951587
13 IVV 2017 0.071888048
14 IWF 2017 0.100740637
15 QQQ 2017 0.111167729
16 SPY 2017 0.071519124
17 IVV 2018 -0.020522311
18 IWF 2018 -0.003539841
19 QQQ 2018 0.006568924
20 SPY 2018 -0.020306375
21 IVV 2019 0.088256376
22 IWF 2019 0.111604027
23 QQQ 2019 0.112140940
24 SPY 2019 0.088615436
```

The ** groupedReturns** dataframe provides useful insights about the mean returns of each ETF by year. We can see that the best symbol-year was QQQ 2019 with mean returns of 11%, while the poor symbol-year performance was IVV 2018 with a mean return of -2%. It is very appealing to show these insights graphically.

```
annReturnBars <- ggplot(groupedReturns,aes(x=symbol,y=x)) +
geom_bar(stat="identity",aes(fill=symbol))+
facet_wrap(~year,ncol=2) +
theme(legend.position = "right") + ggtitle("ETF mean returns by Year")
annReturnBars
```

In this bar plot chart we can observe that the QQQ ETF has the highest performance in all years, except for 2016 where the most performant was IVV ETF. Also we can observe that 2018 was the worst year of the period, where only QQQ has slightly positive returns. On the other hand, 2017 were a good year for all ETFs, and up to this moment 2019 is a good year too.

In the next section, we would start with the quantstrat package that provides an infrastructure to build, backtest and analyze trading strategies.

#### Course Downloads

Get smart about tech at work.

As a non-technical professional, learn how software works with simple explanations of tech concepts. Learn more...

- Introduction to Quantitative Trading
- Quantitative Trading - Advantages and Disadvantages
- Types of Quantitative Trading Strategies
- Momentum Strategies
- Mean Reversion Strategies
- Market Making Strategies and Day Trading Strategies
- How to Generate Trading Ideas
- Designing A Trading Strategy For Profit
- Backtesting a Trading Strategy - Considerations
- Risk Management of a Trading Strategy
- Risk Indicators - VIX Index and TED Spread
- Plotting the VIX Index and TED Spread in R
- Introduction to Quantmod in R
- Downloading Data Using Quantmod Package in R
- Creating Charts with Quantmod
- Data Analysis with Quantmod in R
- Measuring Overall ETFs Performance
- Quantstrat Example in R - EMA Crossover Strategy
- Quantstrat - EMA Crossover Strategy - Performance and Risk Metrics
- Quantstrat Example in R - RSI Strategy
- Quantstrat Case Study - Multiple Symbol Portfolio

# Data Science for Finance Bundle: 43% OFF

**Data Science for Finance Bundle**for just $29 $51.