Exploring Time Series Data in R

Let's look at a few commands that we will frequently use while exploring time series data.

length()

The length() function tells us the number of elements in our time series dataset.

> length(msft_ts)
[1] 252
>

head()

The head() function displays the first n elements of the dataset. This is useful for a quick look at the start of a large dataset.

> head(msft_ts,n=10)
 [1] 54.80 55.05 54.05 52.17 52.33 52.30 52.78 51.64 53.11 50.99
>

tail()

The tail() function displays the last n elements of the dataset. This is useful for checking the most recent observations in a large dataset.

> tail(msft_ts,n=10)
 [1] 62.30 63.62 63.54 63.54 63.55 63.24 63.28 62.99 62.90 62.14
>
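As a minimal sketch of the three functions together, the snippet below builds a small stand-in series from the ten values printed by head() above. The name prices is hypothetical and only used for illustration; it is not the full msft_ts dataset.

```r
# Hypothetical stand-in for msft_ts: the ten values printed by head() above
prices <- ts(c(54.80, 55.05, 54.05, 52.17, 52.33,
               52.30, 52.78, 51.64, 53.11, 50.99))

length(prices)        # number of observations: 10
head(prices, n = 3)   # first three values: 54.80 55.05 54.05
tail(prices, n = 3)   # last three values: 51.64 53.11 50.99
head(prices)          # when n is omitted, it defaults to 6
```

Note that n is optional in both head() and tail(); leaving it out returns six elements.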

Example 2 - Quarterly GDP Data

Let's take one more example, this time with quarterly GDP data, which we will load from Quandl. The following command loads the quarterly GDP data for the years 2014 to 2016 as a time series object (type="ts").

> GDP_data = Quandl("FRED/GDP", start_date="2014-01-01", end_date="2016-12-31",type="ts")

Let's explore this data.

print()

As we saw earlier, we can use the print() function to display the time series.

> print(GDP_data)
        Qtr1    Qtr2    Qtr3    Qtr4
2014 17025.2 17285.6 17569.4 17692.2
2015 17783.6 17998.3 18141.9 18222.8
2016 18281.6 18450.1 18675.3 18869.4
>

As we can see, the data is laid out with one row per year and four observations (one per quarter) in each year.
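If Quandl is not available, an equivalent object can be built by hand from the values printed above. This is only a sketch: the name GDP_manual is hypothetical, and it assumes the series starts in the first quarter of 2014 with four observations per year, as the printed table suggests.

```r
# Values copied from the print(GDP_data) output above
gdp_values <- c(17025.2, 17285.6, 17569.4, 17692.2,
                17783.6, 17998.3, 18141.9, 18222.8,
                18281.6, 18450.1, 18675.3, 18869.4)

# start = c(2014, 1) means year 2014, first quarter;
# frequency = 4 means four observations per year
GDP_manual <- ts(gdp_values, start = c(2014, 1), frequency = 4)

print(GDP_manual)
```

Printing GDP_manual should give the same year-by-quarter layout, because print() chooses that layout from the frequency of the series.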

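start() and end()

These functions return the time of the first and the last observation. For this dataset, each result is a pair: the year, followed by the quarter within that year.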

> start(GDP_data)
[1] 2014    1
> end(GDP_data)
[1] 2016    4
>

frequency()

This tells us the number of observations per unit of time.

> frequency(GDP_data)
[1] 4
>

The data is quarterly, so the lag between successive observations is 1 quarter. The dataset GDP_data has been set up so that the unit of time is 1 year (frequency=4).
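To see how the unit of time and the frequency interact, here is a small sketch with a monthly series. The series monthly and its placeholder values 1 to 24 are made up for illustration.

```r
# Hypothetical monthly series: 24 placeholder values starting January 2015
monthly <- ts(1:24, start = c(2015, 1), frequency = 12)

frequency(monthly)   # 12 observations per year
start(monthly)       # 2015 1  (year 2015, month 1)
end(monthly)         # 2016 12 (year 2016, month 12)
```

The unit of time is still one year; only the number of observations within that unit changes, from 4 for quarterly data to 12 for monthly data.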

deltat()

The deltat() function returns the time interval between successive observations, expressed as a fraction of the unit of time; e.g., 1/12 for monthly data and 1/4 for quarterly data. It is related to the frequency by the formula ∆t = 1/frequency. When creating a time series with ts(), only one of frequency or deltat should be provided.

> deltat(GDP_data)
[1] 0.25
>
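The relationship between the two settings can be checked with a short sketch. The two series below use placeholder values 1 to 8 and hypothetical names; one is created with frequency and the other with deltat.

```r
# Two hypothetical quarterly series with placeholder values,
# one specified through frequency and one through deltat
by_frequency <- ts(1:8, start = c(2014, 1), frequency = 4)
by_deltat    <- ts(1:8, start = c(2014, 1), deltat = 1/4)

deltat(by_frequency)      # 0.25
frequency(by_deltat)      # 4

# deltat is the reciprocal of frequency
all.equal(deltat(by_frequency), 1 / frequency(by_frequency))
```

Either argument describes the same spacing, which is why ts() expects only one of them.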

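time() and cycle()

The time() function returns a time series of the times at which the observations were taken, and cycle() returns the position of each observation within its cycle (its "season"). For quarterly data, that position is the quarter number, 1 to 4.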

> time(GDP_data)
        Qtr1    Qtr2    Qtr3    Qtr4
2014 2014.00 2014.25 2014.50 2014.75
2015 2015.00 2015.25 2015.50 2015.75
2016 2016.00 2016.25 2016.50 2016.75
>
> cycle(GDP_data)
     Qtr1 Qtr2 Qtr3 Qtr4
2014    1    2    3    4
2015    1    2    3    4
2016    1    2    3    4
>
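One common use of cycle() is to group or select observations by season. The sketch below rebuilds the GDP series from the printed values (the name gdp is hypothetical) and uses cycle() to pick out the first-quarter values and to average each quarter across years.

```r
# Series rebuilt by hand from the print(GDP_data) output above
gdp <- ts(c(17025.2, 17285.6, 17569.4, 17692.2,
            17783.6, 17998.3, 18141.9, 18222.8,
            18281.6, 18450.1, 18675.3, 18869.4),
          start = c(2014, 1), frequency = 4)

# Keep only the observations that fall in the first quarter
q1 <- gdp[cycle(gdp) == 1]
q1   # 17025.2 17783.6 18281.6

# Average of each quarter across the three years
quarter_means <- tapply(as.numeric(gdp), cycle(gdp), mean)
quarter_means
```

The result of tapply() is named by quarter number, so quarter_means[["1"]] is the mean of the three first-quarter values.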