By now we have a strong foundational understanding of various concepts essential for time series analysis. The rest of the course will focus on the following:
- A theoretical understanding of the important time series models (white noise, autoregressive (AR), moving average (MA), and ARMA).
- The ARIMA model and how various time series processes can be explained by ARIMA.
- Simulating and estimating these time series models in R.
- Box-Jenkins (B-J) methodology for time series forecasting.
- A comprehensive case study for time series analysis in R (With code and plots).
Time Series Models
The objective of the following text is to provide a theoretical understanding of the time series models. It can be a bit complex to grasp, however, as we move to practical implementation in the next lessons, things will start to make more sense.
White noise is the simplest example of a stationary process and is also the foundation of the other models we will discuss. A white noise process is a sequence of random variables that are uncorrelated, have mean zero, and have a finite variance. In other words, a series is called white noise if it is purely random in nature. The random noise term is denoted by εt.
Plots of white noise series exhibit a very erratic, jumpy, unpredictable behavior. Since the εt are uncorrelated, previous values do not help us to forecast future values. White noise series themselves are quite uninteresting from a forecasting standpoint (they are not linearly forecastable), but they form the building blocks for more general models.
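The practical lessons later in this course use R, but the defining properties of white noise are easy to verify in a short sketch. Here is a minimal illustration in Python with NumPy (the seed and sample size are arbitrary choices for the example):

```python
import numpy as np

# Simulate a white noise series: uncorrelated draws with mean 0 and finite variance.
rng = np.random.default_rng(seed=42)
n = 1000
eps = rng.normal(loc=0.0, scale=1.0, size=n)  # epsilon_t ~ N(0, 1)

# The sample mean should be near 0 and the sample variance near 1.
print(round(eps.mean(), 2), round(eps.var(), 2))

# The lag-1 sample autocorrelation should be near 0: past values
# carry no linear information about future values.
lag1 = np.corrcoef(eps[:-1], eps[1:])[0, 1]
print(abs(lag1) < 0.15)
```

Because the εt are uncorrelated, every lagged autocorrelation estimate should hover near zero, which is exactly why a white noise series is not linearly forecastable.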
Simple Time Series Models
A simple trend model can be expressed as follows:
yt = b0 + b1t + εt
- b0 = the y-intercept; the value of the series at t = 0.
- b1 = the slope coefficient of the time trend.
- t = the time period.
- ŷt = the estimated value for time t based on the model.
- εt = the random error of the time trend.
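A simple trend model is just an ordinary least-squares regression of the series on time. As a quick sketch (again in Python rather than R; the true coefficient values below are arbitrary choices for the simulation), we can generate data from the model and recover b0 and b1:

```python
import numpy as np

# Simulate y_t = b0 + b1 * t + eps_t with illustrative parameter values.
rng = np.random.default_rng(seed=0)
t = np.arange(100)
b0_true, b1_true = 5.0, 0.5
y = b0_true + b1_true * t + rng.normal(scale=2.0, size=t.size)

# np.polyfit with deg=1 performs an OLS fit and returns [slope, intercept].
b1_hat, b0_hat = np.polyfit(t, y, deg=1)
print(round(b0_hat, 1), round(b1_hat, 2))
```

With enough observations and well-behaved errors, the estimates land close to the true b0 and b1; the pitfall described next is what happens when the errors are not well behaved.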
The big validity pitfall for simple trend models is serial correlation; if this problem is present, you will see an artificially high R2, and your slope coefficient may falsely appear to be significant.
Serial correlation can be detected visually with a correlogram, or formally with a Durbin-Watson test.
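The Durbin-Watson statistic is simple enough to compute by hand: it is the sum of squared changes in the residuals divided by the sum of squared residuals. Values near 2 indicate no first-order serial correlation, while values toward 0 indicate positive serial correlation. A sketch (my own helper function, not a library API):

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic: ~2 means no first-order serial
    correlation; values well below 2 indicate positive correlation."""
    diff = np.diff(residuals)
    return np.sum(diff ** 2) / np.sum(residuals ** 2)

rng = np.random.default_rng(seed=1)

# Uncorrelated residuals -> statistic near 2.
white = rng.normal(size=500)
print(round(durbin_watson(white), 1))

# Positively autocorrelated residuals -> statistic well below 2.
ar = np.zeros(500)
for i in range(1, 500):
    ar[i] = 0.8 * ar[i - 1] + rng.normal()
print(durbin_watson(ar) < 1.0)
```

In practice you would run this test on the residuals of the fitted trend model, not on the raw series.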
Autoregressive (AR) Models
This type of time series model uses a lagged observation (the value from a prior period) as the independent variable to predict the dependent variable, which is the value in the next time period.
xt = b0 + b1xt-1 + εt
There can be more than one lagged independent variable. The model assumes that xt depends only on its own past values, say p of them – AR(p). So AR(1) is a first-order autoregression, meaning that the current value is based on the immediately preceding value; an AR(2) process bases the current value on the previous two values.
A random walk is an example of a non-stationary process. It has no constant mean or variance; however, it exhibits very strong dependence over time, with each observation closely related to its immediate past observations. The changes in value (the increments), on the other hand, follow a white noise process, which is stationary.
A random walk is the special case of an AR time series model where the predicted value is expected to equal the previous period's value plus a random error:
xt = b0 + xt-1 + εt
When b0 is not equal to zero, the model is a random walk with drift; in either case, the defining characteristic is b1 = 1. εt is mean-zero white noise, so the expected value of the error term is zero.
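The two claims above, that the levels of a random walk are strongly dependent while the increments are white noise, can be checked directly. A sketch (zero-drift case, b0 = 0):

```python
import numpy as np

# Random walk with zero drift: x_t = x_{t-1} + eps_t  (b0 = 0, b1 = 1).
rng = np.random.default_rng(seed=3)
eps = rng.normal(size=2000)
x = np.cumsum(eps)  # each level is the accumulated sum of all past shocks

# The increments x_t - x_{t-1} recover the white noise shocks exactly ...
increments = np.diff(x)
print(np.allclose(increments, eps[1:]))

# ... so the differenced series is stationary and serially uncorrelated.
lag1 = np.corrcoef(increments[:-1], increments[1:])[0, 1]
print(abs(lag1) < 0.15)
```

This is the intuition behind first differencing, discussed in the points below: differencing a random walk removes the accumulated dependence and leaves only the stationary shocks.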
Some important points about the random walk process (advanced material; not mandatory):
- A random walk has no finite mean-reverting level and is not covariance stationary; the technique of first differencing is frequently used to transform such an AR(1) model into a series that is covariance stationary.
- If an AR time series is covariance stationary, then the serial correlations for the lag variables are insignificant or they rapidly drop to zero as the number of time period lags rises.
- When the lag coefficient is not statistically different from 1, a unit root exists. The Dickey-Fuller test is applied to an AR(1) model to test for a unit root. If a unit root is present, the model is not covariance stationary; in that case, the series must be transformed (for example, by first differencing) before re-modeling.
When to Use Autoregressive Models?
If serial correlation exists in a simple time series model, the analyst can create an auto-regressive time series with the sample data, where the independent variable is a lagged (prior period) value. The AR model is appropriate where the prior period value is the best predictor for the future period dependent variable value.
Moving Average Models
A moving average model is one in which xt depends only on random error terms that follow a white noise process. A simple moving average series xt is generated from a white noise series εt by the rule:
xt = εt + βεt-1
Rather than use past values of the forecast variable in a regression, a moving average model uses past forecast errors in a regression-like model.
The above equation is an example of a first-order moving average model, written MA(1), since only one lagged error term appears. The general form is MA(q), where xt depends on q past error terms.
Note that unless β = 0, xt will have a nontrivial correlation structure (in particular, correlation at lag 1).
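That correlation structure is the signature of an MA(1) process: the lag-1 autocorrelation equals β / (1 + β²), and all higher-lag autocorrelations are zero (the so-called cut-off). A sketch with an arbitrary illustrative β:

```python
import numpy as np

# MA(1): x_t = eps_t + beta * eps_{t-1}, built purely from white noise shocks.
rng = np.random.default_rng(seed=11)
beta, n = 0.7, 5000
eps = rng.normal(size=n)
x = eps[1:] + beta * eps[:-1]   # current shock plus a weighted prior shock

# Theory: lag-1 autocorrelation = beta / (1 + beta^2), lag-2 = 0.
lag1 = np.corrcoef(x[:-1], x[1:])[0, 1]
lag2 = np.corrcoef(x[:-2], x[2:])[0, 1]
print(round(lag1, 2), abs(lag2) < 0.1)
```

This sharp cut-off in the correlogram after lag q is what analysts look for when identifying an MA(q) model in the Box-Jenkins methodology covered later.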