We now have a fair idea about how we can use ARIMA modelling in R to estimate and forecast a time series.
This approach is also called the Box–Jenkins method, named after the statisticians George Box and Gwilym Jenkins. It applies autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA) models to find the best fit of a time-series model to past values of a time series.
Box-Jenkins models are quite flexible due to the inclusion of both autoregressive and moving average terms. The Box-Jenkins model assumes that the time series is stationary. Box and Jenkins recommend differencing non-stationary series one or more times to achieve stationarity. Doing so produces an ARIMA model, with the “I” standing for “Integrated”.
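As a rough illustration of differencing "one or more times", the sketch below uses base R's `diff()` on made-up data. The random walk, the seed, and the variable names are assumptions for illustration only.

```r
set.seed(10)  # arbitrary seed, only for reproducibility

# Hypothetical example data: white noise and series built by summing it
e  <- rnorm(200)
x1 <- cumsum(e)           # random walk: non-stationary, needs one difference
x2 <- cumsum(cumsum(e))   # summed twice: needs two differences

d1 <- diff(x1)                    # first difference of x1
d2 <- diff(x2, differences = 2)   # second difference of x2

# Differencing undoes the summation, so the noise is recovered
all.equal(d1, e[-1])
all.equal(d2, e[-(1:2)])
```

Each round of differencing shortens the series by one observation, which is why the comparisons drop the first one or two values of `e`.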
We will now go through all the steps involved in ARIMA modelling.
Understanding the abbreviations:
- When a model only involves autoregressive terms it may be referred to as an AR model. When a model only involves moving average terms, it may be referred to as an MA model.
- When no differencing is involved, the abbreviation ARMA may be used.
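To make the abbreviations concrete, here is a minimal sketch that simulates one series of each type with `arima.sim()` from base R's stats package. The coefficient values (0.7 and 0.5), the series length, and the seed are arbitrary assumptions.

```r
set.seed(123)  # arbitrary seed, only for reproducibility
n <- 200       # assumed series length

# AR model: autoregressive terms only
ar_series <- arima.sim(model = list(ar = 0.7), n = n)

# MA model: moving average terms only
ma_series <- arima.sim(model = list(ma = 0.5), n = n)

# ARMA model: both kinds of terms, no differencing
arma_series <- arima.sim(model = list(ar = 0.7, ma = 0.5), n = n)

# Plot the three simulated series one above the other
plot(cbind(AR = ar_series, MA = ma_series, ARMA = arma_series),
     main = "Simulated AR, MA and ARMA series")
```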
The examples we have considered are all for non-seasonal data. Box-Jenkins models can be extended to include seasonal autoregressive and seasonal moving average terms. Although this complicates the notation and mathematics of the model, the underlying concepts for seasonal autoregressive and seasonal moving average terms are similar to the non-seasonal autoregressive and moving average terms.
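For readers curious what the seasonal extension looks like in practice, the sketch below passes a `seasonal` argument to base R's `arima()`. The choice of the built-in `AirPassengers` data, the log transform, and the particular orders are assumptions made for illustration, not a recommended specification.

```r
# AirPassengers is a monthly series shipped with base R (datasets package)
fit_seasonal <- arima(
  log(AirPassengers),
  order    = c(0, 1, 1),                              # non-seasonal part
  seasonal = list(order = c(0, 1, 1), period = 12)    # seasonal part, 12 months
)

# "ma1" is the ordinary MA term, "sma1" the seasonal MA term
names(coef(fit_seasonal))
```

A seasonal AR term would be requested the same way, by setting the first element of the seasonal order to 1.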
Specifying the Elements of the Model
In an ARIMA model, the elements are specified in the order (AR order, differencing, MA order), conventionally written as (p, d, q).
- A model with one AR term would be specified as an ARIMA of order (1,0,0).
- A model with two AR terms would be specified as an ARIMA of order (2,0,0).
- A model with two MA terms (MA(2)) would be specified as an ARIMA of order (0,0,2).
- A model with one AR term, a first difference, and one MA term would be specified as ARIMA of order (1,1,1).
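In R, these orders map onto the `order` argument of `arima()` from the stats package. The following is a minimal sketch on simulated data; the series `x` and `y`, the coefficient values, and the seed are hypothetical and chosen only so the code runs.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Hypothetical data: a stationary ARMA(1,1) series and its cumulative sum
x <- as.numeric(arima.sim(model = list(ar = 0.6, ma = 0.3), n = 300))
y <- cumsum(x)  # integrated once, so one difference recovers x

fit_ar1 <- arima(x, order = c(1, 0, 0))  # ARIMA(1,0,0): one AR term
fit_ar2 <- arima(x, order = c(2, 0, 0))  # ARIMA(2,0,0): two AR terms
fit_ma2 <- arima(x, order = c(0, 0, 2))  # ARIMA(0,0,2): two MA terms
fit_111 <- arima(y, order = c(1, 1, 1))  # ARIMA(1,1,1): AR, first difference, MA

# The coefficient names show which terms each fitted model contains
names(coef(fit_ar2))
names(coef(fit_ma2))
names(coef(fit_111))
```

Models without differencing also report an `intercept` (the estimated mean), while `arima()` omits it once differencing is applied.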
In the last model, ARIMA(1,1,1), we are applying a model with one AR term and one MA term to the variable $z_t = x_t - x_{t-1}$. This is the first difference, which is used to account for a linear trend in the data.
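The effect of the first difference can be checked directly. The sketch below uses made-up numbers: first a purely linear trend, then a simulated series to compare fitting ARIMA(1,1,1) to the original data with fitting an ARMA(1,1) to the differenced data. All names, values, and the seed are assumptions.

```r
# A purely linear trend: x_t = 2 + 0.5 * t (made-up numbers)
time_index <- 1:10
x_trend <- 2 + 0.5 * time_index

# First difference z_t = x_t - x_{t-1}: the trend collapses to its slope
z <- diff(x_trend)
z

# The same difference written out by hand
z_manual <- x_trend[-1] - x_trend[-length(x_trend)]
all.equal(z, z_manual)

# Hypothetical integrated series for comparing the two ways of fitting
set.seed(1)
x <- cumsum(as.numeric(arima.sim(model = list(ar = 0.6, ma = 0.3), n = 300)))

fit_on_x <- arima(x, order = c(1, 1, 1))                     # differencing done by arima()
fit_on_z <- arima(diff(x), order = c(1, 0, 1),
                  include.mean = FALSE)                      # differencing done by hand

# The two sets of estimates should be close, though not necessarily identical
rbind(on_x = coef(fit_on_x), on_z = coef(fit_on_z))
```

Every element of `z` equals the slope 0.5, which is why one difference is enough to remove a linear trend.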