ARIMA Modelling - Identify Model for a Time Series

The first step is to identify a possible model for a given time series. To do so, we need three things: a time series plot of the data, the ACF (autocorrelation function) plot, and the PACF (partial autocorrelation function) plot. Analyzing these three plots together helps us identify a suitable candidate model.

Observing the Time Series Plot

  • The very first thing that we should do with our data is to create a time series plot and look for patterns such as trends, seasonality, outliers, and non-constant variance.
  • If we spot an obvious uptrend or downtrend, we may need to consider first-order differencing. If the time series exhibits a quadratic trend, we may want to use second-order differencing.
  • Generally, we don't go beyond two levels of differencing, because further differencing risks overdifferencing the series.
  • If the series is still not stationary after two levels of differencing, we may want to consider smoothing instead.
  • If the data exhibits a curved upward trend along with non-constant variance, such as variance that increases over time, we can transform the series with a logarithm or a square root. Series with non-constant variance may also call for more advanced models such as ARCH (not covered in this course).
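
As a minimal sketch of the transformations described above, the following Python snippet applies first- and second-order differencing and a log transform using only the standard library. The helper name `difference` and the toy series are assumptions made for illustration; they are not part of the course material.

```python
import math

def difference(series, order=1):
    """Return the series differenced `order` times (x[t] - x[t-1] on each pass)."""
    result = list(series)
    for _ in range(order):
        result = [b - a for a, b in zip(result, result[1:])]
    return result

# Toy data (illustrative only): a linear trend and a quadratic trend.
linear = [3 + 2 * t for t in range(8)]
quadratic = [t * t for t in range(8)]

first_diff = difference(linear, order=1)      # constant values: linear trend removed
second_diff = difference(quadratic, order=2)  # constant values: quadratic trend removed

# Toy exponential growth: taking logs turns it into a linear trend,
# which a single difference then removes.
growth = [100 * 1.1 ** t for t in range(8)]
log_diff = difference([math.log(x) for x in growth], order=1)

print(first_diff)
print(second_diff)
print([round(d, 4) for d in log_diff])
```

Each pass of differencing shortens the series by one observation, which is one practical reason to stop at two levels.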

ACF and PACF Plots

We should consider ACF and PACF plots together to identify the order (i.e., the p and q) of the autoregressive and moving average terms.

Important Note: If the ACF and PACF do not tail off, but instead have values that stay close to 1 over many lags, the series is non-stationary and differencing will be needed. Try a first difference and then look at the ACF and PACF of the differenced data.
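
To illustrate the note above, here is a small sketch that computes the sample ACF of a simulated random walk before and after a first difference. The function `sample_acf`, the seed, and the simulated data are assumptions chosen for the example, and only the ACF is shown (not the PACF).

```python
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag (the usual estimator that divides by n)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]

random.seed(0)  # fixed seed so the run is repeatable

# Simulated random walk (illustrative data): non-stationary by construction.
walk = [0.0]
for _ in range(499):
    walk.append(walk[-1] + random.gauss(0, 1))

# First difference of the walk recovers the underlying white noise.
diffed = [b - a for a, b in zip(walk, walk[1:])]

acf_walk = sample_acf(walk, 10)
acf_diff = sample_acf(diffed, 10)

print("raw series, lags 1-3: ", [round(r, 2) for r in acf_walk[1:4]])
print("differenced, lags 1-3:", [round(r, 2) for r in acf_diff[1:4]])
```

For the raw series we would expect the autocorrelations to stay close to 1 across the first few lags, while the differenced series should show values near 0.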

Identification of an MA model is often best done with the ACF rather than the PACF.

  • For an MA model, the theoretical PACF does not shut off, but instead tapers toward 0 in some manner. A clearer pattern for an MA model is in the ACF. The ACF will have non-zero autocorrelations only at lags involved in the model.
  • A sample ACF with a significant autocorrelation only at lag 1 is an indicator of a possible MA(1) model.
  • A sample ACF with significant autocorrelations at lags 1 and 2, but non-significant autocorrelations for higher lags indicates a possible MA(2) model.
  • A property of MA(q) models in general is that there are non-zero autocorrelations for the first q lags and autocorrelations = 0 for all lags > q.
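
The cut-off property can be checked directly from the theoretical ACF of an MA(q) process. The sketch below assumes the sign convention x_t = w_t + theta_1 w_{t-1} + ... + theta_q w_{t-q} (some texts write the MA terms with minus signs); the function name `ma_theoretical_acf` and the coefficient values are illustrative.

```python
def ma_theoretical_acf(thetas, max_lag):
    """Theoretical ACF of x_t = w_t + theta_1 w_{t-1} + ... + theta_q w_{t-q}."""
    coeffs = [1.0] + list(thetas)        # theta_0 = 1 by convention
    q = len(thetas)
    gamma0 = sum(c * c for c in coeffs)  # variance, up to the noise variance factor
    acf = []
    for k in range(max_lag + 1):
        if k > q:
            acf.append(0.0)              # autocorrelation is exactly 0 beyond lag q
        else:
            gamma_k = sum(coeffs[i] * coeffs[i + k] for i in range(q - k + 1))
            acf.append(gamma_k / gamma0)
    return acf

ma1 = ma_theoretical_acf([0.5], max_lag=4)       # MA(1): non-zero only at lag 1
ma2 = ma_theoretical_acf([0.5, 0.3], max_lag=4)  # MA(2): non-zero at lags 1 and 2

print([round(r, 3) for r in ma1])
print([round(r, 3) for r in ma2])
```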

So, to identify the order of a moving average process, we examine the sample autocorrelation function to see the lag beyond which it is essentially zero.
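
In a sample, "essentially zero" is usually judged against approximate 95% limits of ±1.96/√n, the limits that apply under white noise. The following sketch simulates an MA(1) series and flags which lags fall outside those limits. The coefficient, seed, and sample size are assumptions for illustration.

```python
import math
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag (the usual estimator that divides by n)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]

random.seed(1)  # fixed seed so the run is repeatable
n = 2000
theta = 0.8     # illustrative MA(1) coefficient

# Simulate x_t = w_t + theta * w_{t-1} from Gaussian white noise.
noise = [random.gauss(0, 1) for _ in range(n + 1)]
ma1 = [noise[t] + theta * noise[t - 1] for t in range(1, n + 1)]

acf = sample_acf(ma1, 8)
bound = 1.96 / math.sqrt(n)  # approximate 95% limits under white noise

for lag in range(1, 9):
    flag = "significant" if abs(acf[lag]) > bound else "within bounds"
    print(f"lag {lag}: {acf[lag]: .3f}  {flag}")
```

Because the limits are only approximate, roughly one lag in twenty can fall outside them by chance, so an isolated small spike at a high lag is not by itself evidence of a higher-order model.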

Identification of an AR model is often best done with the PACF.

For an AR model, the theoretical PACF “shuts off” past the order of the model. The phrase “shuts off” means that in theory the partial autocorrelations are equal to 0 beyond that point. Put another way, the number of non-zero partial autocorrelations gives the order of the AR model. By the “order of the model” we mean the most extreme lag of x that is used as a predictor.
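
The shut-off can be seen by converting a theoretical ACF into partial autocorrelations. The sketch below uses the Durbin-Levinson recursion, one standard way to do this; the function names and the AR(2) coefficients are assumptions for illustration.

```python
def pacf_from_acf(acf):
    """Partial autocorrelations for lags 1..len(acf)-1 via Durbin-Levinson.

    `acf` must start with the lag-0 value 1.0.
    """
    pacf = []
    prev = []  # coefficients phi_{k-1,1}..phi_{k-1,k-1} from the previous step
    for k in range(1, len(acf)):
        num = acf[k] - sum(prev[j] * acf[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * acf[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lower-order coefficients, then append the new one.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

def ar2_theoretical_acf(phi1, phi2, max_lag):
    """Theoretical ACF of a stationary AR(2): x_t = phi1 x_{t-1} + phi2 x_{t-2} + w_t."""
    acf = [1.0, phi1 / (1.0 - phi2)]
    for k in range(2, max_lag + 1):
        acf.append(phi1 * acf[k - 1] + phi2 * acf[k - 2])
    return acf[:max_lag + 1]

acf = ar2_theoretical_acf(0.5, 0.3, 6)  # illustrative coefficients
pacf = pacf_from_acf(acf)

print("ACF :", [round(r, 3) for r in acf[1:]])
print("PACF:", [round(p, 3) for p in pacf])
```

Here the ACF decays gradually, while the PACF is non-zero at lags 1 and 2 and zero (up to floating-point error) afterwards, which matches an order of 2.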

Summary: For AR models, the ACF tails off gradually (for example, decaying exponentially), and the PACF plot is used to identify the order (p) of the AR model. For MA models, the PACF tails off gradually, and the ACF plot is used to identify the order (q) of the MA model.

ARMA Models

ARMA models (including both AR and MA terms) have ACFs and PACFs that both tail off to 0. These are the trickiest because the order will not be particularly obvious. Basically you just have to guess that one or two terms of each type may be needed and then see what happens when you estimate the model.
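
As a rough illustration of why neither plot gives a clean cut-off, the sketch below evaluates the theoretical ACF of an ARMA(1,1) process, assuming the convention x_t = phi x_{t-1} + w_t + theta w_{t-1}, along with its lag-2 partial autocorrelation. The function name and coefficient values are illustrative.

```python
def arma11_theoretical_acf(phi, theta, max_lag):
    """Theoretical ACF of x_t = phi x_{t-1} + w_t + theta w_{t-1}."""
    rho1 = (1 + phi * theta) * (phi + theta) / (1 + 2 * phi * theta + theta ** 2)
    acf = [1.0, rho1]
    for _ in range(2, max_lag + 1):
        acf.append(phi * acf[-1])  # geometric decay from lag 1 onward
    return acf[:max_lag + 1]

phi, theta = 0.5, 0.4  # illustrative coefficients
acf = arma11_theoretical_acf(phi, theta, 6)

# Lag-2 partial autocorrelation in closed form.
# A pure AR(1) would give exactly 0 here; a pure MA(1) would have acf[2] == 0.
pacf2 = (acf[2] - acf[1] ** 2) / (1 - acf[1] ** 2)

print("ACF, lags 1-6:", [round(r, 3) for r in acf[1:]])
print("PACF at lag 2:", round(pacf2, 3))
```

The ACF keeps decaying past lag 1 and the lag-2 partial autocorrelation is non-zero, so neither the MA(1) nor the AR(1) signature appears.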

SHAPE OF ACF | INDICATED MODEL
Exponential, decaying to zero | Autoregressive model. Use the partial autocorrelation plot to identify the order of the autoregressive model.
Alternating positive and negative, decaying to zero | Autoregressive model. Use the partial autocorrelation plot to help identify the order.
One or more spikes, rest are essentially zero | Moving average model, order identified by where plot becomes zero.
Decay, starting after a few lags | Mixed autoregressive and moving average model.
All zero or close to zero | Data is essentially random.
High values at fixed intervals | Include seasonal autoregressive term.
No decay to zero | Series is not stationary.

Difficulty in Identifying Mixed Model

In practice, the sample autocorrelation and partial autocorrelation functions may not give as clear a picture as the theory suggests. Sometimes both an MA model and an AR model will seem plausible. In such a case, we can run a few more checks to identify the best-suited model.

  • Analyze the standard errors of the forecasts: We should forecast with both models and compare the standard errors of the forecasted values, then pick the model with the lowest standard errors for its predictions.
  • Compare AIC and BIC statistics: We should compare the models on statistics such as AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion). The model with the lowest values for these statistics should be selected.
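
As a sketch of the AIC/BIC comparison, the snippet below computes both statistics from a model's maximized log-likelihood using the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. The log-likelihood values and parameter counts are made-up numbers for illustration only, not results from any real fit; in practice they would come from whatever software estimated the models, and conventions for counting k (for example, whether the noise variance is included) vary.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fit results (made-up numbers, for illustration only).
# k counts the ARMA coefficients plus the noise variance (an assumed convention).
n_obs = 200
candidates = {
    "AR(1)": {"loglik": -285.4, "k": 2},
    "MA(1)": {"loglik": -288.1, "k": 2},
    "ARMA(1,1)": {"loglik": -284.9, "k": 3},
}

scores = {
    name: (aic(c["loglik"], c["k"]), bic(c["loglik"], c["k"], n_obs))
    for name, c in candidates.items()
}

best_aic = min(scores, key=lambda name: scores[name][0])
best_bic = min(scores, key=lambda name: scores[name][1])

for name, (a, b) in scores.items():
    print(f"{name:10s} AIC={a:8.2f}  BIC={b:8.2f}")
print("lowest AIC:", best_aic, "| lowest BIC:", best_bic)
```

With these made-up inputs the extra parameter in the ARMA(1,1) model improves the log-likelihood only slightly, so both criteria favor the simpler AR(1) model; BIC penalizes the extra parameter more heavily than AIC once n exceeds about 8 observations.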
