# ARIMA Modelling - Identify Model for a Time Series

The first step is to identify a candidate model for a given time series. To do so, we need three things: a time series plot of the data, the ACF plot, and the PACF plot. Analysing these three plots together usually lets us identify a suitable model.

### Observing the Time Series Plot

- The very first thing that we should do with our data is to create a time series plot and look for patterns such as trends, seasonality, outliers, and non-constant variance.
- If we spot an obvious uptrend or downtrend, we may need to consider first-order differencing. If the time series exhibits a quadratic trend, we may want to go for second-order differencing.
- Generally, we don't go beyond two levels of differencing. We don't want to introduce overdifferencing.
- If the series is still not stationary after two levels of differencing, then we may want to go for smoothing.
- If the data exhibits a curved upward trend along with non-constant variance such as increasing variance, we can transform the data series by either a logarithm or a square root. Data series with non-constant variance may also call for more advanced models such as ARCH (Not covered in this course).
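
The differencing and transformation steps above can be sketched in a few lines. This is a minimal illustration in plain Python rather than R; the `difference` helper and the toy series are assumptions made for the example, not part of the course material.

```python
import math

def difference(series, order=1):
    """Apply first differencing (x[t] - x[t-1]) `order` times."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

# Linear trend: one round of differencing leaves a constant series.
linear = [3 + 2 * t for t in range(8)]
print(difference(linear))                # [2, 2, 2, 2, 2, 2, 2]

# Quadratic trend: one round still trends, two rounds remove it.
quadratic = [t * t for t in range(8)]
print(difference(quadratic))             # [1, 3, 5, 7, 9, 11, 13]
print(difference(quadratic, order=2))    # [2, 2, 2, 2, 2, 2]

# Curved upward growth: take logs first, then difference.
# The result is roughly constant (the log growth rate).
growth = [100 * 1.05 ** t for t in range(8)]
log_diff = difference([math.log(v) for v in growth])
print([round(v, 4) for v in log_diff])
```

Each round of differencing shortens the series by one observation, which is one more reason to stop at two.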

### ACF and PACF Plots

We should consider ACF and PACF plots together to identify the order (i.e., the p and q) of the autoregressive and moving average terms.

**Important Note:** If the ACF and PACF do not tail off, but instead have values that stay close to 1 over many lags, the series is non-stationary and differencing will be needed. Try a first difference and then look at the ACF and PACF of the differenced data.
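
To see this behaviour, the sketch below simulates a random walk (a textbook non-stationary series) and compares its sample ACF with that of its first difference. It uses plain Python; the `sample_acf` helper, the seed, and the series length are illustrative assumptions.

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag (usual estimator with divisor n)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + h] for t in range(n - h)) / c0
            for h in range(max_lag + 1)]

rng = random.Random(42)                      # fixed seed, chosen arbitrarily
steps = [rng.gauss(0, 1) for _ in range(2000)]

# Random walk: cumulative sum of white-noise steps.
walk, level = [], 0.0
for s in steps:
    level += s
    walk.append(level)

diffed = [b - a for a, b in zip(walk, walk[1:])]

acf_walk = sample_acf(walk, 10)
acf_diff = sample_acf(diffed, 10)
bound = 1.96 / len(diffed) ** 0.5            # approximate 95% band for white noise

print("lag  ACF(walk)  ACF(diff)")
for h in range(1, 11):
    print(f"{h:3d}  {acf_walk[h]:9.3f}  {acf_diff[h]:9.3f}")
print(f"approx. 95% band: +/-{bound:.3f}")
```

The ACF of the walk should stay close to 1 across all ten lags, while the ACF of the differenced series should sit near zero, mostly inside the band.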

### Identification of an MA model is often best done with the ACF rather than the PACF.

- For an MA model, the theoretical PACF does not shut off, but instead tapers toward 0 in some manner. A clearer pattern for an MA model is in the ACF. The ACF will have non-zero autocorrelations only at lags involved in the model.
- A sample ACF with a significant autocorrelation only at lag 1 is an indicator of a possible MA(1) model.
- A sample ACF with significant autocorrelations at lags 1 and 2, but non-significant autocorrelations for higher lags indicates a possible MA(2) model.
- A property of MA(q) models in general is that there are non-zero autocorrelations for the first q lags and autocorrelations = 0 for all lags > q.

So, to identify the order of a moving average process, we examine the sample autocorrelation function to see where it essentially becomes zero.
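
As a worked example of why the ACF cuts off, consider an MA(1) model. The notation here is an assumption for illustration: the model is written with a plus sign on the moving average coefficient (some texts and software use the opposite sign), and $w_t$ is white noise with variance $\sigma_w^2$.

$$
x_t = \mu + w_t + \theta_1 w_{t-1}
$$

$$
\gamma(0) = \sigma_w^2\,(1 + \theta_1^2), \qquad
\gamma(1) = \theta_1\,\sigma_w^2, \qquad
\gamma(h) = 0 \ \text{ for } h \ge 2
$$

$$
\rho(1) = \frac{\gamma(1)}{\gamma(0)} = \frac{\theta_1}{1 + \theta_1^2}, \qquad
\rho(h) = 0 \ \text{ for } h \ge 2
$$

With $\theta_1 = 0.7$, for instance, $\rho(1) = 0.7 / 1.49 \approx 0.47$ and every higher-lag autocorrelation is exactly zero. In a sample ACF the higher lags will not be exactly zero, but they should fall inside the significance bounds.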

### Identification of an AR model is often best done with the PACF.

For an AR model, the theoretical PACF “shuts off” past the order of the model. The phrase “shuts off” means that in theory the partial autocorrelations are equal to 0 beyond that point. Put another way, the number of non-zero partial autocorrelations gives the order of the AR model. By the “order of the model” we mean the most extreme lag of x that is used as a predictor.

**Summary:** For AR models, the ACF will dampen exponentially and the PACF plot will be used to identify the order (p) of the AR model. For MA models, the PACF will dampen exponentially and the ACF plot will be used to identify the order (q) of the MA model.
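
The sketch below illustrates the AR half of this summary. It simulates an AR(2) series and computes the sample PACF from the sample ACF using the Durbin-Levinson recursion, which is one standard way to obtain partial autocorrelations. The code is plain Python; the coefficients, seed, and helper names are assumptions made for the example.

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag (usual estimator with divisor n)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + h] for t in range(n - h)) / c0
            for h in range(max_lag + 1)]

def pacf_from_acf(rho):
    """Durbin-Levinson recursion: partial autocorrelations for lags 1..len(rho)-1."""
    pacf, prev = [], []              # prev[j-1] holds phi_{k-1,j}
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

# Simulate a stationary AR(2): x_t = 0.5 x_{t-1} + 0.3 x_{t-2} + w_t
rng = random.Random(7)               # fixed seed, chosen arbitrarily
phi1, phi2 = 0.5, 0.3
x = [0.0, 0.0]
for _ in range(5200):
    x.append(phi1 * x[-1] + phi2 * x[-2] + rng.gauss(0, 1))
x = x[200:]                          # drop the burn-in period

rho = sample_acf(x, 6)
pacf = pacf_from_acf(rho)
bound = 1.96 / len(x) ** 0.5         # approximate 95% significance bound

print("lag     ACF    PACF")
for lag in range(1, 7):
    flag = "*" if abs(pacf[lag - 1]) > bound else ""
    print(f"{lag:3d}  {rho[lag]:6.3f}  {pacf[lag - 1]:6.3f} {flag}")
```

The ACF column should decay gradually, while the PACF should show clear spikes at lags 1 and 2 and values near zero afterwards, pointing to p = 2. An occasional higher lag may still cross the bound by chance.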

### ARMA Models

ARMA models (including both AR and MA terms) have ACFs and PACFs that both tail off to 0. These are the trickiest because the order will not be particularly obvious. Basically you just have to guess that one or two terms of each type may be needed and then see what happens when you estimate the model.

SHAPE OF ACF | INDICATED MODEL |
---|---|
Exponential, decaying to zero | Autoregressive model. Use the partial autocorrelation plot to identify the order of the autoregressive model. |
Alternating positive and negative, decaying to zero | Autoregressive model. Use the partial autocorrelation plot to help identify the order. |
One or more spikes, rest are essentially zero | Moving average model, order identified by where plot becomes zero. |
Decay, starting after a few lags | Mixed autoregressive and moving average model. |
All zero or close to zero | Data is essentially random. |
High values at fixed intervals | Include seasonal autoregressive term. |
No decay to zero | Series is not stationary. |

### Difficulty in Identifying Mixed Model

In practice, the autocorrelation and partial autocorrelation functions may not give as clear a picture as the theory suggests. Sometimes both an MA and an AR model will seem plausible. In such a case, we can run a few more checks to identify the best-suited model.

- Analyze the standard errors of forecasted values: We should forecast with both models and compare the standard errors of the forecasts. The model with the lowest prediction standard errors is preferred.
- Compare AIC and BIC statistics: We should compare the models on statistics such as AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion). The model with the lowest values for these statistics should be selected.
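
The AIC/BIC comparison can be sketched as follows, using the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where k is the number of estimated parameters, n the number of observations, and ln(L) the maximised log-likelihood. The log-likelihoods and parameter counts below are made-up numbers for illustration only, and conventions for counting parameters (for example, whether the noise variance is included) vary between software packages.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits to the same series of 200 observations.
# (log-likelihood, number of estimated parameters) -- illustrative values only.
n_obs = 200
candidates = {
    "AR(1)": (-512.4, 2),
    "MA(1)": (-515.9, 2),
    "ARMA(1,1)": (-511.8, 3),
}

scores = {name: (aic(ll, k), bic(ll, k, n_obs))
          for name, (ll, k) in candidates.items()}

for name, (a, b) in scores.items():
    print(f"{name:10s} AIC={a:8.2f}  BIC={b:8.2f}")

best_by_aic = min(scores, key=lambda m: scores[m][0])
best_by_bic = min(scores, key=lambda m: scores[m][1])
print("lowest AIC:", best_by_aic, "| lowest BIC:", best_by_bic)
```

In this made-up comparison the ARMA(1,1) fit has the highest log-likelihood, yet the penalty for its extra parameter leaves the simpler AR(1) with the lowest AIC and BIC.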
