Created date
Jun 16, 2022 01:21 PM
Data Science
Applied forecasting
ARIMA models describe the autocorrelation in the data. The name has 3 parts: AR-I-MA.
  • AR: Autoregressive models
  • I: Integrated (differencing, stationarity and unit roots)
  • MA: Moving average models

(Integrated part) Stationarity and differencing

A stationary time series is one whose statistical properties do not depend on the time at which it is observed.
  • If a time series has a trend or seasonality, it is non-stationary.
  • A white noise series is stationary.
To apply ARIMA models, our time series should be stationary to begin with.
notion image

Here is what a stationary series looks like:
  • Roughly horizontal
  • Constant variance
    • (The spread around the mean looks the same in every part of the series)
  • No patterns that are predictable in the long term
    • (A histogram can be used to compare the distribution of different sections. If they look the same, the series is stationary.)
notion image
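The "compare two sections" idea can be sketched in a few lines of Python (the notes themselves use R; Python is used here only for illustration). The helper `window_stats` and both toy series are hypothetical, chosen so the contrast is easy to see.

```python
import random
from statistics import fmean, pvariance

def window_stats(series, start, length):
    """Return (mean, variance) of series[start:start + length]."""
    window = series[start:start + length]
    return fmean(window), pvariance(window)

rng = random.Random(42)                              # fixed seed for reproducibility
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]    # white noise: stationary
trend = [0.5 * t for t in range(200)]                # rising level: non-stationary

noise_early, noise_late = window_stats(noise, 0, 50), window_stats(noise, 120, 50)
trend_early, trend_late = window_stats(trend, 0, 50), window_stats(trend, 120, 50)

print("noise:", noise_early, noise_late)   # similar mean and variance in both windows
print("trend:", trend_early, trend_late)   # the mean shifts with time
```

Similar window statistics are only an informal hint of stationarity, not a test.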

This raises 2 important questions:
  • How to test if a time series is stationary?
  • What to do if a time series is non-stationary?

(Integrated part) Q1: How to test if a time series is stationary?

Here are some examples:
Here is a tricky one.
This looks seasonal, but it is actually cyclic.
notion image
It's cyclic in the sense that there is no fixed period, and the time between the peaks or the troughs is not determined by the calendar; it is determined by the ecology of lynx and their population cycle.
So, this one is actually stationary, even though you might originally think it's not.
If I take a section of the graph of some length s, and then take another section of the same length starting at a randomly chosen, completely different point in time, the distribution is the same.
notion image
Alternatively, we can look at the ACF plot to judge whether the time series is stationary.
notion image
Also, the values drop to 0 quickly - a sign of stationarity.
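A minimal Python sketch of the sample ACF, to make "drops to 0 quickly" concrete. The function name `acf` and both toy series are illustrative assumptions; in R the plot would normally come from the forecasting packages instead.

```python
import random

def acf(y, max_lag):
    """Sample autocorrelations r_0..r_max_lag of the list y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

rng = random.Random(1)                               # fixed seed for reproducibility
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]    # stationary
trend = [0.5 * t for t in range(200)]                # non-stationary

noise_acf = acf(noise, 5)
trend_acf = acf(trend, 5)
print([round(r, 2) for r in noise_acf])   # near 0 after lag 0
print([round(r, 2) for r in trend_acf])   # stays close to 1 and decays slowly
```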

(Integrated part) Q2: What to do if a time series is non-stationary?

We transform the series if it is non-stationary.
  1. Apply a log, Box-Cox, or other suitable transformation (taught previously). →
  2. Then difference the series. →
    1. Use lag 12 if the data are monthly with yearly seasonality.
  3. If needed, difference a second time.
    1. In practice, we almost never go beyond second-order differences.
    2. Seasonal difference:
      It is the difference between an observation at time t and the previous observation from the same season: $y'_t = y_t - y_{t-m}$, where m is the number of seasons.
      For monthly data with yearly seasonality, m is 12.
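The differencing steps can be sketched in Python as follows. The helper `difference` and the toy monthly series (a linear trend plus a fixed 12-month pattern) are hypothetical, chosen so the result can be checked by hand.

```python
def difference(y, lag=1):
    """Return y[t] - y[t - lag] for every t where both terms exist."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

# Hypothetical monthly data: trend of 2 per month plus a repeating yearly pattern.
season = [5, 3, 0, -2, -4, -5, -4, -2, 0, 2, 4, 3]
y = [2 * t + season[t % 12] for t in range(48)]

seasonal_diff = difference(y, lag=12)     # removes the seasonal pattern
second_diff = difference(seasonal_diff)   # removes the constant left by the trend

print(seasonal_diff[:5])   # [24, 24, 24, 24, 24]
print(second_diff[:5])     # [0, 0, 0, 0, 0]
```

Each difference shortens the series: the seasonal difference drops the first 12 observations and the first difference drops one more.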
notion image

Unit root test - objectively determines the need for differencing

There are several ways to check whether a time series is stationary.
1. ACF
If the spikes in the ACF go to zero quickly, the series is likely stationary. That does not make it white noise, though.
notion image
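2. The Ljung-Box test → a small p-value means not white noise
The null hypothesis is that the series is white noise (no autocorrelation). A small p-value implies the series is not white noise. This is a test for autocorrelation, not for stationarity directly: a non-stationary series will give a small p-value, but so can a stationary series that is autocorrelated.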
Box.test(y, lag = 10, type = "Ljung-Box")  # y is a placeholder for the series; lag = 10 is an example choice
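To show what the Ljung-Box test computes, here is a rough Python sketch of the statistic $Q^* = n(n+2)\sum_{k=1}^{h} r_k^2/(n-k)$ and its p-value. The function names are illustrative assumptions, and the chi-square tail formula used here only works for an even number of lags h; it is not a replacement for `Box.test`.

```python
import math

def acf(y, max_lag):
    """Sample autocorrelations r_0..r_max_lag of the list y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def chi2_sf_even(x, df):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even df")
    half = x / 2
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

def ljung_box(y, h=10):
    """Return (Q statistic, p-value) using h lags and h degrees of freedom."""
    n = len(y)
    r = acf(y, h)
    q = n * (n + 2) * sum(r[k] ** 2 / (n - k) for k in range(1, h + 1))
    return q, chi2_sf_even(q, h)

# A strongly autocorrelated toy series: the p-value is tiny, so it is not white noise.
q_stat, p_value = ljung_box([1.0, -1.0] * 50, h=10)
print(round(q_stat, 1), p_value)
```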
3. The augmented Dickey-Fuller (ADF) test → a small p-value means stationary
A small p-value rejects the null hypothesis and suggests the series is stationary.
Null hypothesis = the data are non-stationary and non-seasonal
The test is on the coefficient φ.
notion image
notion image
4. The Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test → a small p-value means non-stationary
Null hypothesis = the data are stationary; we look for evidence that the null hypothesis is false.
Small p-values (e.g., less than 0.05) suggest that differencing is required.
notion image
notion image
notion image
5. STL decomposition strength → a seasonal strength above 0.64 means non-stationary (a seasonal difference is suggested)
notion image

Non-seasonal ARIMA models

(AR part) Autoregressive model

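An autoregressive model of order p, AR(p): $y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t$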
You might think that this looks like linear regression, and indeed it is an extension of it. The difference is that ordinary regression uses a set of explanatory variables, whereas the AR(p) model regresses the series on its own lagged values.
Q: Why use a univariate model?
  • Other explanatory variables are not available.
  • Other explanatory variables are not directly observable.
  • Examples: inflation rate, unemployment rate, exchange rate, firm's sales, gold prices, interest rate, etc.
  • Changing the parameters $\phi_1, \dots, \phi_p$ changes the time series patterns.
  • $\varepsilon_t$ is white noise. Changing its variance only changes the scale of the series, not the patterns.
  • If we add the constant c, then we assume the trend continues in the long term.
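A small Python simulation illustrates the two points about parameters and error variance. The function `simulate_ar`, its start-up values of zero, and the chosen coefficients are assumptions made for this sketch.

```python
import math
import random

def simulate_ar(phis, n, c=0.0, sigma=1.0, seed=0):
    """Simulate n values of y_t = c + phi_1*y_{t-1} + ... + phi_p*y_{t-p} + e_t."""
    rng = random.Random(seed)
    p = len(phis)
    y = [0.0] * p                                  # arbitrary start-up values
    for _ in range(n):
        eps = sigma * rng.gauss(0.0, 1.0)          # white-noise error
        lagged = sum(phi * y[-i] for i, phi in enumerate(phis, start=1))
        y.append(c + lagged + eps)
    return y[p:]

base = simulate_ar([0.8], 50)                 # smooth, slowly wandering pattern
flipped = simulate_ar([-0.8], 50)             # same errors, oscillating pattern
scaled = simulate_ar([0.8], 50, sigma=2.0)    # same pattern as base, twice the scale

print([round(v, 2) for v in base[:5]])
print([round(v, 2) for v in flipped[:5]])
print([round(v, 2) for v in scaled[:5]])
```

With the same seed and c = 0, doubling sigma doubles every value, while changing the sign of the coefficient produces a different shape altogether.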

Stationarity condition

p is the order of the model.
We normally restrict autoregressive models to stationary data, in which case some constraints on the values of the parameters are required.
  • For an AR(1) model: $-1 < \phi_1 < 1$
  • For an AR(2) model: $-1 < \phi_2 < 1$, $\phi_1 + \phi_2 < 1$, $\phi_2 - \phi_1 < 1$
When p≥3, the restrictions are much more complicated. The fable package takes care of these restrictions when estimating a model.
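For the two low-order cases, the standard constraints are $-1 < \phi_1 < 1$ for AR(1), and $-1 < \phi_2 < 1$, $\phi_1 + \phi_2 < 1$, $\phi_2 - \phi_1 < 1$ for AR(2). A Python sketch of the check, with hypothetical function names:

```python
def ar1_is_stationary(phi1):
    """AR(1) stationarity: -1 < phi1 < 1."""
    return -1 < phi1 < 1

def ar2_is_stationary(phi1, phi2):
    """AR(2) stationarity: -1 < phi2 < 1, phi1 + phi2 < 1, phi2 - phi1 < 1."""
    return -1 < phi2 < 1 and phi1 + phi2 < 1 and phi2 - phi1 < 1

print(ar1_is_stationary(0.8))        # True
print(ar1_is_stationary(1.0))        # False: this is a random walk
print(ar2_is_stationary(0.5, 0.3))   # True
print(ar2_is_stationary(0.5, 0.6))   # False: phi1 + phi2 is not below 1
```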


(MA part) Moving Average (MA) models

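A moving average model of order q, MA(q): $y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$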
  • Moving Average (MA) models ≠ moving average smoothing!
  • This is a multiple regression with past errors as predictors.
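The difference between the two "moving averages" can be sketched in Python. Both function names are hypothetical. The MA(q) model builds each value from current and past errors, while smoothing averages the observations themselves.

```python
def ma_process(errors, thetas, c=0.0):
    """Build y_t = c + e_t + theta_1*e_{t-1} + ... + theta_q*e_{t-q} from given errors."""
    y = []
    for t in range(len(errors)):
        value = c + errors[t]
        for j, theta in enumerate(thetas, start=1):
            if t - j >= 0:                     # errors before the start are treated as 0
                value += theta * errors[t - j]
        y.append(value)
    return y

def moving_average_smooth(y, window):
    """Moving average smoothing: plain averages of consecutive observations."""
    return [sum(y[i:i + window]) / window for i in range(len(y) - window + 1)]

# A single shock at t = 0 affects an MA(1) series for exactly one extra period.
print(ma_process([1.0, 0.0, 0.0, 0.0], thetas=[0.5]))   # [1.0, 0.5, 0.0, 0.0]
print(moving_average_smooth([1, 2, 3, 4, 5], window=3))  # [2.0, 3.0, 4.0]
```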

Non-seasonal ARIMA models

Combining differencing with autoregression and a moving average model, we obtain a non-seasonal ARIMA model.
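$y'_t = c + \phi_1 y'_{t-1} + \dots + \phi_p y'_{t-p} + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q} + \varepsilon_t$
  • $y'_t$ = the differenced series (it may have been differenced more than once).
  • The “predictors” on the right hand side include both lagged values of $y_t$ and lagged errors.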
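As a worked illustration of how the differencing combines with the AR part, here is a Python sketch of point forecasts from an ARIMA(1,1,0) model with known parameters. The function name and parameter values are hypothetical, and future errors are replaced by their expected value of zero.

```python
def forecast_arima_110(y, phi, c=0.0, h=3):
    """Point forecasts for ARIMA(1,1,0): y'_t = c + phi*y'_{t-1} + e_t, with y'_t = y_t - y_{t-1}."""
    last_level = y[-1]
    last_diff = y[-1] - y[-2]
    forecasts = []
    for _ in range(h):
        last_diff = c + phi * last_diff       # forecast the differenced series
        last_level = last_level + last_diff   # undo the differencing
        forecasts.append(last_level)
    return forecasts

# The last observed change is 2; with phi = 0.5 the forecast changes are 1, 0.5, 0.25.
print(forecast_arima_110([1.0, 2.0, 4.0], phi=0.5, h=3))   # [5.0, 5.5, 5.75]
```

In practice the parameters are estimated from the data rather than supplied by hand.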
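Parameters of ARIMA(p,d,q)
  • p = order of the autoregressive part
  • d = degree of first differencing involved
  • q = order of the moving average part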
Special case
notion image

Q: How do you choose p and q?
  • Before answering this question, we need to know how c, p and q affect the model.
  • Changing d affects the prediction interval: the higher the value of d, the more rapidly the prediction intervals increase in size (the larger d, the wider the prediction interval).
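The effect of d on interval width can be illustrated with the simplest case, ARIMA(0,d,0), where the h-step forecast variance is $\sigma^2 \sum_{j=0}^{h-1} \psi_j^2$. The Python sketch below uses hypothetical function names, assumes normal errors for the 1.96 multiplier, and ignores parameter uncertainty.

```python
import math

def psi_weights(d, h):
    """First h psi-weights of ARIMA(0, d, 0): a unit impulse cumulatively summed d times."""
    weights = [1.0] + [0.0] * (h - 1)
    for _ in range(d):
        running = 0.0
        summed = []
        for w in weights:
            running += w
            summed.append(running)
        weights = summed
    return weights

def forecast_variance(d, h, sigma2=1.0):
    """Variance of the h-step-ahead forecast error."""
    return sigma2 * sum(w * w for w in psi_weights(d, h))

def interval_half_width(d, h, sigma2=1.0, z=1.96):
    """Half-width of an approximate 95% prediction interval."""
    return z * math.sqrt(forecast_variance(d, h, sigma2))

for d in (0, 1, 2):
    print(d, [round(interval_half_width(d, h), 2) for h in (1, 2, 4, 8)])
```

With d = 0 the width stays constant, with d = 1 the variance grows in proportion to h, and with d = 2 it grows much faster.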

notion image
The above shows the ACF. However, the problem with the ACF is this: if $y_t$ and $y_{t-1}$ are correlated, then $y_{t-1}$ and $y_{t-2}$ must also be correlated.
However, then $y_t$ and $y_{t-2}$ might be correlated simply because they are both connected to $y_{t-1}$, rather than because of any new information contained in $y_{t-2}$ that could be used in forecasting $y_t$.
In short, the lag-2 correlation may be an indirect effect passed through lag 1.
notion image

Partial autocorrelations (PACF) measure the relationship between $y_t$ and $y_{t-k}$ after removing the effects of lags 1, 2, ..., k−1.
notion image
Q: How to pick the order of AR(p) using ACF vs. PACF?
notion image
Q: How to pick the order of MA(q) using ACF vs. PACF?
notion image
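The usual rule of thumb is that for an AR(p) process the PACF cuts off after lag p while the ACF decays, and for an MA(q) process the ACF cuts off after lag q while the PACF decays. The Python sketch below computes partial autocorrelations from autocorrelations with the Durbin-Levinson recursion; the function name is hypothetical and the inputs are theoretical ACFs rather than estimates from data.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations from autocorrelations rho[0..K] (rho[0] must be 1)."""
    pacf = [1.0]
    prev = []                                  # prev[j-1] holds phi_{k-1, j}
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j - 1] * rho[k - j] for j in range(1, k))
        den = 1.0 - sum(prev[j - 1] * rho[j] for j in range(1, k))
        phi_kk = num / den
        current = [prev[j - 1] - phi_kk * prev[k - j - 1] for j in range(1, k)]
        current.append(phi_kk)
        pacf.append(phi_kk)
        prev = current
    return pacf

# Theoretical ACF of an AR(1) with phi = 0.6: rho_k = 0.6 ** k (decays gradually).
ar1_pacf = pacf_from_acf([0.6 ** k for k in range(6)])
# Theoretical ACF of an MA(1) with theta = 0.5: rho_1 = 0.4, then exactly 0.
ma1_pacf = pacf_from_acf([1.0, 0.4, 0.0, 0.0, 0.0, 0.0])

print([round(v, 3) for v in ar1_pacf])   # one spike at lag 1, then about 0
print([round(v, 3) for v in ma1_pacf])   # decays gradually instead of cutting off
```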

notion image

Seasonal ARIMA models

notion image
notion image
It works similarly to non-seasonal ARIMA, but adds seasonal order terms written as (P, D, Q), applied at the seasonal lag m.
Q: How do you choose the order using the ACF/PACF?
notion image
notion image
notion image
notion image

Estimation and order selection

ARIMA modelling in R


Seasonal ARIMA models






notion image
Why is the mean positive in this case? (Tute 10, 1:06:15) (Ex 15)
notion image
A good model has unbiased residuals. How do we know if our residuals are unbiased?
Our residuals are unbiased when their mean is 0. If the plot above fluctuates around 0, the residuals are unbiased. For them to be a white noise series as well, they must also show no autocorrelation.
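A rough numerical version of this check in Python: compare the residual mean with about two standard errors. The function name and the toy residuals are hypothetical, and the standard error formula assumes the residuals are uncorrelated, so this is only a quick screen.

```python
import math
from statistics import mean, stdev

def looks_unbiased(residuals, z=2.0):
    """True if the residual mean is within z standard errors of zero."""
    m = mean(residuals)
    se = stdev(residuals) / math.sqrt(len(residuals))
    return abs(m) <= z * se

print(looks_unbiased([1.0, -1.0, 1.0, -1.0]))   # True: mean is 0
print(looks_unbiased([1.0, 1.2, 0.8, 1.1]))     # False: mean is about 1, far from 0
```

If the mean is clearly non-zero, adding it to all forecasts removes the bias.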
What is the process to make data stationary?
1. Transform the data to remove changing variance
2. Seasonally difference the data to remove seasonality
3. Regular difference if the data is still non-stationary
“ARIMA-based prediction intervals tend to be too narrow.” True or False?
True, because only the variation in the errors has been accounted for.
  • There is also variation in the parameter estimates, and in the model order, that has not been included in the calculation.
  • The calculation assumes that the historical patterns that have been modelled will continue into the forecast period.


Maximum Likelihood Estimate (MLE)