Simple forecasting methods

type

Post

Created date

Jun 16, 2022 01:21 PM

5.2. Some simple forecasting methods

The following four methods serve as benchmarks for other forecasting methods. They are very simple and often surprisingly effective.

MEAN(y): Average method

NAIVE(y): Naïve method

SNAIVE(y ~ lag(m)): Seasonal naïve method

RW(y ~ drift()): Drift method

Of the methods above, the naïve method assumes that the most recent observation is the only important one, and that all earlier observations provide no information about the future.

MEAN(y): Average method

Forecast of all future values = mean of the historical data $\{y_1, \dots, y_T\}$.

SNAIVE(y ~ lag(m)): Seasonal naïve method

Forecasts = last value from same period a season ago.

Take the last m observations, where m is the seasonal period. In this case the data are quarterly, so m = 4: the last 4 observed values are repeated as the forecasts for every future year.
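Written as a formula, the rule above could be stated as follows. This is the standard textbook form of the seasonal naïve forecast, not something spelled out in these notes; $k$ is the number of complete seasons in the forecast period before time $T+h$.

```latex
\hat{y}_{T+h|T} = y_{T+h-m(k+1)},
\qquad k = \left\lfloor \frac{h-1}{m} \right\rfloor .
% Quarterly example (m = 4):
%   h = 1: k = 0, forecast = y_{T-3}
%   h = 4: k = 0, forecast = y_{T}
%   h = 5: k = 1, forecast = y_{T-3}  (the pattern repeats)
```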

NAIVE(y) : Naïve method

Forecasts = last observed value.

RW(y ~ drift()) : Drift method

Forecasts = last value plus average change from period to period

In other words, work out the average amount the series has changed from one period to the next, and assume it keeps changing by that amount in the future. The forecast h periods ahead is the last value plus h times that average change.
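The four rules above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the series `y` is made-up quarterly data, and the variable names are hypothetical. It does not use the fable functions `MEAN()`, `NAIVE()`, `SNAIVE()` or `RW()`; it just reproduces their point forecasts by hand.

```r
# Hypothetical quarterly series covering two years, so m = 4.
y <- c(10, 14, 12, 16, 12, 16, 14, 18)
n <- length(y)  # T in the notes
m <- 4          # seasonal period
h <- 6          # forecast horizon

# Average method: every forecast is the historical mean.
mean_fc <- rep(mean(y), h)

# Naive method: every forecast is the last observed value.
naive_fc <- rep(y[n], h)

# Seasonal naive method: repeat the last m observations.
snaive_fc <- rep(y[(n - m + 1):n], length.out = h)

# Drift method: last value plus h times the average change,
# where the average change is (y_T - y_1) / (T - 1).
drift_fc <- y[n] + (1:h) * (y[n] - y[1]) / (n - 1)

print(rbind(mean_fc, naive_fc, snaive_fc, drift_fc))
```

Note that the average change in the drift method simplifies to the slope of a straight line drawn from the first observation to the last.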

5.3. Residual diagnostics

5.2.2. Fitted values

Each observation in a time series can be forecast using all previous observations. We call these fitted values, denoted $\hat{y}_{t|t-1}$.

5.2.3. Forecasting residuals

A residual is the difference between an observed value and its fitted value: $e_t = y_t - \hat{y}_{t|t-1}$.

Residuals are useful for checking whether a model has adequately captured the information in the data.
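As a small illustration, here is how fitted values and residuals could be computed by hand for the naïve method, where the one-step forecast of each observation is simply the previous observation. The series `y` is made-up data, and the names are hypothetical.

```r
# Hypothetical series.
y <- c(10, 14, 12, 16, 12, 16, 14, 18)
n <- length(y)

# Naive fitted values: the forecast of y[t] is y[t - 1].
# The first observation has no fitted value.
fitted_naive <- c(NA, y[-n])

# Residuals: observed minus fitted.
resid_naive <- y - fitted_naive

# Mean of the residuals; a value far from zero would suggest bias.
mean(resid_naive, na.rm = TRUE)
```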

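Residuals from a good forecasting method satisfy the following assumptions and, ideally, have the following useful properties:

Assumptions of residuals

$\{e_t\}$ are uncorrelated.

$\{e_t\}$ have mean zero; otherwise the forecasts are biased.

Useful properties (for distributions & prediction intervals)

$\{e_t\}$ have constant variance.

$\{e_t\}$ are normally distributed.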

There are two ways to check the residuals for autocorrelation: examine the autocorrelations individually (the ACF of the residuals), or test a set of autocorrelations as a group (portmanteau tests).

5.2.4. ACF of residuals.

Interpretation

These graphs show that the naïve method produces forecasts that appear to account for all available information.

The mean of the residuals is close to zero and there is no significant correlation in the residuals series.

The time plot of the residuals shows that the variation of the residuals stays much the same across the historical data, apart from the one outlier, and therefore the residual variance can be treated as constant.

This can also be seen on the histogram of the residuals. The histogram suggests that the residuals may not be normal — the right tail seems a little too long, even when we ignore the outlier.

Consequently, forecasts from this method will probably be quite good, but prediction intervals that are computed assuming a normal distribution may be inaccurate.

Assume residuals are white noise (uncorrelated, mean zero, constant variance).

If they aren’t, then there is information left in the residuals that should be used in computing forecasts.
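A rough way to make the ACF check concrete is to compare each residual autocorrelation with the usual approximate white-noise bounds of $\pm 1.96/\sqrt{T}$. The sketch below uses simulated white noise as a stand-in for real residuals (an assumption, since the notes do not include the data), and relies only on `acf()` from base R.

```r
# Simulated stand-in for a residual series (assumption: real residuals
# would come from a fitted model).
set.seed(1)
res <- rnorm(200)

# Autocorrelations at lags 1 to 10; the first element of $acf is lag 0.
r <- acf(res, lag.max = 10, plot = FALSE)$acf[-1]

# Approximate 95% bounds for a white noise series.
bound <- 1.96 / sqrt(length(res))

# Lags whose autocorrelation falls outside the bounds.
outside <- which(abs(r) > bound)
print(outside)
```

For white noise, roughly 5% of the spikes are expected to fall outside the bounds by chance, so one spike slightly over the line is not necessarily a problem.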

5.2.5. Portmanteau tests

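A more formal test for autocorrelation considers a whole set of autocorrelations $r_k$ as a group, rather than treating each one separately.

It tests whether the first $\ell$ autocorrelations, taken together, are significantly different from what a white noise series would produce (that is, from a set of zeros).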
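One widely used portmanteau test is the Ljung-Box test, with statistic $Q^* = T(T+2)\sum_{k=1}^{\ell} r_k^2/(T-k)$. The sketch below computes it by hand and compares it with base R's `Box.test()`. The residuals are simulated and the lag choice `l = 10` is an assumption made for illustration.

```r
# Simulated stand-in for a residual series.
set.seed(1)
res <- rnorm(200)
n <- length(res)  # T in the formula
l <- 10           # number of lags tested as a group

# Autocorrelations r_1, ..., r_l (drop lag 0).
r <- acf(res, lag.max = l, plot = FALSE)$acf[-1]

# Ljung-Box statistic computed by hand.
Q_star <- n * (n + 2) * sum(r^2 / (n - 1:l))

# Under the white noise hypothesis, Q* is approximately chi-squared
# with l degrees of freedom (no parameters estimated here).
p_value <- pchisq(Q_star, df = l, lower.tail = FALSE)

# Same test using the built-in function.
bt <- Box.test(res, lag = l, type = "Ljung-Box")
print(c(by_hand = Q_star, built_in = unname(bt$statistic)))
```

A large p-value means the autocorrelations are not distinguishable from those of white noise.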

5.4. Distributional forecasts and prediction intervals

5.4.1. Forecast distributions

A forecast $\hat{y}_{T+h|T}$ is (usually) the mean of the conditional distribution $y_{T+h} \mid y_1, \dots, y_T$.

Most time series models produce normally distributed forecasts.

The forecast distribution describes the probability of observing any future value.

5.4.2. Prediction intervals

A prediction interval gives a region within which we expect $y_{T+h}$ to lie with a specified probability.

Assuming forecast errors are normally distributed, a 95% PI is $\hat{y}_{T+h|T} \pm 1.96\hat{\sigma}_h$, where $\hat{\sigma}_h$ is the standard deviation of the $h$-step forecast distribution.

When $h = 1$, $\hat{\sigma}_h$ can be estimated from the residuals.

brick_fc %>% hilo(level = 95)
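To show what sits behind such an interval, here is a by-hand sketch for the naïve method in base R. It assumes normally distributed errors and uses the standard result that the naïve method's $h$-step standard deviation is $\hat{\sigma}_h = \hat{\sigma}\sqrt{h}$. The series `y` is made-up data and the variable names are hypothetical; this is not how `hilo()` is implemented.

```r
# Hypothetical series.
y <- c(10, 14, 12, 16, 12, 16, 14, 18)
n <- length(y)
h <- 1:4

# Naive residuals: y[t] - y[t - 1].
e <- diff(y)

# Residual standard deviation (one common estimate: root mean square
# of the available residuals).
sigma_hat <- sqrt(mean(e^2))

# h-step standard deviation for the naive method.
sigma_h <- sigma_hat * sqrt(h)

# 95% prediction interval around the naive point forecast y[n].
z <- qnorm(0.975)  # approximately 1.96
lower <- y[n] - z * sigma_h
upper <- y[n] + z * sigma_h

print(data.frame(h = h, lower = lower, point = y[n], upper = upper))
```

The interval width grows with `h`, which matches the observation that prediction intervals usually get wider as the forecast horizon increases.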

Point forecasts are often useless without a measure of uncertainty (such as prediction intervals).

Prediction intervals require a stochastic model (with random errors, etc).

For most models, prediction intervals get wider as the forecast horizon increases.

Use the level argument to control coverage. Check the residual assumptions before trusting the intervals.

Prediction intervals are usually too narrow because some sources of uncertainty are not accounted for.

5.5. Forecasting with transformations

5.5.1. Modelling with transformations

5.5.2. Forecasting with transformations

5.5.3. Bias adjustment

The ETC3550 Lecture 5A video (YouTube) makes the following point:

If the probability of falling below some value on the transformed scale is p, then the probability of falling below the back-transformed value must also be p, because a monotonic transformation carries the same amount of probability mass along with it.

So probabilities are preserved (i.e., identical), at least in terms of the quantiles of the distribution.

The median is preserved under back-transformation, but the mean is not.
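The point about quantiles can be checked numerically for a log transformation. In the sketch below, the forecast distribution on the log scale is assumed to be normal with hypothetical parameters `mu` and `sigma`; back-transforming with `exp()` then gives a log-normal distribution, whose quantiles and mean are known exactly.

```r
# Hypothetical forecast distribution on the log scale: N(mu, sigma^2).
mu <- 2
sigma <- 0.5
p <- c(0.025, 0.5, 0.975)

# Quantiles on the log scale, then back-transformed.
q_log <- qnorm(p, mean = mu, sd = sigma)
q_back <- exp(q_log)

# Quantiles of the implied log-normal distribution on the original scale.
q_lnorm <- qlnorm(p, meanlog = mu, sdlog = sigma)

# The two sets of quantiles agree, so probabilities are preserved.
print(rbind(q_back, q_lnorm))

# The back-transformed mean exp(mu) is the median of the log-normal,
# while its mean is larger.
median_back <- exp(mu)
mean_back <- exp(mu + sigma^2 / 2)
print(c(median = median_back, mean = mean_back))
```

The gap between the two values is what a bias adjustment corrects for when the mean, rather than the median, is wanted.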

Taylor Series: Lecture starts here.
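For reference, the Taylor series argument usually given for the bias adjustment can be sketched as follows. The notation is an assumption, not taken from the lecture: $w$ is the variable on the transformed scale with mean $\mu$ and variance $\sigma^2$, and $f$ is the back-transformation, so that $y = f(w)$.

```latex
% Second-order Taylor expansion of f around mu:
f(w) \approx f(\mu) + (w - \mu)\, f'(\mu) + \tfrac{1}{2}(w - \mu)^2 f''(\mu)

% Taking expectations, the linear term vanishes because E[w - mu] = 0:
\mathrm{E}[y] = \mathrm{E}[f(w)] \approx f(\mu) + \tfrac{1}{2}\,\sigma^2 f''(\mu)

% Example: log transformation, so f(w) = e^{w} and f''(\mu) = e^{\mu}:
\mathrm{E}[y] \approx e^{\mu}\left(1 + \frac{\sigma^2}{2}\right)
```

The first term, $f(\mu)$, is the plain back-transformed forecast (the median); the second term is the bias adjustment.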

5.6. Forecasting and decomposition

Since we have learnt how to decompose a time series into three components ($y_t = S_t + T_t + R_t$), we can now forecast the components separately and then combine them into one forecast.

Fit a decomposition model, which combines an STL decomposition with separate models for the seasonally adjusted series and the seasonal component.

Forecasts produced from this model are forecasts of the original series.

The decomposition model knows that the component forecasts have to be put back together at the end. It takes the model for the seasonal component and the model for the seasonally adjusted component, and adds their forecasts together to get forecasts of the original series. The result comes back in the usual format: the forecast distribution and the mean of that distribution.


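## Use decomposition_model() to forecast via an STL decomposition
library(fpp3)

## us_retail_employment is assumed to be a monthly tsibble with a column
## Employed; it is not defined in these notes.

## 1. Specify the model: STL decomposition, then one model per component
dcmp <- decomposition_model(
  STL(Employed),
  NAIVE(season_adjust),   # model for the seasonally adjusted series
  SNAIVE(season_year)     # model for the seasonal component
)

## 2. Fit the model, forecast the original series, and plot the forecasts
us_retail_employment %>%
  model(stlf = dcmp) %>%
  forecast() %>%
  autoplot()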
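To see what the decomposition model does under the hood, the same idea can be sketched in base R. This is only an approximation of the approach, not the fable implementation: it uses `stl()` from the stats package on the built-in monthly `co2` series (an assumption, chosen because it ships with R), forecasts the seasonally adjusted series with the naïve method and the seasonal component with the seasonal naïve method, then adds the two.

```r
# STL decomposition of a built-in monthly series.
fit <- stl(co2, s.window = "periodic")
seasonal <- as.numeric(fit$time.series[, "seasonal"])

# Seasonally adjusted series: original minus seasonal component.
season_adjust <- as.numeric(co2) - seasonal

n <- length(co2)
m <- frequency(co2)  # 12 for monthly data
h <- 24              # forecast two years ahead

# Naive forecast of the seasonally adjusted series.
fc_adjust <- rep(season_adjust[n], h)

# Seasonal naive forecast of the seasonal component:
# repeat the last m seasonal values.
fc_seasonal <- rep(seasonal[(n - m + 1):n], length.out = h)

# Forecast of the original series: add the component forecasts.
fc <- fc_adjust + fc_seasonal
print(round(fc[1:12], 2))
```

Only point forecasts are produced here; the fable version also returns the full forecast distribution.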

5.7. Evaluating forecast accuracy

5.7.1. Training and test sets