global_economy |>
  filter(Country == "Australia") |>
  autoplot(Population)
Exercise solutions: Section 5.11
fpp3 5.11, Ex 1
Produce forecasts for the following series using whichever of NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case:
- Australian Population (global_economy)
- Bricks (aus_production)
- NSW Lambs (aus_livestock)
- Household wealth (hh_budget)
- Australian takeaway food turnover (aus_retail)
Australian population
The data have a trend and no seasonality, so a random walk with drift model is appropriate.
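For reference, the drift method's point forecast is equivalent to drawing a line from the first to the last observation and extrapolating it: \[ \hat{y}_{T+h|T} = y_T + \frac{h}{T-1}\sum_{t=2}^T (y_t - y_{t-1}) = y_T + h\left(\frac{y_T - y_1}{T-1}\right). \]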
global_economy |>
  filter(Country == "Australia") |>
  model(RW(Population ~ drift())) |>
  forecast(h = "10 years") |>
  autoplot(global_economy)
Australian clay brick production
aus_production |>
  filter(!is.na(Bricks)) |>
  autoplot(Bricks) +
  labs(title = "Clay brick production")
This data appears to have more seasonality than trend, so of the models available, seasonal naive is most appropriate.
aus_production |>
  filter(!is.na(Bricks)) |>
  model(SNAIVE(Bricks)) |>
  forecast(h = "5 years") |>
  autoplot(aus_production)
NSW Lambs
nsw_lambs <- aus_livestock |>
  filter(State == "New South Wales", Animal == "Lambs")
nsw_lambs |>
  autoplot(Count)
This data appears to have more seasonality than trend, so of the models available, seasonal naive is most appropriate.
nsw_lambs |>
  model(SNAIVE(Count)) |>
  forecast(h = "5 years") |>
  autoplot(nsw_lambs)
Household wealth
hh_budget |>
  autoplot(Wealth)
Annual data with an upward trend, so we can use a random walk with drift.
hh_budget |>
  model(RW(Wealth ~ drift())) |>
  forecast(h = "5 years") |>
  autoplot(hh_budget)
Australian takeaway food turnover
takeaway <- aus_retail |>
  filter(Industry == "Takeaway food services") |>
  summarise(Turnover = sum(Turnover))
takeaway |> autoplot(Turnover)
This data has strong seasonality and strong trend, so we will use a seasonal naive model with drift.
takeaway |>
  model(SNAIVE(Turnover ~ drift())) |>
  forecast(h = "5 years") |>
  autoplot(takeaway)
This is actually not one of the four benchmark methods discussed in the book, but is sometimes a useful benchmark when there is strong seasonality and strong trend.
The corresponding equation is \[ \hat{y}_{T+h|T} = y_{T+h-m(k+1)} + \frac{h}{T-m}\sum_{t=m+1}^T(y_t - y_{t-m}), \] where \(m=12\) and \(k\) is the integer part of \((h-1)/m\) (i.e., the number of complete years in the forecast period prior to time \(T+h\)).
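As a quick sanity check on the bookkeeping in this equation, the point forecast can be computed directly. The sketch below (in Python for self-containment; the linear "monthly" series is made up purely for illustration) implements the equation exactly as displayed:

```python
# Check the bookkeeping of the equation above on a made-up linear
# "monthly" series y_t = t for t = 1, ..., 24. Python lists are
# 0-indexed, so the math's y_t corresponds to y[t - 1].

def snaive_drift(y, h, m=12):
    """Point forecast for horizon h from the displayed equation."""
    T = len(y)
    k = (h - 1) // m                       # complete years before T + h
    drift = sum(y[t] - y[t - m] for t in range(m, T)) / (T - m)
    return y[T + h - m * (k + 1) - 1] + h * drift

y = list(range(1, 25))                     # T = 24, two full years
print(snaive_drift(y, h=3))                # → 51.0 (= y_15 + 3 * 12)
```

Here every seasonal difference is exactly 12, so the drift term contributes \(h \times 12\) on top of the observation from the same month of the previous year.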
fpp3 5.11, Ex 3
Apply a seasonal naïve method to the quarterly Australian beer production data from 1992. Check if the residuals look like white noise, and plot the forecasts. The following code will help.
# Extract data of interest
recent_production <- aus_production |>
  filter(year(Quarter) >= 1992)
# Define and estimate a model
fit <- recent_production |> model(SNAIVE(Beer))
# Look at the residuals
fit |> gg_tsresiduals()
- The residuals are not centred around 0 (typically sitting slightly below it); this is because the model fails to capture the negative trend in the data.
- Peaks and troughs in the residuals spaced roughly 4 observations apart are apparent, leading to a negative spike at lag 4 in the ACF, so the residuals do not resemble white noise. Lags 1 and 3 are also significant; however, they are very close to the threshold and of little concern.
- The distribution of the residuals does not appear very normal; however, it is probably close enough for our prediction intervals to be reasonably accurate (the residuals not being centred on 0 is more concerning).
# Look at some forecasts
fit |>
  forecast() |>
  autoplot(recent_production)
The forecasts look reasonable, although the intervals may be a bit wide. This is likely due to the slight trend not captured by the model (which subsequently violates the assumptions imposed on the residuals).
fpp3 5.11, Ex 5
Produce forecasts for the 7 Victorian series in aus_livestock using SNAIVE(). Plot the resulting forecasts including the historical data. Is this a reasonable benchmark for these series?
aus_livestock |>
  filter(State == "Victoria") |>
  model(SNAIVE(Count)) |>
  forecast(h = "5 years") |>
  autoplot(aus_livestock)
- Most point forecasts look reasonable from the seasonal naive method.
- Some series are more seasonal than others, and for the series with very weak seasonality it may be better to consider using a naive or drift method.
- The prediction intervals in some cases go below zero, so perhaps a log transformation would have been better for these series.
fpp3 5.11, Ex 11
We will use the bricks data from aus_production (Australian quarterly clay brick production 1956–2005) for this exercise.
- Use an STL decomposition to calculate the trend-cycle and seasonal indices. (Experiment with having fixed or changing seasonality.)
tidy_bricks <- aus_production |>
  filter(!is.na(Bricks))
tidy_bricks |>
  model(STL(Bricks)) |>
  components() |>
  autoplot()
The data are multiplicative, so a log transformation should be used.
dcmp <- tidy_bricks |>
  model(STL(log(Bricks))) |>
  components()
dcmp |>
  autoplot()
Seasonality varies slightly.
dcmp <- tidy_bricks |>
  model(stl = STL(log(Bricks) ~ season(window = "periodic"))) |>
  components()
dcmp |> autoplot()
The seasonality looks fairly stable, so I’ve used a periodic seasonal window. The decomposition still performs well when the seasonal component is fixed, and the remainder term does not appear to contain a substantial amount of seasonality.
- Compute and plot the seasonally adjusted data.
dcmp |>
  as_tsibble() |>
  autoplot(season_adjust)
- Use a naïve method to produce forecasts of the seasonally adjusted data.
fit <- dcmp |>
  select(-.model) |>
  model(naive = NAIVE(season_adjust)) |>
  forecast(h = "5 years")
dcmp |>
  as_tsibble() |>
  autoplot(season_adjust) + autolayer(fit)
- Use decomposition_model() to reseasonalise the results, giving forecasts for the original data.
fit <- tidy_bricks |>
  model(stl_mdl = decomposition_model(STL(log(Bricks)), NAIVE(season_adjust)))
fit |>
  forecast(h = "5 years") |>
  autoplot(tidy_bricks)
- Do the residuals look uncorrelated?
fit |> gg_tsresiduals()
The residuals do not appear uncorrelated as there are several lags of the ACF which exceed the significance threshold.
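Rather than eyeballing the ACF, the remaining correlation can be checked formally with a Ljung-Box test (available as the `ljung_box` feature in the `feasts` package). Its statistic is \[ Q^* = T(T+2)\sum_{k=1}^{\ell}\frac{r_k^2}{T-k}, \] where \(r_k\) is the lag-\(k\) autocorrelation of the residuals and \(\ell\) is the number of lags tested; a large \(Q^*\) relative to a \(\chi^2_\ell\) distribution indicates the residuals are distinguishable from white noise.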
- Repeat with a robust STL decomposition. Does it make much difference?
fit_robust <- tidy_bricks |>
  model(stl_mdl = decomposition_model(STL(log(Bricks), robust = TRUE), NAIVE(season_adjust)))
fit_robust |> gg_tsresiduals()
The residuals appear slightly less autocorrelated; however, there is still significant autocorrelation at lag 8.
- Compare forecasts from decomposition_model() with those from SNAIVE(), using a test set comprising the last 2 years of data. Which is better?
tidy_bricks_train <- tidy_bricks |>
  slice(1:(n() - 8))
fit <- tidy_bricks_train |>
  model(
    stl_mdl = decomposition_model(STL(log(Bricks)), NAIVE(season_adjust)),
    snaive = SNAIVE(Bricks)
  )
fc <- fit |>
  forecast(h = "2 years")
fc |>
  autoplot(tidy_bricks, level = NULL)
The decomposition forecasts appear to more closely follow the actual future data.
fc |>
  accuracy(tidy_bricks)
# A tibble: 2 × 10
.model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 snaive Test 2.75 20 18.2 0.395 4.52 0.504 0.407 -0.0503
2 stl_mdl Test 0.368 18.1 15.1 -0.0679 3.76 0.418 0.368 0.115
The STL decomposition forecasts are more accurate than the seasonal naive forecasts across all accuracy measures.