Exercise solutions: Section 8.8

fpp3 8.8, Ex7

Find an ETS model for the Gas data from `aus_production` and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?

```r
aus_production |> autoplot(Gas)
```
- There is a huge increase in variance as the level of the series increases, which makes multiplicative seasonality necessary.
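To see this in model terms, one can compare additive and multiplicative seasonal specifications directly; ETS likelihoods are computed on the same data, so the AICc values are comparable. A minimal sketch (not part of the original solution; the model names `additive` and `multiplicative` are illustrative):

```r
library(fpp3)

# Fit an all-additive specification alongside the multiplicative one;
# the multiplicative model should come out ahead on AICc because the
# seasonal swings grow with the level of the series.
aus_production |>
  model(
    additive = ETS(Gas ~ error("A") + trend("A") + season("A")),
    multiplicative = ETS(Gas ~ error("M") + trend("A") + season("M"))
  ) |>
  glance() |>
  select(.model, AICc)
```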
```r
fit <- aus_production |>
  model(
    hw = ETS(Gas ~ error("M") + trend("A") + season("M")),
    hwdamped = ETS(Gas ~ error("M") + trend("Ad") + season("M"))
  )
fit |> glance()
```

```
# A tibble: 2 × 9
  .model    sigma2 log_lik   AIC  AICc   BIC   MSE  AMSE    MAE
  <chr>      <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
1 hw       0.00324   -831. 1681. 1682. 1711.  21.1  32.2 0.0413
2 hwdamped 0.00329   -832. 1684. 1685. 1718.  21.1  32.0 0.0417
```
- The non-damped model seems to be doing slightly better here (smallest AICc), probably because the trend is very strong over most of the historical data.
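Since the exercise asks whether damping improves the *forecasts*, a complementary check (a sketch, not in the original solution; the 16-quarter holdout is an arbitrary choice) is to withhold the last few years and compare test-set accuracy:

```r
library(fpp3)

# Hold out the last 4 years (16 quarters) and compare forecast accuracy
# of the damped and non-damped Holt-Winters models on the test set.
gas_tr <- aus_production |> slice(1:(n() - 16))
gas_tr |>
  model(
    hw = ETS(Gas ~ error("M") + trend("A") + season("M")),
    hwdamped = ETS(Gas ~ error("M") + trend("Ad") + season("M"))
  ) |>
  forecast(h = 16) |>
  accuracy(aus_production) |>
  select(.model, RMSE, MASE)
```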
```r
fit |>
  select(hw) |>
  gg_tsresiduals()
```

```r
fit |> tidy()
```
```
# A tibble: 19 × 3
   .model   term  estimate
   <chr>    <chr>    <dbl>
 1 hw       alpha   0.653
 2 hw       beta    0.144
 3 hw       gamma   0.0978
 4 hw       l[0]    5.95
 5 hw       b[0]    0.0706
 6 hw       s[0]    0.931
 7 hw       s[-1]   1.18
 8 hw       s[-2]   1.07
 9 hw       s[-3]   0.816
10 hwdamped alpha   0.649
11 hwdamped beta    0.155
12 hwdamped gamma   0.0937
13 hwdamped phi     0.980
14 hwdamped l[0]    5.86
15 hwdamped b[0]    0.0994
16 hwdamped s[0]    0.928
17 hwdamped s[-1]   1.18
18 hwdamped s[-2]   1.08
19 hwdamped s[-3]   0.817
```
```r
fit |>
  augment() |>
  filter(.model == "hw") |>
  features(.innov, ljung_box, lag = 24)
```

```
# A tibble: 1 × 3
  .model lb_stat lb_pvalue
  <chr>    <dbl>     <dbl>
1 hw        57.1  0.000161
```
- There are still some small correlations left in the residuals, showing that the model has not fully captured the available information.
- There also appears to be some heteroskedasticity in the residuals, with larger variance in the first half of the series.
```r
fit |>
  forecast(h = 36) |>
  filter(.model == "hw") |>
  autoplot(aus_production)
```
While the point forecasts look reasonable, the prediction intervals are excessively wide.
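One way to quantify that (a sketch, not part of the original solution) is to extract the 95% intervals with `hilo()` and inspect their widths over the horizon:

```r
library(fpp3)

# Inspect how wide the 95% prediction intervals become over the
# 36-quarter horizon. (Assumes `fit` from the model() call above.)
fit |>
  forecast(h = 36) |>
  filter(.model == "hw") |>
  hilo(95) |>
  mutate(width = `95%`$upper - `95%`$lower) |>
  select(Quarter, .mean, width)
```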
fpp3 8.8, Ex11

For this exercise use the quarterly number of arrivals to Australia from New Zealand, 1981 Q1 – 2012 Q3, from data set `aus_arrivals`.
- Make a time plot of your data and describe the main features of the series.
```r
nzarrivals <- aus_arrivals |> filter(Origin == "NZ")
nzarrivals |>
  autoplot(Arrivals / 1e3) +
  labs(y = "Thousands of people")
```
- The data has an upward trend.
- The data has a seasonal pattern which increases in size approximately proportionally to the average number of people who arrive per year. Therefore, the data has multiplicative seasonality.
- Create a training set that withholds the last two years of available data. Forecast the test set using an appropriate model for Holt-Winters’ multiplicative method.
```r
nz_tr <- nzarrivals |>
  slice(1:(n() - 8))
nz_tr |>
  model(ETS(Arrivals ~ error("M") + trend("A") + season("M"))) |>
  forecast(h = "2 years") |>
  autoplot() +
  autolayer(nzarrivals, Arrivals)
```
- Why is multiplicative seasonality necessary here?
- Multiplicative seasonality is needed here because the seasonal pattern increases in size roughly in proportion to the level of the series.
- A model with multiplicative seasonality captures this behaviour and projects it into the forecasts.
- Forecast the two-year test set using each of the following methods:
- an ETS model;
- an additive ETS model applied to a log transformed series;
- a seasonal naïve method;
- an STL decomposition applied to the log transformed data followed by an ETS model applied to the seasonally adjusted (transformed) data.
```r
fc <- nz_tr |>
  model(
    ets = ETS(Arrivals),
    log_ets = ETS(log(Arrivals)),
    snaive = SNAIVE(Arrivals),
    stl = decomposition_model(STL(log(Arrivals)), ETS(season_adjust))
  ) |>
  forecast(h = "2 years")
```
```r
fc |>
  autoplot(level = NULL) +
  autolayer(filter(nzarrivals, year(Quarter) > 2000), Arrivals)
```

```r
fc |>
  autoplot(level = NULL) +
  autolayer(nzarrivals, Arrivals)
```
- Which method gives the best forecasts? Does it pass the residual tests?
```r
fc |> accuracy(nzarrivals)
```

```
# A tibble: 4 × 11
  .model  Origin .type      ME   RMSE    MAE    MPE  MAPE  MASE RMSSE    ACF1
  <chr>   <chr>  <chr>   <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>
1 ets     NZ     Test   -3495. 14913. 11421. -0.964  3.78 0.768 0.771 -0.0260
2 log_ets NZ     Test    2467. 13342. 11904.  1.03   4.03 0.800 0.689 -0.0786
3 snaive  NZ     Test    9709. 18051. 17156.  3.44   5.80 1.15  0.933 -0.239
4 stl     NZ     Test  -12535. 22723. 16172. -4.02   5.23 1.09  1.17   0.109
```
- The best method is the ETS model on the logged data (based on RMSE), and it passes the residual tests.
```r
log_ets <- nz_tr |>
  model(ETS(log(Arrivals)))
log_ets |> gg_tsresiduals()
```

```r
augment(log_ets) |>
  features(.innov, ljung_box, lag = 12)
```

```
# A tibble: 1 × 4
  Origin .model             lb_stat lb_pvalue
  <chr>  <chr>                <dbl>     <dbl>
1 NZ     ETS(log(Arrivals))    11.0     0.530
```
- Compare the same four methods using time series cross-validation instead of using a training and test set. Do you come to the same conclusions?
```r
nz_cv <- nzarrivals |>
  slice(1:(n() - 3)) |>
  stretch_tsibble(.init = 36, .step = 3)
nz_cv |>
  model(
    ets = ETS(Arrivals),
    log_ets = ETS(log(Arrivals)),
    snaive = SNAIVE(Arrivals),
    stl = decomposition_model(STL(log(Arrivals)), ETS(season_adjust))
  ) |>
  forecast(h = 3) |>
  accuracy(nzarrivals)
```

```
# A tibble: 4 × 11
  .model  Origin .type     ME   RMSE    MAE   MPE  MAPE  MASE RMSSE  ACF1
  <chr>   <chr>  <chr>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ets     NZ     Test   4627. 15327. 11799.  2.23  6.45 0.793 0.797 0.283
2 log_ets NZ     Test   4388. 15047. 11566.  1.99  6.36 0.778 0.782 0.268
3 snaive  NZ     Test   8244. 18768. 14422.  3.83  7.76 0.970 0.976 0.566
4 stl     NZ     Test   4252. 15618. 11873.  2.04  6.25 0.798 0.812 0.244
```
- An initial fold size (`.init`) of 36 has been selected to ensure that sufficient data is available to make reasonable forecasts.
- A step size of 3 (and forecast horizon of 3) has been used to reduce the computation time.
- The ETS model on the log data still appears best (based on 3-step ahead forecast RMSE).
fpp3 8.8, Ex14

- Use `ETS()` to select an appropriate model for the following series: total number of trips across Australia using `tourism`, the closing prices for the four stocks in `gafa_stock`, and the lynx series in `pelt`. Does it always give good forecasts?
tourism
```r
aus_trips <- tourism |>
  summarise(Trips = sum(Trips))
aus_trips |>
  model(ETS(Trips)) |>
  report()
```

```
Series: Trips
Model: ETS(A,A,A)
  Smoothing parameters:
    alpha = 0.4495675
    beta  = 0.04450178
    gamma = 0.0001000075

  Initial states:
     l[0]      b[0]      s[0]     s[-1]     s[-2]    s[-3]
 21689.64 -58.46946 -125.8548 -816.3416 -324.5553 1266.752

  sigma^2:  699901.4

     AIC     AICc      BIC
1436.829 1439.400 1458.267
```
```r
aus_trips |>
  model(ETS(Trips)) |>
  forecast() |>
  autoplot(aus_trips)
```
Forecasts appear reasonable.
GAFA stock
```r
gafa_regular <- gafa_stock |>
  group_by(Symbol) |>
  mutate(trading_day = row_number()) |>
  ungroup() |>
  as_tsibble(index = trading_day, regular = TRUE)
gafa_stock |> autoplot(Close)
```
```r
gafa_regular |>
  model(ETS(Close))
```

```
# A mable: 4 x 2
# Key:     Symbol [4]
  Symbol `ETS(Close)`
  <chr>       <model>
1 AAPL   <ETS(M,N,N)>
2 AMZN   <ETS(M,N,N)>
3 FB     <ETS(M,N,N)>
4 GOOG   <ETS(M,N,N)>
```
```r
gafa_regular |>
  model(ETS(Close)) |>
  forecast(h = 50) |>
  autoplot(gafa_regular |> group_by_key() |> slice((n() - 100):n()))
```
Forecasts look reasonable for an efficient market.
Pelt trading records
```r
pelt |> model(ETS(Lynx))
```

```
# A mable: 1 x 1
   `ETS(Lynx)`
       <model>
1 <ETS(A,N,N)>
```

```r
pelt |>
  model(ETS(Lynx)) |>
  forecast(h = 10) |>
  autoplot(pelt)
```
- Here the cyclic behaviour of the lynx data is completely lost.
- ETS models are not designed to handle cyclic data, so there is nothing that can be done to improve this.
- Find an example where it does not work well. Can you figure out why?
- ETS does not work well on cyclic data, as seen in the pelt dataset above.