fit |>forecast(h =30) |>autoplot(huron) +labs(y ="feet")
It seems unlikely that there was an increasing trend from 1973 to 2002, but the prediction intervals are very wide so they probably capture the actual values. Historical data on the level of Lake Huron can be obtained from the NOAA.
fpp3 10.7, Ex 2
Repeat Exercise 4 from Section 7.10, but this time adding in ARIMA errors to address the autocorrelations in the residuals.
How much difference does the ARIMA error process make to the regression coefficients?
Check the residuals of the fitted model to ensure the ARIMA process has adequately addressed the autocorrelations seen in the TSLM model.
fit |>select(dynreg) |>gg_tsresiduals()
These look fine.
fpp3 10.7, Ex 4
This exercise concerns aus_accommodation: the total quarterly takings from accommodation and the room occupancy level for hotels, motels, and guest houses in Australia, between January 1998 and June 2016. Total quarterly takings are in millions of Australian dollars. a. Compute the CPI-adjusted takings and plot the result for each state
For each state, fit a dynamic regression model of CPI-adjusted takings with seasonal dummy variables, a piecewise linear time trend with one knot at 2008 Q1, and ARIMA errors.
fit <- aus_accommodation |>model(ARIMA(adjTakings ~season() +trend(knot =yearquarter("2008 Q1"))) )fit
# A mable: 8 x 2
# Key: State [8]
State ARIMA(adjTakings ~ season() + trend(knot = year…¹
<chr> <model>
1 Australian Capital Territory <LM w/ ARIMA(1,0,0) errors>
2 New South Wales <LM w/ ARIMA(1,0,0)(0,0,1)[4] errors>
3 Northern Territory <LM w/ ARIMA(0,0,1)(1,0,0)[4] errors>
4 Queensland <LM w/ ARIMA(1,0,0)(0,0,1)[4] errors>
5 South Australia <LM w/ ARIMA(1,0,0)(1,0,0)[4] errors>
6 Tasmania <LM w/ ARIMA(0,0,1)(1,0,0)[4] errors>
7 Victoria <LM w/ ARIMA(1,0,0)(0,0,1)[4] errors>
8 Western Australia <LM w/ ARIMA(1,0,0) errors>
# ℹ abbreviated name:
# ¹`ARIMA(adjTakings ~ season() + trend(knot = yearquarter("2008 Q1")))`
The seasonal dummy variable has not adequately handled the seasonality, so there are seasonal ARIMA components.
Check that the residuals of the model look like white noise.
fit |>filter(State =="Victoria") |>gg_tsresiduals()
No apparent problems. Similar plots needed for the other states.
Forecast the takings for each state to the end of 2017. (Hint: You will need to produce forecasts of the CPI first.)
# A fable: 48 x 7 [1Q]
# Key: State, .model [8]
State .model Date adjTakings .mean CPI Takings
<chr> <chr> <qtr> <dist> <dbl> <dbl> <dist>
1 Australian Capital Terr… "ARIM… 2016 Q3 N(62, 9.9) 61.6 109. N(67, 12)
2 Australian Capital Terr… "ARIM… 2016 Q4 N(59, 12) 58.9 110. N(65, 15)
3 Australian Capital Terr… "ARIM… 2017 Q1 N(59, 13) 59.0 110. N(65, 16)
4 Australian Capital Terr… "ARIM… 2017 Q2 N(59, 13) 59.4 111. N(66, 16)
5 Australian Capital Terr… "ARIM… 2017 Q3 N(61, 13) 60.9 111. N(68, 16)
6 Australian Capital Terr… "ARIM… 2017 Q4 N(59, 13) 58.8 112. N(66, 16)
7 New South Wales "ARIM… 2016 Q3 N(791, 1254) 791. 109. N(863, 1494)
8 New South Wales "ARIM… 2016 Q4 N(844, 1589) 844. 110. N(926, 1914)
9 New South Wales "ARIM… 2017 Q1 N(829, 1679) 829. 110. N(915, 2043)
10 New South Wales "ARIM… 2017 Q2 N(734, 1703) 734. 111. N(814, 2094)
# ℹ 38 more rows
What sources of uncertainty have not been taken into account in the prediction intervals?
The uncertainty in the CPI forecasts has been ignored.
As usual, the estimation of the parameters and the choice of models have also not been accounted for.
fpp3 10.7, Ex 5
We fitted a harmonic regression model to part of the us_gasoline series in Exercise 6 in Section 7.10. We will now revisit this model, and extend it to include more data and ARMA errors.
Using TSLM(), fit a harmonic regression with a piecewise linear time trend to the full gasoline series. Select the position of the knots in the trend and the appropriate number of Fourier terms to include by minimising the AICc or CV value.
Let’s optimize using 2 break points and an unknown number of Fourier terms. Because the number of Fourier terms is integer, we can’t just use optim. Instead, we will loop over a large number of possible values for the breakpoints and Fourier terms. There are more than 2000 models fitted here, but TSLM is relatively fast.
Note that the possible values of the knots must be restricted so that knot2 is always much larger than knot1. We have set them to be at least 2 years apart here.
us_gasoline |>autoplot(Barrels)
# Function to compute CV given K and knots.get_cv <-function(K, knot1, knot2) { us_gasoline |>model(TSLM(Barrels ~fourier(K = K) +trend(c(knot1, knot2)))) |>glance() |>pull(CV)}models <-expand.grid(K =seq(25),knot1 =yearweek(as.character(seq(1991, 2017, 2))),knot2 =yearweek(as.character(seq(1991, 2017, 2)))) |>filter(knot2 - knot1 >104) |>as_tibble()models <- models |>mutate(cv = purrr::pmap_dbl(models, get_cv)) |>arrange(cv)# Best combination(best <-head(models, 1))
Check the residuals of the final model using the gg_tsdisplay() function and a Ljung-Box test. Do they look sufficiently like white noise to continue? If not, try modifying your model, or removing the first few years of data.
gg_tsresiduals(fit)
augment(fit) |>features(.innov, ljung_box, dof =2, lag =26)
Usually, we choose lag to be twice the seasonal period, but the seasonal period here is about 52, and 104 lags is too many for the test statistics to have good properties. So we’ve set lag to be 26 which should be plenty.
The model looks pretty good, and passes the Ljung-Box test, although there is some heteroskedasticity.
Once you have a model with white noise residuals, produce forecasts for the next year.
fit |>forecast(h ="1 year") |>autoplot(us_gasoline)