Exercise solutions: Section 2.10
fpp3 2.10, Ex 6
The aus_arrivals data set comprises quarterly international arrivals (in thousands) to Australia from Japan, New Zealand, UK and the US. Use autoplot(), gg_season() and gg_subseries() to compare the differences between the arrivals from these four countries. Can you identify any unusual observations?
aus_arrivals |> autoplot(Arrivals)
Generally, the number of arrivals to Australia increases over the entire series, with the exception of arrivals from Japan, which begin to decline after 1995. Each series appears to have a seasonal pattern whose size varies in proportion to the number of arrivals. Interestingly, the number of visitors from NZ peaks sharply in 1988. The seasonal pattern for Japan appears to change substantially.
aus_arrivals |> gg_season(Arrivals, labels = "both")
The seasonal pattern of arrivals appears to vary between countries. In particular, arrivals from the UK appear to be lowest in Q2 and Q3, and increase substantially in Q4 and Q1. For NZ visitors, by contrast, arrivals are lowest in Q1 and highest in Q3. Similar variations can be seen for Japan and the US.
aus_arrivals |> gg_subseries(Arrivals)
The subseries plot reveals more interesting features. Although UK arrivals are increasing, most of this increase is seasonal: more arrivals are coming during Q1 and Q4, whilst the increase in Q2 and Q3 is less extreme. The growth in arrivals from NZ and the US appears fairly similar across all quarters. There is an unusual spike in arrivals from the US in 1992 Q3.
Unusual observations:
- 2000 Q3: spike in arrivals from the US (Sydney Olympics)
- 2001 Q3-Q4: unusual for the US (9/11 effect)
- 1991 Q3: unusual for the US (Gulf War effect?)
fpp3 2.10, Ex 7
Monthly Australian retail data is provided in aus_retail. Select one of the time series as follows (but choose your own seed value):
set.seed(12345678)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))
Explore your chosen retail time series using the following functions:
autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() |> autoplot()
set.seed(12345678)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))
myseries |>
  autoplot(Turnover) +
  labs(y = "Turnover (million $AUD)", x = "Time (Years)",
       title = myseries$Industry[1],
       subtitle = myseries$State[1])
The data features a non-linear upward trend and a strong seasonal pattern. The variability in the data appears proportional to the amount of turnover (level of the series) over the time period.
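To make "variability proportional to the level" concrete, here is a minimal sketch on a synthetic series. It is not the retail data: the growth rate and seasonal amplitude are made-up assumptions. With a multiplicative seasonal pattern, the within-year standard deviation grows with the level while the ratio of the two stays constant.

```r
# Synthetic monthly series with multiplicative seasonality (illustrative values only)
t <- 1:120                                   # 10 years of monthly observations
level <- 100 * exp(0.01 * t)                 # smooth, non-linear upward trend
season <- 1 + 0.2 * sin(2 * pi * t / 12)     # seasonal factor fluctuating around 1
y <- level * season

year <- rep(1:10, each = 12)
sd_by_year <- as.vector(tapply(y, year, sd))      # within-year spread
mean_by_year <- as.vector(tapply(y, year, mean))  # within-year level
cv_by_year <- sd_by_year / mean_by_year           # spread relative to level

# Spread rises with the level, but the ratio is the same every year
round(cbind(mean = mean_by_year, sd = sd_by_year, cv = cv_by_year), 3)
```

In a real series the ratio will only be roughly constant, so this is a rough diagnostic rather than a formal test.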
myseries |>
  gg_season(Turnover, labels = "both") +
  labs(y = "Turnover (million $AUD)",
       title = myseries$Industry[1],
       subtitle = myseries$State[1])
Strong seasonality is evident in the season plot. Large increases in clothing retailing can be observed in December (probably a Christmas effect). There is also a peak in July that appears to be getting stronger over time. 2016 had an unusual pattern in the first half of the year.
myseries |>
  gg_subseries(Turnover) +
  labs(y = "Turnover (million $AUD)", x = "")
There is a strong trend in all months, with the largest trend in December and a larger increase in July and August than most other months.
myseries |>
  gg_lag(Turnover, lags = 1:24, geom = "point") +
  facet_wrap(~ .lag, ncol = 6)
myseries |>
  ACF(Turnover, lag_max = 50) |>
  autoplot()
fpp3 2.10, Ex 11
Use the following code to compute the daily changes in Google closing stock prices.
dgoog <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) >= 2018) |>
  mutate(trading_day = row_number()) |>
  update_tsibble(index = trading_day, regular = TRUE) |>
  mutate(diff = difference(Close))
Why was it necessary to re-index the tsibble?
Plot these differences and their ACF.
Do the changes in the stock prices look like white noise?
dgoog <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) >= 2018) |>
  mutate(trading_day = row_number()) |>
  update_tsibble(index = trading_day, regular = TRUE) |>
  mutate(diff = difference(Close))
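The tsibble needed re-indexing because trading does not happen every day: the market is closed on weekends and public holidays, so the Date index is irregularly spaced. The new index counts trading days only, which gives a regularly spaced series in which each difference is taken between consecutive trading days.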
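As a small illustration of why a calendar-date index is irregular, the sketch below uses a made-up two-week calendar with weekends removed. It ignores holidays and is not the gafa_stock data. The gaps between consecutive dates are unequal, whereas a running trading-day counter, analogous to row_number() above, is evenly spaced.

```r
# Two calendar weeks starting Tuesday 2 January 2018 (illustrative dates only)
dates <- as.Date("2018-01-02") + 0:13

# Keep weekdays only; "%u" gives the ISO weekday number (1 = Monday, ..., 7 = Sunday)
trading_dates <- dates[!(format(dates, "%u") %in% c("6", "7"))]

# Gaps in days between consecutive trading dates: mostly 1, but 3 across a weekend
date_gaps <- as.integer(diff(trading_dates))

# A simple counter over the trading dates is regularly spaced
trading_day <- seq_along(trading_dates)
index_gaps <- diff(trading_day)

date_gaps
index_gaps
```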
dgoog |> autoplot(diff)
dgoog |> ACF(diff, lag_max = 100) |> autoplot()
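There are some small significant autocorrelations out to lag 24, but nothing after that. Each lag has a 5% probability of a false positive, so with 100 lags plotted about five spikes outside the bounds are expected by chance; the changes therefore look similar to white noise.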
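The false-positive reasoning can be sketched with simulated white noise. This is an assumption for illustration only: it is not the GOOG data, and the series length of 250 is made up. The approximate 95% bounds are ±1.96/√T, so a few of the 100 autocorrelations are expected to fall outside them even when the series really is white noise.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
n <- 250                           # hypothetical series length
lag_max <- 100
x <- rnorm(n)                      # simulated white noise

# Sample autocorrelations at lags 1..lag_max (drop lag 0, which is always 1)
r <- acf(x, lag.max = lag_max, plot = FALSE)$acf[-1]

bound <- 1.96 / sqrt(n)            # approximate 95% limits under white noise
n_outside <- sum(abs(r) > bound)   # spikes that would look "significant"
expected_outside <- 0.05 * lag_max # number expected by chance alone

c(bound = bound, observed = n_outside, expected = expected_outside)
```

The observed count varies from one simulated series to the next; the point is only that a handful of small significant spikes is consistent with white noise.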