Exercise solutions: Section 2.10

Author

Rob J Hyndman and George Athanasopoulos

fpp3 2.10, Ex 6

The aus_arrivals data set comprises quarterly international arrivals (in thousands) to Australia from Japan, New Zealand, UK and the US. Use autoplot(), gg_season() and gg_subseries() to compare the differences between the arrivals from these four countries. Can you identify any unusual observations?

aus_arrivals |> autoplot(Arrivals)

Generally the number of arrivals to Australia is increasing over the entire series, with the exception of Japanese visitors which begin to decline after 1995. The series appear to have a seasonal pattern which varies proportionately to the number of arrivals. Interestingly, the number of visitors from NZ peaks sharply in 1988. The seasonal pattern from Japan appears to change substantially.

#! fig-height: 10
aus_arrivals |> gg_season(Arrivals, labels = "both")

The seasonal pattern of arrivals appears to vary between each country. In particular, arrivals from the UK appears to be lowest in Q2 and Q3, and increase substantially for Q4 and Q1. Whereas for NZ visitors, the lowest period of arrivals is in Q1, and highest in Q3. Similar variations can be seen for Japan and US.

aus_arrivals |> gg_subseries(Arrivals)

The subseries plot reveals more interesting features. It is evident that whilst the UK arrivals is increasing, most of this increase is seasonal. More arrivals are coming during Q1 and Q4, whilst the increase in Q2 and Q3 is less extreme. The growth in arrivals from NZ and US appears fairly similar across all quarters. There exists an unusual spike in arrivals from the US in 1992 Q3.

Unusual observations:

2000 Q3: Spikes from the US (Sydney Olympics arrivals)
2001 Q3-Q4 are unusual for US (9/11 effect)
1991 Q3 is unusual for the US (Gulf war effect?)

fpp3 2.10, Ex 7

Monthly Australian retail data is provided in aus_retail. Select one of the time series as follows (but choose your own seed value):
set.seed(12345678)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))
Explore your chosen retail time series using the following functions:
autoplot(), gg_season(), gg_subseries(), gg_lag(), ACF() |> autoplot()

set.seed(12345678)
myseries <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`,1))
myseries |>
  autoplot(Turnover) +
  labs(y = "Turnover (million $AUD)", x = "Time (Years)",
       title = myseries$Industry[1],
       subtitle = myseries$State[1])

The data features a non-linear upward trend and a strong seasonal pattern. The variability in the data appears proportional to the amount of turnover (level of the series) over the time period.

myseries |>
  gg_season(Turnover, labels = "both") +
  labs(y = "Turnover (million $AUD)",
       title = myseries$Industry[1],
       subtitle = myseries$State[1])

Strong seasonality is evident in the season plot. Large increases in clothing retailing can be observed in December (probably a Christmas effect). There is also a peak in July that appears to be getting stronger over time. 2016 had an unusual pattern in the first half of the year.

myseries |>
  gg_subseries(Turnover) +
  labs(y = "Turnover (million $AUD)", x="")

There is a strong trend in all months, with the largest trend in December and a larger increase in July and August than most other months.

myseries |>
  gg_lag(Turnover, lags=1:24, geom='point') + facet_wrap(~ .lag, ncol=6)

myseries |>
  ACF(Turnover, lag_max = 50) |>
  autoplot()

fpp3 2.10, Ex 11

Use the following code to compute the daily changes in Google closing stock prices.
dgoog <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) >= 2018) |>
  mutate(trading_day = row_number()) |>
  update_tsibble(index = trading_day, regular = TRUE) |>
  mutate(diff = difference(Close))
Why was it necessary to re-index the tsibble?

Plot these differences and their ACF.

Do the changes in the stock prices look like white noise?

dgoog <- gafa_stock |>
  filter(Symbol == "GOOG", year(Date) >= 2018) |>
  mutate(trading_day = row_number()) |>
  update_tsibble(index = trading_day, regular = TRUE) |>
  mutate(diff = difference(Close))

The tsibble needed re-indexing as trading happens irregularly. The new index is based only on trading days.

dgoog |> autoplot(diff)

dgoog |> ACF(diff, lag_max=100) |> autoplot()

There are some small significant autocorrelations out to lag 24, but nothing after that. Given the probability of a false positive is 5%, these look similar to white noise.