Activities: Week 1

Time series data

Tourism Example

The tourism dataset contains quarterly overnight trips across Australia from 1998 Q1 to 2017 Q4 (80 quarters for each of the 304 series shown below).

The data are disaggregated by three key variables:

  • State: States and territories of Australia
  • Region: Tourism regions, formed by aggregating Statistical Local Areas (SLAs), which are defined by the various State and Territory tourism authorities according to their research and marketing needs
  • Purpose: Purpose of the stopover: “Holiday”, “Visiting friends and relatives”, “Business”, or “Other reason”

The tsibble is shown below:

# A tsibble: 24,320 x 5 [1Q]
# Key:       Region, State, Purpose [304]
   Quarter Region   State           Purpose  Trips
     <qtr> <chr>    <chr>           <chr>    <dbl>
 1 1998 Q1 Adelaide South Australia Business  135.
 2 1998 Q2 Adelaide South Australia Business  110.
 3 1998 Q3 Adelaide South Australia Business  166.
 4 1998 Q4 Adelaide South Australia Business  127.
 5 1999 Q1 Adelaide South Australia Business  137.
 6 1999 Q2 Adelaide South Australia Business  200.
 7 1999 Q3 Adelaide South Australia Business  169.
 8 1999 Q4 Adelaide South Australia Business  134.
 9 2000 Q1 Adelaide South Australia Business  154.
10 2000 Q2 Adelaide South Australia Business  169.
# ℹ 24,310 more rows
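
The header of the printout above can also be checked programmatically. The following is a minimal sketch, assuming `tourism` is the dataset shipped with the tsibble package; it reads off the key and index and confirms that the row count equals the number of series times the number of quarters.

```r
library(tsibble)

# Names of the key variables and of the time index
key_vars(tourism)
index_var(tourism)

# Number of distinct series (combinations of the key variables)
n_series <- n_keys(tourism)

# Quarters per series, assuming every series covers the same quarters
n_quarters <- nrow(tourism) / n_series

n_series
n_quarters
```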

Calculate the total quarterly overnight trips to Victoria from the tourism dataset.

Hint

To start off, filter the tourism dataset for only Victoria.

tourism |>
  filter(State == "Victoria")

Hint

After filtering, summarise the total trips for Victoria.

tourism |>
  filter(State == "Victoria") |>
  summarise(Trips = sum(Trips))
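
The reason no group_by(Quarter) is needed here is that summarise() on a tsibble keeps the time index and aggregates over the keys. Here is a small sketch of that behaviour on made-up data; the `toy` tsibble and its values are hypothetical and are not part of the tourism dataset.

```r
library(dplyr)
library(tsibble)

# Hypothetical toy data: two regions observed over two quarters
toy <- tibble(
  Quarter = yearquarter(rep(c("2000 Q1", "2000 Q2"), times = 2)),
  Region  = rep(c("A", "B"), each = 2),
  Trips   = c(10, 20, 1, 2)
) |>
  as_tsibble(index = Quarter, key = Region)

# The index (Quarter) is kept, so the sum is taken across regions
# within each quarter rather than over the whole table
totals <- toy |>
  summarise(Trips = sum(Trips))

totals
```

The result has one row per quarter, which is the shape the Victoria exercise asks for.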

Find what combination of Region and Purpose had the maximum number of overnight trips on average.

Hint

Start by using as_tibble() to convert tourism back to a tibble and group it by Region and Purpose.

tourism |>
  as_tibble() |>
  group_by(Region, Purpose)

Hint

After grouping, summarise the mean number of trips and filter for the row with the largest mean.

tourism |>
  as_tibble() |>
  group_by(Region, Purpose) |>
  summarise(Trips = mean(Trips), .groups = "drop") |>
  filter(Trips == max(Trips))
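
The as_tibble() step matters because a tsibble treats its index as an implicit grouping variable. The sketch below contrasts the two cases on hypothetical toy data (the `toy` object and its values are made up for illustration): on the tsibble the mean is computed within each quarter, while on the tibble it is computed across all quarters.

```r
library(dplyr)
library(tsibble)

# Hypothetical toy data: two regions observed over two quarters
toy <- tibble(
  Quarter = yearquarter(rep(c("2000 Q1", "2000 Q2"), times = 2)),
  Region  = rep(c("A", "B"), each = 2),
  Trips   = c(10, 20, 1, 2)
) |>
  as_tsibble(index = Quarter, key = Region)

# On the tsibble: one row per Region and Quarter, so nothing is averaged over time
by_quarter <- toy |>
  group_by(Region) |>
  summarise(Trips = mean(Trips))

# On a tibble: one row per Region, averaged over all quarters
overall <- toy |>
  as_tibble() |>
  group_by(Region) |>
  summarise(Trips = mean(Trips), .groups = "drop")

nrow(by_quarter)
overall
```

As an alternative to filter(Trips == max(Trips)), dplyr's slice_max(Trips, n = 1) should select the same row or rows.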

Create a new tsibble that aggregates over Purpose and Region, leaving only the total trips for each State.

Hint

To summarise the number of trips by each State, start by grouping the data by State.

tourism |>
  group_by(State)

Hint

After grouping, use the summarise() function to sum the trips.

tourism |>
  group_by(State) |>
  summarise(Trips = sum(Trips))
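
To see what the resulting object looks like, the sketch below applies the same pattern to hypothetical toy data (the `toy` tsibble, its states, and its values are made up). It assumes that after group_by(State) and summarise() the result is still a tsibble, now keyed by State alone.

```r
library(dplyr)
library(tsibble)

# Hypothetical toy data: state X has regions A and B, state Y has region C
toy <- tibble(
  Quarter = yearquarter(rep(c("2000 Q1", "2000 Q2"), times = 3)),
  State   = rep(c("X", "X", "Y"), each = 2),
  Region  = rep(c("A", "B", "C"), each = 2),
  Trips   = c(10, 20, 1, 2, 5, 6)
) |>
  as_tsibble(index = Quarter, key = c(State, Region))

# Sum over regions within each State and Quarter
by_state <- toy |>
  group_by(State) |>
  summarise(Trips = sum(Trips))

# The grouping variable becomes the key of the new tsibble
key_vars(by_state)
by_state
```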