Group Assignment 2: Suggested Solutions & Feedback
Question 1
What is your data about (no more than 50 words)? Produce appropriate plots in order to become familiar with your data. Make sure you label your axes and plots appropriately. Comment on these. What do you see? (no more than 50 words per plot). (14 marks)
Marks breakdown:
- Describe what the data is about (2m)
- Appropriate explanations/descriptions of: trend, seasonal, cycle, outlier in the plots that follow to get 4 marks each. Units must also be included.
- Time series plot - add title, y label (4m)
- Seasonal plot - add title, y label (4m)
- Subseries plot - add title, y label (4m)
- Deduct marks if unnecessary graphs are included (-2)
- REMEMBER: Minus 2 overall if exceeding word limits.
Expectation:
- Explanation of what the data is about.
- Time plot and comment on the trend, seasonal and cyclic patterns, seasonal variation, and any other features of the time series.
- Seasonal plot and comment.
- Seasonal subseries plot and comment.
- Label the plots - correct x label, y label and title, including units.
Common errors:
- Failing to explain changing variation.
- Lag plots and ACF plots were included, which was not necessary.
- Should include autoplot, seasonal plot, and subseries plots.
- Missing units of measurements either in the title or y label.
- Exceeding word limit.
Question 2
Would transforming your data be useful (no more than 50 words)? Choose a transformation and justify your choice (no more than 50 words). (5 marks)
Marks breakdown:
- Explain why transformation is/not needed (1m)
- Plot of the transformed data (2m)
- Justification of the final chosen lambda (2m)
- REMEMBER: Minus 1 overall if exceeding word limits.
Expectation:
- Plot the transformed data.
- Justify the choice.
Common errors:
If the data show variation that increases or decreases with the level of the series, then a transformation can be useful. See this section of the online textbook for a clear explanation of transformations: https://otexts.com/fpp3/transformations.html
Transformations simplify the patterns in the historical data by removing known sources of variation or by making the pattern more consistent across the whole data set. Simpler patterns usually lead to more accurate forecasts.
In some assignments a Box-Cox transformation with the Guerrero feature was chosen, but the value of lambda was not reported.
Plots of Box-Cox transformed data with lambda other than 1 should not have “$m” as the units in the y label, because the transformed values are no longer in the original units.
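As an illustration of why the units change, here is a minimal sketch of the standard Box-Cox formula for positive data, written in plain Python rather than the R/fpp3 tools used in the unit. The function names and the sample values are made up for this example, and the Guerrero selection of lambda is not implemented.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of positive values: log(y) if lam == 0, else (y**lam - 1) / lam."""
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1) / lam for v in y]

def inv_box_cox(w, lam):
    """Back-transform to the original scale (needed before reporting forecasts in $m)."""
    if lam == 0:
        return [math.exp(v) for v in w]
    return [(lam * v + 1) ** (1 / lam) for v in w]

# Hypothetical series in $m; the transformed values are no longer in $m.
sales = [120.0, 150.0, 210.0, 330.0]
logged = box_cox(sales, 0)
restored = inv_box_cox(box_cox(sales, 0.5), 0.5)
```

Whatever lambda is chosen, it should be stated alongside the plot so the reader knows what scale the y axis is on.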
Question 3
Split your data into training and test sets. Leave the last two years’ worth of observations as the test set. Apply all four benchmark methods on the training set. Generate forecasts for the test set and plot them on the same graph. Compare their forecasting performance on the test set. Which method would you choose as an appropriate benchmark? Justify your answer (no more than 100 words). (Hint: it will be useful to tabulate your results.) (6 marks)
Marks breakdown:
- Test/ Training data (1m)
- Apply all benchmarks (1m) and plot forecasts on the same graph (1m)
- Tabulate results – use alternative error measures (1m)
- Comment on forecasts and results (1m)
- Choose appropriate benchmark (1m)
- REMEMBER: Minus 1 overall if exceeding word limits.
Common errors:
- Missing title on the graph.
- Inappropriate y-label.
- Unable to comment on forecasts and results.
- Unable to plot all methods on one graph.
Splitting your transformed data was not a mistake, if it was explained in the labels or heading.
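To make the comparison in this question concrete, the following is a rough sketch of the four benchmark methods (mean, naive, seasonal naive, drift) and three error measures, in plain Python rather than fpp3. The quarterly series is invented for illustration, and the last two years (8 quarters) are held out as the test set.

```python
import math

def mean_fc(train, h):
    avg = sum(train) / len(train)
    return [avg] * h

def naive_fc(train, h):
    return [train[-1]] * h

def snaive_fc(train, h, m):
    # Each forecast repeats the last observed value from the same season.
    return [train[len(train) - m + (i % m)] for i in range(h)]

def drift_fc(train, h):
    # Extend the straight line joining the first and last observations.
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + (i + 1) * slope for i in range(h)]

def rmse(actual, fc):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, fc)) / len(actual))

def mae(actual, fc):
    return sum(abs(a - f) for a, f in zip(actual, fc)) / len(actual)

def mape(actual, fc):
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, fc)) / len(actual)

# Hypothetical quarterly data (m = 4); the last 8 observations form the test set.
y = [10, 14, 8, 12, 11, 15, 9, 13, 12, 16, 10, 14, 13, 17, 11, 15, 14, 18, 12, 16]
m, h = 4, 8
train, test = y[:-h], y[-h:]

forecasts = {
    "mean": mean_fc(train, h),
    "naive": naive_fc(train, h),
    "snaive": snaive_fc(train, h, m),
    "drift": drift_fc(train, h),
}
results = {
    name: {"RMSE": rmse(test, fc), "MAE": mae(test, fc), "MAPE": mape(test, fc)}
    for name, fc in forecasts.items()
}
best = min(results, key=lambda name: results[name]["RMSE"])

# Tabulate the test-set accuracy of each benchmark.
print(f"{'method':<8}{'RMSE':>8}{'MAE':>8}{'MAPE':>8}")
for name, row in results.items():
    print(f"{name:<8}{row['RMSE']:>8.2f}{row['MAE']:>8.2f}{row['MAPE']:>8.2f}")
```

A table like this, with more than one error measure, is what the hint in the question is pointing towards; the choice of benchmark should then be justified from it.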
Question 4
Can you construct better forecasting method(s) for your data? Evaluate the method(s) over the test set and compare them to the benchmarks. Give a brief description of these (no more than 50 words each). This question requires you to think about and only use tools you have acquired so far in this unit. Only materials from up to and including Chapter 5 can be used. (Notice that more weight is now placed on this part of the assignment) (14 marks)
Marks breakdown:
- Able to recognise existing problem and discuss (4m)
- Description of the new method(s) and how the new method overcomes the existing problem (6m)
- Bonus (2m) if more than one method was applied.
- Present results for the new method (4m)
- REMEMBER: Minus 2 overall if exceeding word limits.
Expectation:
Most of the series include both trend and seasonality. However, none of the benchmark methods can capture both components in the data. Any reasonable approach for combining these two methods, or other practical approaches to improve forecasting performance were acceptable.
Common errors:
- Many students used methods not permitted by the question (such as exponential smoothing).
- When students could not invent a better method for their data, the justification of why the benchmark methods were ideal was commonly missing.
- No elaboration on the suggested new method.
- Some groups didn’t attempt to evaluate their proposed method or didn’t present any results for the method.
- What is the existing problem? [-2m]
- Missing discussion on the existing problem. [-2m]
- Missing description for the new method. [-6m]
- Missing results for the new method. [-4m]
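One possible way of combining a seasonal benchmark with a trend component, sketched here purely as an illustration and not as the expected answer, is a seasonal naive forecast with an added drift term estimated from the average year-on-year change. The function name `snaive_drift_fc` and the data are hypothetical, and the code is plain Python rather than fpp3.

```python
def snaive_drift_fc(train, h, m):
    # Average change over one full seasonal period (year-on-year change).
    diffs = [train[t] - train[t - m] for t in range(m, len(train))]
    c = sum(diffs) / len(diffs)
    fc = []
    for i in range(h):
        k = i // m  # number of complete seasonal periods already forecast
        same_season = train[len(train) - m + (i % m)]
        fc.append(same_season + (k + 1) * c)
    return fc

# Hypothetical quarterly data with both trend and seasonality (m = 4).
y = [10, 14, 8, 12, 11, 15, 9, 13, 12, 16, 10, 14, 13, 17, 11, 15, 14, 18, 12, 16]
train, test = y[:-8], y[-8:]
fc = snaive_drift_fc(train, 8, 4)
```

On this artificial series the combined method reproduces the test set exactly, because the data were built as a fixed seasonal pattern plus a constant yearly increase. Real data will not behave this neatly, so the method still needs to be evaluated on the test set and compared with the benchmarks.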
Question 5
Use time series cross-validation to select the best of the methods you have so far considered (you may only consider the most appropriate of the four benchmarks). Tabulate your results. How do these compare with the test set? Comment on the advantages and/or disadvantages of using this method (no more than 100 words). (10 marks)
Marks breakdown:
- Appropriate application of stretch_tsibble() or another version (4m)
- Tabulate results – ideally you want another column on the table above with CV results (3m).
- Comment on its advantages (use all/most/much more data than test set) and disadvantages (eg: slow) (3m)
- REMEMBER: Minus 1 overall if exceeding word limits.
Common errors:
- Results were not tabulated.
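The idea behind time series cross-validation can be sketched in plain Python as an expanding window: start with an initial training set, forecast ahead, then add one observation and repeat. This loosely mirrors what stretch_tsibble() sets up in fpp3, but the function `cv_rmse`, its arguments, and the data below are assumptions made for this example only.

```python
import math

def naive_fc(train, h):
    return [train[-1]] * h

def snaive_fc(train, h, m=4):
    return [train[len(train) - m + (i % m)] for i in range(h)]

def cv_rmse(y, forecaster, init, h=1, step=1):
    """RMSE of h-step-ahead forecasts over an expanding training window."""
    sq_errors = []
    for end in range(init, len(y) - h + 1, step):
        train = y[:end]              # all observations up to the forecast origin
        fc = forecaster(train, h)    # forecasts for steps 1..h
        actual = y[end + h - 1]      # the observation h steps after the origin
        sq_errors.append((actual - fc[-1]) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Hypothetical quarterly data (m = 4).
y = [10, 14, 8, 12, 11, 15, 9, 13, 12, 16, 10, 14, 13, 17, 11, 15, 14, 18, 12, 16]
cv_results = {
    "naive": cv_rmse(y, naive_fc, init=8),
    "snaive": cv_rmse(y, snaive_fc, init=8),
}
for name, value in cv_results.items():
    print(f"{name:<8}{value:>8.2f}")
```

Each method is evaluated over many forecast origins rather than a single test set, which is the main advantage to comment on. The cost is that the model is refitted at every origin, which is why the approach can be slow.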
Question 6
For the best method, do a residual analysis. Comment on these. What did your forecasting method miss? (no more than 150 words). (4 marks)
Mark breakdown:
- Comment on the residuals plot: are the average residuals approximately zero? (1m)
- Comment on the histogram for normality. Note that normality is only relevant for prediction intervals; students may note this. (1m)
- Comment on the ACF. Discuss whether each spike is significant or due to chance. (1m)
- Comment on the Ljung-Box test (1m)
- REMEMBER: Minus 1 overall if exceeding word limits.
Expectation:
- Plot the residuals and comment on the mean, variance and correlation.
- Plot the histogram and comment on the mean and the normality. (A bell-shaped curve by itself does not imply normality. A bell-shaped curve centred on zero and without long tails suggests the residuals are approximately normally distributed with zero mean.)
- Plot the ACF and comment.
- Do a Ljung-Box test and comment.
Common errors:
- Significant spikes decreasing slowly in the ACF indicate the trend is left in the residuals. Hence the forecasts missed the trend component in the data.
- Significant spikes at lags 12 and 24 (for monthly data) in the ACF indicate the seasonality is left in the residuals. Hence the forecasts missed the seasonal component in the data.
- Failing to state why there is autocorrelation, or failing to relate it to the ACF or Ljung-Box test.
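The link between the ACF spikes and the Ljung-Box test can be shown with a small plain-Python sketch. It computes the residual autocorrelations, the approximate ±1.96/√T significance bounds, and the Ljung-Box statistic Q* = T(T+2) Σ r_k²/(T−k). The residuals are invented to contain leftover quarterly seasonality. The p-value helper uses a closed form that is only valid for an even number of degrees of freedom, and taking the degrees of freedom equal to the number of lags is an assumption of this example.

```python
import math

def acf(resid, max_lag):
    """Sample autocorrelations r_1..r_max_lag of the residuals."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [e - mean for e in resid]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

def ljung_box(resid, lag):
    """Ljung-Box statistic using the first `lag` autocorrelations."""
    n = len(resid)
    r = acf(resid, lag)
    return n * (n + 2) * sum(r[k - 1] ** 2 / (n - k) for k in range(1, lag + 1))

def chi2_sf_even(x, df):
    """P(X > x) for a chi-squared variable; closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

# Hypothetical residuals with a repeating quarterly pattern (T = 24, m = 4).
resid = [2.0, -1.0, -2.0, 1.0] * 6
r = acf(resid, 8)
bound = 1.96 / math.sqrt(len(resid))
significant_lags = [k + 1 for k, rk in enumerate(r) if abs(rk) > bound]
q_stat = ljung_box(resid, 8)          # lag = 2m for seasonal data
p_value = chi2_sf_even(q_stat, 8)
```

Here the spike at lag 4 lies well outside the bounds and the Ljung-Box p-value is small, so both point to the same conclusion: seasonality is left in the residuals. A good answer makes that connection explicitly.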
Question 7
Generate forecasts for the next 2 years (future) from the method you have classified as best and plot them. Evaluate these visually and comment on the point and prediction intervals (no more than 100 words). (6 marks)
Marks Breakdown:
- Plot forecasts for h=24 on original scale. The x-axis and y-axis label must be appropriate, else deduct 1m. (2m)
- Comment on the point forecasts. (2m)
- Comment on the prediction intervals (2m)
- Comments need to be based on a visual evaluation of the plot; deduct 2m if not.
- REMEMBER: Minus 1 overall if exceeding word limits.
Expectation:
- Plot the forecasts as a continuation of the data.
- Plot the forecasts on the original scale.
- Adjust the x-axis limit to display all the point forecasts on the plot.
- Comment on the forecasts.
Common errors:
- Labelling issue
Question 8
Suppose that your group was a forecasting consulting group and you were working for a client. Give your group a name (add that to the subtitle above). Report your work here summarising and critically (think about assumption, limitations, extensions) evaluating the forecasts you are generating (no more than 200 words). (5 marks)
Mark breakdown:
- Name (1m).
- The rest is a little open ended. See how deeply each group thought about what they are doing. (4m).
REMEMBER: Minus 1 overall if exceeding word limits.
Generally good. Most tried to evaluate their models critically.