Time Series and Forecasting

2025-09-30

Prerequisites

Required Packages

Let’s start by loading the necessary packages for our analysis:

# Load required packages
library(fpp3)      # For forecasting using tidyverts
library(tidyverse) # For data manipulation and visualization
library(lubridate) # For date handling

Additional Useful Packages

# Additional useful packages
library(seasonal)  # For seasonal adjustment
library(forecast)  # Traditional forecasting methods
library(tsibble)   # For time series tibbles

Part 1: Getting Started

What is Time Series?

Definition and Key Features

  • Time series: A sequence of observations taken sequentially in time
  • Each observation is associated with a particular time
  • Examples: stock prices, temperature readings, sales data, population counts
  • Key characteristic: temporal ordering matters

Components of Time Series

  • Trend: Long-term increase or decrease in data
  • Seasonality: Pattern that repeats over fixed periods
  • Cycles: Pattern that repeats over varying lengths
  • Irregular: Random fluctuations

Basic Time Series in R

Creating Time Series Objects

# Using ts() function
# ts_data <- ts(data, start = c(2020, 1), frequency = 12)

# Using tsibble for modern approach
library(tsibble)
tsbl_data <- tsibble(
  Month = yearmonth("2020 Jan") + 0:23,
  Sales = rnorm(24, 100, 10),
  index = Month
 )
tsbl_data
# A tsibble: 24 x 2 [1M]
      Month Sales
      <mth> <dbl>
 1 2020 Jan 114. 
 2 2020 Feb 101. 
 3 2020 Mar  90.5
 4 2020 Apr 105. 
 5 2020 May 110. 
 6 2020 Jun  96.1
 7 2020 Jul  71.3
 8 2020 Aug 101. 
 9 2020 Sep  96.1
10 2020 Oct 112. 
# ℹ 14 more rows

Example: Australian GDP

# Load built-in dataset
data(global_economy)
# Filter for Australia
aus_economy <- global_economy |>   filter(Country == "Australia")

# View the data structure
(aus_economy) |> slice_sample(n = 8)
# A tsibble: 8 x 9 [1Y]
# Key:       Country [1]
  Country   Code   Year     GDP Growth    CPI Imports Exports Population
  <fct>     <fct> <dbl>   <dbl>  <dbl>  <dbl>   <dbl>   <dbl>      <dbl>
1 Australia AUS    1986 1.82e11   4.06  44.5     18.1    15.0   16018400
2 Australia AUS    1964 2.38e10   6.98   8.40    13.8    14.9   11167000
3 Australia AUS    1981 1.77e11   3.34  30.0     16.7    14.9   14927000
4 Australia AUS    1990 3.11e11   3.56  59.8     17.1    15.1   17065100
5 Australia AUS    1996 4.00e11   3.88  69.4     19.4    18.9   18311000
6 Australia AUS    1973 6.37e10   2.61  12.5     11.0    14.2   13380000
7 Australia AUS    2016 1.21e12   2.83 113.      21.5    19.3   24210809
8 Australia AUS    1994 3.22e11   3.98  64.6     18.5    18.0   17855000

What is Forecasting?

Definition and Purpose

  • Forecasting: Predicting future values based on historical data
  • Goal: Make informed decisions about the future
  • Applications in public health:
    • Disease outbreak prediction
    • Healthcare resource planning
    • Mortality and morbidity projections

Types of Forecasting

  • Point forecasts: Single predicted value
  • Interval forecasts: Range of possible values
  • Probability forecasts: Probability distributions

Forecasting Steps

  1. Problem definition
  2. Gather information
  3. Preliminary analysis
  4. Choose and fit models
  5. Use and evaluate forecasting model

Part 2: Time Series Analysis

Time Series Graphics

Basic Time Plots

# Simple time plot
aus_economy |>
  autoplot(GDP) +
  labs(title = "Australian GDP",
       y = "GDP (billions USD)")

Seasonal Plots

# Seasonal plot for monthly data
aus_retail <- aus_retail |>
  filter(State == "Victoria", Industry == "Food retailing")

aus_retail |>
  gg_season(Turnover, labels = "both") +
  labs(y = "Turnover (millions AUD)",
       title = "Seasonal plot: Food retail sales")

Seasonal Subseries Plots

aus_retail |>
  gg_subseries(Turnover) +
  labs(y = "Turnover (millions AUD)",
       title = "Seasonal subseries plot")

Time Series Decomposition

Classical Decomposition

Additive model: \(y_t = T_t + S_t + E_t\)

Multiplicative model: \(y_t = T_t \times S_t \times E_t\)

Where:

  • \(y_t\) = observed value at time t
  • \(T_t\) = trend component
  • \(S_t\) = seasonal component
  • \(E_t\) = error/irregular component

STL Decomposition

# STL decomposition
aus_retail |>
  model(stl = STL(Turnover)) |>
  components() |>
  autoplot() +
  labs(title = "STL decomposition of retail sales")

Time Series Features

Measures of Strength

# Calculate features
aus_retail |>
  features(Turnover, list(
    trend_strength = feat_stl,
    seasonal_strength = feat_stl
  ))
# A tibble: 1 × 20
  State    Industry       trend_strength_trend_strength trend_strength_seasona…¹
  <chr>    <chr>                                  <dbl>                    <dbl>
1 Victoria Food retailing                         0.999                    0.959
# ℹ abbreviated name: ¹​trend_strength_seasonal_strength_year
# ℹ 16 more variables: trend_strength_seasonal_peak_year <dbl>,
#   trend_strength_seasonal_trough_year <dbl>, trend_strength_spikiness <dbl>,
#   trend_strength_linearity <dbl>, trend_strength_curvature <dbl>,
#   trend_strength_stl_e_acf1 <dbl>, trend_strength_stl_e_acf10 <dbl>,
#   seasonal_strength_trend_strength <dbl>,
#   seasonal_strength_seasonal_strength_year <dbl>, …

Autocorrelation Function (ACF)

Formula: \(r_k = \frac{\sum_{t=k+1}^T (y_t - \bar{y})(y_{t-k} - \bar{y})}{\sum_{t=1}^T (y_t - \bar{y})^2}\)

# ACF plot
aus_retail |>
  ACF(Turnover, lag_max = 24) |>
  autoplot() +
  labs(title = "Autocorrelation function")

The Forecaster’s Toolbox

Simple Methods

Mean Method

\(\hat{y}_{T+h|T} = \bar{y}\)

Naïve Method

\(\hat{y}_{T+h|T} = y_T\)

Seasonal Naïve

\(\hat{y}_{T+h|T} = y_{T+h-m(k+1)}\)

Drift Method

\(\hat{y}_{T+h|T} = y_T + \frac{h}{T-1}\sum_{t=2}^T (y_t - y_{t-1})\)

Implementation in R

# Fit simple models
fit <- aus_retail |>
  model(
    Mean = MEAN(Turnover),
    Naive = NAIVE(Turnover),
    Seasonal_naive = SNAIVE(Turnover),
    Drift = RW(Turnover ~ drift())
  )
# Generate forecasts
fc <- fit |> forecast(h = 12)

# Plot forecasts
fc |> autoplot(aus_retail)

Part 3: Advanced Methods

Exponential Smoothing

Simple Exponential Smoothing

Forecast equation: \(\hat{y}_{t+1|t} = \alpha y_t + \alpha(1-\alpha) y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \cdots\)

Smoothing equation: \(\ell_t = \alpha y_t + (1-\alpha)\ell_{t-1}\)

Where \(0 \leq \alpha \leq 1\) is the smoothing parameter.

Holt’s Linear Trend

Forecast equation: \(\hat{y}_{t+h|t} = \ell_t + h b_t\)

Level equation: \(\ell_t = \alpha y_t + (1-\alpha)(\ell_{t-1} + b_{t-1})\)

Trend equation: \(b_t = \beta(\ell_t - \ell_{t-1}) + (1-\beta)b_{t-1}\)

Holt-Winters Seasonal

Additive seasonality: - \(\hat{y}_{t+h|t} = \ell_t + h b_t + s_{t+h-m(k+1)}\) - \(s_t = \gamma(y_t - \ell_{t-1} - b_{t-1}) + (1-\gamma)s_{t-m}\)

# Fit exponential smoothing models
fit_ets <- aus_retail |>
  model(ETS(Turnover))
# View model details
report(fit_ets)
Series: Turnover 
Model: ETS(M,A,M) 
  Smoothing parameters:
    alpha = 0.2611645 
    beta  = 0.005957878 
    gamma = 0.07008308 

  Initial states:
     l[0]     b[0]     s[0]     s[-1]     s[-2]    s[-3]    s[-4]    s[-5]
 313.2514 3.284057 1.020786 0.9421796 0.9914176 1.175902 1.021539 1.015409
     s[-6]     s[-7]     s[-8]     s[-9]    s[-10]    s[-11]
 0.9618333 0.9807726 0.9727921 0.9532951 0.9873115 0.9767631

  sigma^2:  6e-04

     AIC     AICc      BIC 
5552.291 5553.738 5621.805 
# Generate forecasts
fc_ets <- fit_ets |> forecast(h = 12)
fc_ets |> autoplot(aus_retail)

ARIMA Models

Components

  • AR (Autoregressive): \(y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t\)
  • I (Integrated): Differencing to achieve stationarity
  • MA (Moving Average): \(y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q}\)

ARIMA(p,d,q) Model

\((1-\phi_1B-\cdots-\phi_pB^p)(1-B)^d y_t = c + (1+\theta_1B+\cdots+\theta_qB^q)\varepsilon_t\)

Seasonal ARIMA

ARIMA(p,d,q)(P,D,Q)\(_m\) includes: - Non-seasonal part: ARIMA(p,d,q) - Seasonal part: ARIMA(P,D,Q) with period m

# Fit ARIMA model
fit_arima <- aus_retail |>
  model(ARIMA(Turnover))
# View model
report(fit_arima)
Series: Turnover 
Model: ARIMA(3,1,2)(0,1,1)[12] 

Coefficients:
         ar1     ar2     ar3      ma1      ma2     sma1
      0.0626  0.3168  0.3918  -0.7451  -0.2260  -0.4636
s.e.  0.1241  0.0646  0.0558   0.1324   0.1274   0.0452

sigma^2 estimated as 867:  log likelihood=-2054.3
AIC=4122.59   AICc=4122.86   BIC=4151
# Generate forecasts
fc_arima <- fit_arima |> forecast(h = 12)
fc_arima |> autoplot(aus_retail)

Model Selection

# Compare multiple models
fit_all <- aus_retail |>
  model(
    ETS = ETS(Turnover),
    ARIMA = ARIMA(Turnover),
    SNAIVE = SNAIVE(Turnover)
  )
# Model accuracy
accuracy(fit_all)
# A tibble: 3 × 12
  State    Industry      .model .type    ME  RMSE   MAE    MPE  MAPE  MASE RMSSE
  <chr>    <chr>         <chr>  <chr> <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>
1 Victoria Food retaili… ETS    Trai…  1.88  27.3  20.3 0.0787  1.81 0.301 0.337
2 Victoria Food retaili… ARIMA  Trai…  1.87  28.8  21.2 0.0753  1.84 0.315 0.355
3 Victoria Food retaili… SNAIVE Trai… 65.4   81.1  67.3 5.60    5.74 1     1    
# ℹ 1 more variable: ACF1 <dbl>
# Cross-validation
fit_all |>
  forecast(h = 12) |>
  accuracy(aus_retail)
# A tibble: 3 × 12
  .model State    Industry .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
  <chr>  <chr>    <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ARIMA  Victoria Food re… Test    NaN   NaN   NaN   NaN   NaN   NaN   NaN    NA
2 ETS    Victoria Food re… Test    NaN   NaN   NaN   NaN   NaN   NaN   NaN    NA
3 SNAIVE Victoria Food re… Test    NaN   NaN   NaN   NaN   NaN   NaN   NaN    NA

Summary

Part 1 Summary: Getting Started

  • Time series: Sequential observations with temporal ordering
  • Components: Trend, seasonality, cycles, irregular
  • Forecasting: Predicting future values for decision-making
  • R tools: fpp3 package with tidyverse approach

Part 2 Summary: Analysis Methods

  • Graphics: Time plots, seasonal plots, decomposition plots
  • Decomposition: STL method for trend and seasonal components
  • Features: ACF, trend strength, seasonal strength
  • Simple methods: Mean, naïve, seasonal naïve, drift

Part 3 Summary: Advanced Methods

  • Exponential smoothing: Weighted averages with ETS framework
  • ARIMA: Autoregressive integrated moving average models
  • Model selection: AIC, cross-validation, accuracy measures

Suggested References

Primary Resource

  • Hyndman, R.J., & Athanasopoulos, G. (2021). Forecasting: Principles and practice (3rd ed.). OTexts: Melbourne, Australia. Available at: https://otexts.com/fpp3/

R Packages

  • fpp3: Data and functions for “Forecasting: Principles and Practice”
  • forecast: Classical forecasting methods
  • seasonal: Seasonal adjustment procedures

Additional Reading

  • Shumway, R.H., & Stoffer, D.S. (2017). Time series analysis and its applications: With R examples
  • Brockwell, P.J., & Davis, R.A. (2016). Introduction to time series and forecasting

Thank You

Questions and Answers

Contact: Kamarul Imran Musa

Resources: