In this post, we discuss time series analysis in the R programming language. A time series is a sequence of data points collected at regular intervals over time. Unlike in standard regression, the order of observations matters because yesterday’s value often influences today’s.
When working with time series data, we typically look for three components:
- Trend: The long-term increase or decrease.
- Seasonality: Patterns that repeat at fixed intervals (e.g., high ice cream sales every summer).
- Noise (Error): Random variation that cannot be explained by trend or seasonality.
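To make the three components concrete, here is a minimal base-R sketch that builds a synthetic quarterly series by adding a trend, a repeating seasonal pattern, and random noise. All of the numbers (starting level, slope, seasonal offsets, noise level) are made-up assumptions for illustration, not values from any real dataset.

```r
set.seed(42)                      # make the random noise reproducible

n <- 40                           # 10 years of quarterly observations
t <- seq_len(n)

trend    <- 100 + 0.5 * t                            # long-term increase
seasonal <- rep(c(10, -5, -8, 3), length.out = n)    # same pattern every 4 quarters
noise    <- rnorm(n, mean = 0, sd = 2)               # unexplained random variation

# Additive model: observed value = trend + seasonality + noise
y <- trend + seasonal + noise

# Store as a quarterly time series starting in 2000 Q1 and plot it
series <- ts(y, start = c(2000, 1), frequency = 4)
plot(series, main = "Synthetic series: trend + seasonality + noise")
```

Changing the slope or the noise standard deviation and re-plotting is a quick way to see how each component shapes the final series.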
Setting Environment
For this tutorial, we will use the fpp3 package (the companion to Forecasting: Principles and Practice), which loads tsibble for data handling, feasts for analysis, and fable for forecasting. If fpp3 is not installed on your system, install it first and then load it:
# Install and load necessary libraries
install.packages("fpp3")
library(fpp3)
Creating and Visualizing a Tsibble
The fpp3 ecosystem stores time series in a specialized data frame called a tsibble, which keeps track of the time index alongside the measurements. Let us use the built-in aus_production dataset, which tracks quarterly production in Australia.
# Focus on Beer production
beer_data <- aus_production %>%
  select(Quarter, Beer)
# Visualize the data
autoplot(beer_data, Beer) +
  labs(title = "Quarterly Beer Production in Australia",
       y = "Megalitres", x = "Year")
In the plot above, look for two things:
- Trend: Look past the “wiggle.” Is the overall level generally going up or down over decades?
- Seasonality: Notice the sharp peaks and valleys that occur at the same time every year. This is a classic seasonal pattern.
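The example above relies on a dataset that is already a tsibble. For your own data, a regular data frame has to be converted first. The following is a minimal sketch, assuming a small hypothetical table of quarterly sales; the names sales and sales_ts and all of the values are invented for illustration.

```r
library(fpp3)

# Hypothetical quarterly sales figures (made-up values)
sales <- tibble(
  Quarter = c("2023 Q1", "2023 Q2", "2023 Q3", "2023 Q4"),
  Sales   = c(120, 135, 160, 110)
)

# Convert the text column to a quarterly time index,
# then declare it as the index of the tsibble
sales_ts <- sales %>%
  mutate(Quarter = yearquarter(Quarter)) %>%
  as_tsibble(index = Quarter)

sales_ts
```

Declaring the index explicitly is what lets functions such as autoplot() and model() know the data is quarterly.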
STL Decomposition
To see what is really going on under the hood, we decompose the series.
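The code that produces the decomposition is not shown in this section. A minimal sketch, assuming an STL decomposition with the default settings in feasts (the text does not specify any window parameters), could look like this:

```r
library(fpp3)

beer_data <- aus_production %>%
  select(Quarter, Beer)

# Fit an STL decomposition with default settings (an assumption)
dcmp <- beer_data %>%
  model(stl = STL(Beer))

# Extract the components and plot one panel per component
beer_components <- components(dcmp)
autoplot(beer_components)
```

components() returns a table, so the trend, seasonal, and remainder columns can also be inspected directly rather than only plotted.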
The STL decomposition plot breaks the series into three components:
- Trend: Shows the “smoothed” direction of beer production, with the seasonal pattern removed.
- Season(al): Shows the isolated 4-quarter cycle.
- Remainder: This is the “mess” left over. If you see large spikes here, it means an outlier occurred (like a strike or a sudden economic shift).
Checking for Stationarity
Most forecasting models (like ARIMA) assume the data is stationary, meaning its mean and variance do not change over time. A quick visual check is the ACF (Autocorrelation Function) plot.
beer_data %>% ACF(Beer) %>% autoplot()
- If the bars (lags) are very high and decrease slowly, the data is not stationary (it has a trend).
- Small bars with no clear pattern, mostly inside the dashed bounds, indicate the data is more like “White Noise.”
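If the ACF suggests the series is not stationary, a common next step is differencing. The sketch below assumes that a seasonal difference at lag 4 (one year of quarterly data) is a sensible choice for the beer series; the column name Beer_sdiff is introduced here for illustration only.

```r
library(fpp3)

beer_data <- aus_production %>%
  select(Quarter, Beer)

# Seasonal difference: each quarter minus the same quarter one year earlier.
# The first four values are NA because they have no earlier year to compare with.
beer_diff <- beer_data %>%
  mutate(Beer_sdiff = difference(Beer, lag = 4))

# Re-check the ACF on the differenced series
beer_diff %>%
  ACF(Beer_sdiff) %>%
  autoplot()
```

If the bars in this second plot are smaller and die out faster than in the original ACF, the differenced series is closer to stationary.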
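A Simple Forecast (The “Naïve” and “Seasonal Naïve” Models)
Before reaching for complex models, always start with a baseline. The Seasonal Naïve model, SNAIVE() in fable, sets each forecast to the last observed value from the same season, so next year’s first quarter is forecast to equal this year’s first quarter.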
# Fit a model
fit <- beer_data %>%
  model(SNAIVE(Beer))
# Forecast the next 2 years (8 quarters)
forecast_beer <- fit %>%
  forecast(h = "2 years")
# Plot the forecast
forecast_beer %>%
  autoplot(beer_data) +
  labs(title = "Seasonal Naive Forecast for Beer Production")
- The Blue Line: This is your point estimate (the “best guess”).
- The Shaded Areas: These are Prediction Intervals (usually 80% and 95%). If the shaded area is huge, your model is telling you, “I’m really not sure about this.”
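To see what the seasonal naïve model is doing, its point forecasts can be reproduced by hand. This is a minimal sketch that assumes the point forecasts are stored in the .mean column of the forecast table; the names last_year and manual_forecast are introduced here for illustration.

```r
library(fpp3)

beer_data <- aus_production %>%
  select(Quarter, Beer)

fit <- beer_data %>%
  model(SNAIVE(Beer))
forecast_beer <- fit %>%
  forecast(h = "2 years")

# Manual version: take the last four observed quarters and repeat them twice
last_year <- tail(beer_data$Beer, 4)
manual_forecast <- rep(last_year, times = 2)

# Compare the model's point forecasts with the manual version
all.equal(forecast_beer$.mean, manual_forecast)
```

The model adds prediction intervals on top of these point forecasts, which the manual version does not provide.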



