Time Series Decomposition

Quick Summary

Signal & Noise

Time series data is typically broken down into two pieces: signal and noise. Signal is the explainable variation in your data, in other words, the pattern in the data that can be modeled. Noise is the unexplained variation in the data that occurs randomly.

In time series analysis, we extract the signal and extrapolate it into the future, which we call forecasting. We use the noise piece of the data to build confidence intervals around those forecasts to show the uncertainty that exists in them. There are many different techniques (discussed throughout this website) for modeling the signal.

Time Series Decomposition

One way to explore time series data is to break the series down into pieces that summarize different aspects of the possible variation in the dataset. Three common aspects of time series data are trend, season, and cycle. Let’s use a dataset to help explain these. Here is a monthly view of the number of airline passengers in the United States between 1990 and 2008.

Trend

The trend component of a time series is the long-term increase or decrease in the dataset. For example, the airline dataset above shows an overall increase in the number of passengers over the years.

Season

The seasonal component of a time series is a pattern that repeats over a fixed period of time. For example, the airline dataset above shows seasonality every 12 observations (12 months), where summer and holiday times of the year look different from the rest of the year. Seasonality occurs over a fixed and known period of time.

Cycle

Cycles and seasons are sometimes confused with each other. While seasonal components occur over a fixed period of time, the cyclical component of a time series is a rise and fall in the data that does not occur with a fixed frequency. The airline dataset above doesn’t really exhibit a cyclical pattern. The drop in the number of passengers is due to the September 11 attacks, not to some business cycle. Intervention points like these, which sometimes occur in a dataset, are addressed in the dynamic regression section of the website.

A time series decomposition breaks a series down into a trend / cycle component (combined together as \(T_t\)), a seasonal component (\(S_t\)), and everything left over, the error (\(E_t\)). The structure of the decomposition is either additive:

\[ Y_t = T_t + S_t + E_t \]

or multiplicative:

\[ Y_t = T_t \times S_t \times E_t \]

For an additive seasonal structure, the magnitude of the seasonal variation around the trend / cycle remains relatively constant, as seen in the left-hand plot below. A multiplicative seasonal component has variation around the trend / cycle that changes proportionally with the level of the series, as seen in the right-hand plot below.

Left: Additive Seasonal Effect, Right: Multiplicative Seasonal Effect
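
To make the distinction concrete, here is a minimal sketch in Python that simulates one series of each type. The data here is entirely made up for illustration; it is not the airline series.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic monthly data: 16 years of an upward trend with a 12-month season
rng = np.random.default_rng(0)
t = np.arange(192)
trend = 100 + 0.5 * t
season = np.sin(2 * np.pi * t / 12)
noise = rng.normal(0, 2, size=t.size)

additive = trend + 10 * season + noise                             # Y = T + S + E
multiplicative = trend * (1 + 0.1 * season) * (1 + 0.01 * noise)   # Y = T x S x E

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].plot(additive)
axes[0].set_title("Additive Seasonal Effect")       # constant seasonal swing
axes[1].plot(multiplicative)
axes[1].set_title("Multiplicative Seasonal Effect") # swing grows with the trend
plt.show()
```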

Since our dataset is additive in nature, we will focus our analysis and examples on an additive seasonal effect. Once these components are calculated, we can isolate individual pieces to view the structure more clearly. For example, maybe we just want to see the underlying trend / cycle of the data:

Another popular use of time series decomposition is to remove the seasonal component, producing what is known as a seasonally adjusted time series. With an additive season, the seasonally adjusted series is the original series minus the seasonal component, or equivalently, the trend / cycle component plus the error component.
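
As a sketch of that calculation in Python, here is one way to seasonally adjust a series with statsmodels’ `seasonal_decompose`. The `passengers` series below is a synthetic stand-in for the airline data, not the real dataset.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic stand-in for the monthly airline passenger data, 1990-2008
idx = pd.date_range("1990-01", periods=228, freq="MS")
t = np.arange(idx.size)
passengers = pd.Series(30000 + 100 * t + 3000 * np.sin(2 * np.pi * t / 12)
                       + np.random.default_rng(1).normal(0, 500, idx.size),
                       index=idx)

decomp = seasonal_decompose(passengers, model="additive", period=12)

# Seasonally adjusted series: subtract the seasonal component (additive case)
seasonally_adjusted = passengers - decomp.seasonal
# Equivalently, the trend plus the remainder (NaNs at the ends come from the
# moving average used to estimate the trend):
# seasonally_adjusted = decomp.trend + decomp.resid
```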

Types of Decomposition

There are a variety of ways to calculate the components of a time series decomposition. Three of the most popular are:

  1. Classical Decomposition
  2. X-13 ARIMA-SEATS
  3. STL (Seasonal and Trend decomposition using LOESS)

Let’s briefly talk about each.

Classical Decomposition

Classical decomposition is a straightforward calculation for time series decomposition. It is the default calculation in SAS software. To understand how classical decomposition works, we first need to understand the idea of a rolling / moving average smoothing calculation.

A rolling window average (sometimes called a moving average) smoothing model just averages observations in windows of time.

\[ \hat{Y}_t = \frac{1}{2k + 1} \sum^k_{j = -k} y_{t+j} \]

The \(2k+1\) is called “m” and is referred to as the order of the moving average (an m-MA). Imagine we only had 5 observations in our dataset and we wanted to calculate a 3-MA. We would have the following table:

The first value of the 3-MA cannot be calculated because the first observation doesn’t have a point before it. The second value of the 3-MA is just the average of the first 3 values of the data, \((34348 + 33536 + 40578) / 3 = 36154\). The same calculation is done for the remaining observations, with the last value missing as well since there is no point after it. The larger the value of “m”, the smoother the model, as seen below:
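
The 3-MA hand calculation above is easy to reproduce with a pandas rolling window. In the sketch below, the first three values come from the example above, while the last two are made up purely to fill out the five-observation illustration.

```python
import pandas as pd

# First three values from the example above; the last two are hypothetical
y = pd.Series([34348, 33536, 40578, 38000, 36500])

# Centered 3-MA: each point averaged with its neighbors; the endpoints come
# out missing because they lack a neighbor on one side
ma3 = y.rolling(window=3, center=True).mean()
print(ma3)  # second entry is (34348 + 33536 + 40578) / 3 = 36154.0
```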

For even values of “m” we typically take moving averages of moving averages, both to smooth things out even more and so that the smoothed value is centered on an actual observation. The details aren’t shown here, but this simply amounts to taking a moving average of a previous moving average instead of the original data. For seasonal data with a season length of S, we typically build a 2xS-MA model, which uses this double smoothing approach.
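
As a sketch, the 2x12-MA (a 12-MA followed by a 2-MA) works out to a 13-term weighted moving average with weights 1/24 on the two ends and 1/12 on the middle eleven terms. Here it is applied to the synthetic `passengers` series from the earlier snippet:

```python
import numpy as np

# 2x12-MA expressed as an equivalent 13-term weighted moving average:
# weights of 1/24 on the ends and 1/12 on the middle eleven terms
weights = np.r_[0.5, np.ones(11), 0.5] / 12
trend_2x12 = passengers.rolling(window=13, center=True).apply(
    lambda w: np.dot(w, weights), raw=True)
```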

For classical decomposition, the trend component of the time series is calculated with these m-MA models. For our dataset, we would use the 2x12-MA smoothing model. Once you have the trend, the seasonal component is calculated by first removing the trend from the original data, \(Y_t - T_t\) for our additive data. From there we literally average these de-trended values for each piece of the seasonal pattern to get the seasonal series, \(S_t\). For example, in our dataset we would average all of the de-trended January values to get our seasonal value for January, and we would continue this for each month. This highlights one of the limitations of the classical decomposition approach: the seasonal component can never change over time. Once you have the trend, \(T_t\), and season, \(S_t\), you subtract them from the original series to get the error, \(E_t = Y_t - T_t - S_t\).
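
The recipe is short enough to sketch by hand, continuing with the synthetic `passengers` series and the 2x12-MA `trend_2x12` from the previous snippets (statsmodels’ `seasonal_decompose`, used earlier, automates essentially this calculation):

```python
import pandas as pd

# Classical additive decomposition by hand: de-trend with the 2x12-MA trend
detrended = passengers - trend_2x12

# One seasonal value per calendar month: average the de-trended values,
# then center them so they sum to zero across the year
seasonal_idx = detrended.groupby(detrended.index.month).mean()
seasonal_idx -= seasonal_idx.mean()

# Broadcast the twelve monthly values back across the full series
seasonal = pd.Series(seasonal_idx.loc[passengers.index.month].to_numpy(),
                     index=passengers.index)

# Whatever is left over is the error component
error = passengers - trend_2x12 - seasonal
```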

X-13 ARIMA-SEATS

The X-13 ARIMA-SEATS decomposition technique was created by the United States Census Bureau and is used by many other countries in their census statistical analysis. It is the successor to the X-11 and X-12-ARIMA techniques. It uses m-MA smoothing approaches, but also implements time series modeling approaches like ARIMA (discussed in the ARIMA section of the website) and SEATS to fill in the missing values in the m-MA calculation. It does this by forecasting enough values at the end of the series, and backcasting enough values at the beginning of the series, so that the m-MA calculation has no missing values. It also uses m-MA smoothing to calculate the seasonal component.
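
statsmodels ships a wrapper around the Census Bureau’s program, though you must download and install the `x13as` binary separately; the sketch below assumes statsmodels finds it through the `X13PATH` environment variable, and the folder shown is a placeholder for wherever the binary lives on your machine.

```python
import os
from statsmodels.tsa.x13 import x13_arima_analysis

# Placeholder path to the folder containing the separately installed x13as binary
os.environ["X13PATH"] = "/path/to/x13as"

# Run X-13 ARIMA-SEATS on the synthetic `passengers` series from earlier
result = x13_arima_analysis(passengers)
seasadj = result.seasadj    # seasonally adjusted series
trend_x13 = result.trend    # trend component
```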

STL Decomposition

Seasonal and trend decomposition using LOESS (STL) uses local regression (LOESS) in place of moving averages to calculate both the trend and seasonal components of the series. LOESS is a nonlinear regression technique that applies the simplicity of linear regression over small subsets of the data to build a nonlinear smoothing curve. The smoothing curve is calculated through weighted least squares regressions on a subset of the data (a window) that moves along the whole series. Let’s look at the flow chart below:

In the first plot above, we isolate a small window (the first 12 observations). The remaining calculations are done on every window of this size in the data, similar to the m-MA approach described previously. Within this window, let’s select a single point, highlighted by the bigger dot. We then run a weighted linear regression on all the points in the window, as shown in the second plot. Any point in the data outside of the window gets no weight in the regression, and the closer an observation is to our point of interest, the more weight it gets (the darker the points above). We use that linear regression to get a predicted value for the observation, highlighted by the crossed point above. Imagine we did the exact same process for every point in our dataset, as highlighted in the last plot above. We take the predicted observations from each of the weighted linear regressions (one for every point) and “connect” them together to form our smoothed curve.
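
Here is a minimal sketch of this smoothing on its own, using the LOWESS implementation in statsmodels on the synthetic `passengers` series; the `frac` value below is an arbitrary choice controlling the window size.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

# LOESS smoothing: `frac` is the share of the data used in each local
# weighted regression (0.15 is an arbitrary illustrative choice)
x = np.arange(len(passengers))
smoothed = lowess(passengers.to_numpy(), x, frac=0.15)
fitted = smoothed[:, 1]  # lowess returns (x, fitted) pairs sorted by x
```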

STL decomposition (which is easily implemented in both R and Python) uses LOESS to calculate both the trend, \(T_t\), and seasonal, \(S_t\), components in the decomposition. This allows the seasonal component to adjust over time as compared to the fixed approach of classical decomposition.
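
As a sketch, here is the STL implementation in statsmodels applied to the synthetic `passengers` series; the `seasonal` argument (an odd integer) controls how quickly the seasonal component is allowed to change.

```python
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import STL

# STL decomposition with a 12-month season
stl = STL(passengers, period=12, seasonal=13)
res = stl.fit()

# Unlike classical decomposition, res.seasonal can vary from year to year
res.plot()  # panels for the observed, trend, seasonal, and remainder pieces
plt.show()
```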

Let’s see how to calculate each of these components with our software!