3.4 Autocorrelation

Autocorrelation Definition

As discussed in Chapter 2, covariance, and by extension autocovariance, has some undesirable characteristics. For this reason, in time series analysis we often prefer to work with the autocorrelation, which is a unitless quantity bounded by $[-1,1]$.

For any finite-variance process, the autocorrelation is simply the autocovariance normalized by the square root of the product of the variances at the two time points:

$$\rho(s,t) = \frac{\gamma(s,t)}{\sqrt{\gamma(s,s)\gamma(t,t)}}.$$

When considering stationary time series, we can employ the simplifications

$$\gamma(s,t)=\gamma(|s-t|) =\gamma(h),$$

and

$$\gamma(s,s)=\gamma(t,t)=\gamma(0).$$

As such, a stationary time series has autocorrelation given by

$$\rho(h) = \frac{\gamma(h)}{\gamma(0)}.$$

Note that $\rho(h)=\rho(-h)$ and $\rho(0)=1$.
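To make the stationary formula concrete, here is a minimal sketch of a sample autocorrelation computed with NumPy. The helper name `sample_acf`, the $1/n$ normalization of the sample autocovariance, and the simulated white-noise input are all assumptions made for illustration; they are not definitions taken from this chapter.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation rho_hat(h) for h = 0, ..., max_lag (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()  # center the series
    # Sample autocovariance gamma_hat(h), using a 1/n normalization (an assumption).
    gamma = np.array([np.dot(xc[: n - h], xc[h:]) / n for h in range(max_lag + 1)])
    # rho_hat(h) = gamma_hat(h) / gamma_hat(0)
    return gamma / gamma[0]

rng = np.random.default_rng(0)
x = rng.normal(size=500)  # hypothetical example data: white noise
rho = sample_acf(x, max_lag=5)
print(rho)
```

Because $\rho(h)=\rho(-h)$, only non-negative lags need to be computed, and the first entry is 1 by construction.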

Autocorrelation is Positive Semidefinite

The autocorrelation matrix $\mathbf{P}$ is related to the autocovariance matrix $\boldsymbol{\Gamma}$ by

$$\mathbf{P} = \frac{1}{\gamma(0)}\boldsymbol{\Gamma}.$$

$\gamma(0)$ is the variance, which is strictly positive for any non-degenerate process, so $\mathbf{v}^T\mathbf{P}\mathbf{v}$ differs from $\mathbf{v}^T\boldsymbol{\Gamma}\mathbf{v}$ only by the positive constant factor $\frac{1}{\gamma(0)}$. As a result, since $\boldsymbol{\Gamma}$ is positive semidefinite, $\mathbf{P}$ must be as well.
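The argument can be checked numerically. The sketch below assumes a simulated white-noise series and a $1/n$-normalized sample autocovariance (both choices are illustrative assumptions). It builds the matrices with entries $\gamma(|i-j|)$ and $\rho(|i-j|)$ and inspects the quadratic forms and the smallest eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=300)  # hypothetical example data
n = len(x)
xc = x - x.mean()

max_lag = 10
# Sample autocovariance with 1/n normalization (an assumption of this sketch).
gamma = np.array([np.dot(xc[: n - h], xc[h:]) / n for h in range(max_lag + 1)])
rho = gamma / gamma[0]

# Matrices whose (i, j) entry depends only on |i - j|.
idx = np.abs(np.subtract.outer(np.arange(max_lag + 1), np.arange(max_lag + 1)))
Gamma = gamma[idx]  # autocovariance matrix
P = rho[idx]        # autocorrelation matrix, equal to Gamma / gamma(0)

# The quadratic forms differ only by the factor 1 / gamma(0).
v = rng.normal(size=max_lag + 1)
qf_Gamma = v @ Gamma @ v
qf_P = v @ P @ v

# A symmetric matrix is positive semidefinite when its eigenvalues are all >= 0.
min_eig_P = np.linalg.eigvalsh(P).min()
print(qf_P, qf_Gamma / gamma[0], min_eig_P)
```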

Autocorrelation Plots

It is often instructive to plot the autocorrelation of a time series. statsmodels has a nice built-in function to create plots:

from statsmodels.graphics.tsaplots import plot_acf
plot_acf(my_time_series, lags=20)

where the parameter lags controls the maximum value of $h$. lags=20 is generally a good starting choice for most scenarios, though for longer patterns such as hourly data with a 24-hour period you may want to set lags to something like 72 or 96 so that the plot spans three or four full periods.
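To see why a larger lag window matters for periodic data, consider the following sketch. The series is hypothetical (a 24-hour sinusoid plus white noise, 60 days long), and the autocorrelation is computed directly with NumPy using a $1/n$ normalization, which is an assumption here and may differ in detail from what statsmodels computes.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 24 * 60  # 60 days of hourly observations (assumed length)
t = np.arange(n)
# Hypothetical hourly series: a 24-hour sinusoid plus white noise.
x = np.sin(2 * np.pi * t / 24) + rng.normal(scale=0.5, size=n)

xc = x - x.mean()
max_lag = 72
gamma = np.array([np.dot(xc[: n - h], xc[h:]) / n for h in range(max_lag + 1)])
rho = gamma / gamma[0]

# Half a period apart the series is negatively correlated;
# whole periods apart it is strongly positively correlated.
for h in (12, 24, 48, 72):
    print(f"lag {h:2d}: rho = {rho[h]: .3f}")
```

With lags=20 the plot would end before the first seasonal peak at lag 24, so the repeating pattern would not be visible. Passing the same series to plot_acf with lags=72 should show about three full cycles.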

plot_acf has other arguments that we will discuss once we’ve covered more theory. For now, let’s take a look at a couple of different autocorrelation plots to get a feeling for how they behave.

Figure 1: Sample autocorrelation plot from statsmodels with sudden cutoff point. Note that lag 0 is always equal to 1.

Figure 1 shows two significant lags (plus the value of 1 at $h=0$) and falls to statistical insignificance thereafter: the remaining bars lie inside the blue shaded band. Later on we will see that this is a hallmark characteristic of a moving average process.
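A cutoff of this kind can be reproduced with simulated data. The sketch below is not the data behind Figure 1. It assumes a moving average process of order 2 with arbitrarily chosen coefficients, and it uses the rough white-noise band $\pm 1.96/\sqrt{n}$ as a stand-in for the shaded region, which is not necessarily the band that plot_acf draws.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
theta1, theta2 = 0.7, 0.5  # hypothetical MA(2) coefficients
w = rng.normal(size=n + 2)  # white noise
# x_t = w_t + theta1 * w_{t-1} + theta2 * w_{t-2}
x = w[2:] + theta1 * w[1:-1] + theta2 * w[:-2]

xc = x - x.mean()
max_lag = 20
gamma = np.array([np.dot(xc[: n - h], xc[h:]) / n for h in range(max_lag + 1)])
rho = gamma / gamma[0]

# Theoretical MA(2) autocorrelations at lags 1 and 2; zero beyond lag 2.
denom = 1 + theta1**2 + theta2**2
rho1_theory = (theta1 + theta1 * theta2) / denom
rho2_theory = theta2 / denom

band = 1.96 / np.sqrt(n)  # rough white-noise band (an assumption)
print(f"lag 1: sample {rho[1]:.3f}, theory {rho1_theory:.3f}")
print(f"lag 2: sample {rho[2]:.3f}, theory {rho2_theory:.3f}")
print(f"largest |rho| beyond lag 2: {np.abs(rho[3:]).max():.3f} (band {band:.3f})")
```

Lags 1 and 2 should sit well outside the band, while later lags hover near zero. An occasional later lag may still cross this rough band by chance.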

Figure 2: Sample autocorrelation plot from statsmodels with exponential decay.

Figure 2 demonstrates an exponentially decaying autocorrelation that remains statistically significant even at $h=20$. In later chapters we will see that this behavior corresponds to an autoregressive process.
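The same kind of experiment illustrates the decaying pattern. This sketch is not the data behind Figure 2. It assumes a first-order autoregressive process with a hypothetical coefficient $\phi = 0.9$, for which the theoretical autocorrelation is $\rho(h) = \phi^h$.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
phi = 0.9  # hypothetical AR(1) coefficient
w = rng.normal(size=n)  # white noise

# x_t = phi * x_{t-1} + w_t
x = np.zeros(n)
x[0] = w[0]
for t in range(1, n):
    x[t] = phi * x[t - 1] + w[t]

xc = x - x.mean()
max_lag = 20
gamma = np.array([np.dot(xc[: n - h], xc[h:]) / n for h in range(max_lag + 1)])
rho = gamma / gamma[0]

# Compare the sample autocorrelation with the geometric decay phi**h.
for h in (1, 5, 10, 20):
    print(f"lag {h:2d}: sample {rho[h]:.3f}, theory {phi**h:.3f}")
```

The sample values shrink gradually and do not cut off abruptly, in contrast to the moving average pattern described for Figure 1.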