6.2 Autoregressive Models

We begin our journey into ARMA models by discussing autoregressive, or AR models (the “AR” in ARMA). As the name implies, AR models can be thought of as the application of linear regression to time series by regressing a time series onto lagged versions of itself[1].

Random Walk to AR(1)

Recall that a random walk is a model in which each time step’s value is determined by the previous time step’s value plus random noise, i.e.

$$x_t = x_{t-1} + w_t. \tag{1}$$

A random walk is not stationary because its variance grows with time, so the stationary AR framework does not apply to it. However, we could imagine a slightly different time series defined as

$$x_t = \phi x_{t-1} + w_t, \qquad |\phi|<1. \tag{2}$$
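The contrast between the two models can be made concrete by tracking the variance directly. The sketch below is illustrative only: it assumes both series start from a fixed value $x_0 = 0$ and that the noise terms are uncorrelated with variance `sigma2`, so that $\text{Var}(x_t) = \phi^2\,\text{Var}(x_{t-1}) + \sigma_w^2$. The function name `variance_path` is a hypothetical helper, not part of any library.

```python
def variance_path(phi, sigma2=1.0, n_steps=200):
    """Var(x_t) for x_t = phi * x_{t-1} + w_t, started from a fixed x_0 = 0."""
    variances = [0.0]
    for _ in range(n_steps):
        # w_t is uncorrelated with x_{t-1}, so the two variances simply add.
        variances.append(phi ** 2 * variances[-1] + sigma2)
    return variances

random_walk = variance_path(phi=1.0)  # grows linearly: Var(x_t) = t * sigma2
ar1 = variance_path(phi=0.9)          # levels off near sigma2 / (1 - phi**2)

print(random_walk[-1])
print(ar1[-1], 1.0 / (1 - 0.9 ** 2))
```

With $\phi = 1$ the variance keeps climbing, while with $|\phi| < 1$ it settles at a finite limit.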

AR(1) Stationarity

Eq. (2) is a first order autoregressive process, denoted as AR(1). We will demonstrate that Eq. (2) describes a stationary process by iterating backwards and examining the properties of the resulting iterated series:

$$\begin{split}
x_t &= \phi x_{t-1} + w_t\\
&=\phi (\phi x_{t-2}+ w_{t-1}) + w_t\\
&=\phi^2 x_{t-2} + \phi w_{t-1} + w_t\\
&=\phi^2(\phi x_{t-3} + w_{t-2}) + \phi w_{t-1} + w_t\\
&\ldots\\
&=\sum_{j=0}^{\infty} \phi^j w_{t-j}
\end{split} \tag{3}$$

In coming sections, we will learn that Eq. (3) is the infinite moving average, or $\text{MA}(\infty)$, representation of an AR process. For our purposes, we can think of it as a way to shed light on the behavior of an AR process.
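The back-substitution in Eq. (3) can be checked numerically. The following is a minimal sketch, assuming the pre-sample value is zero so that the sum over past noise terms is finite and exact. The helper names `simulate_ar1` and `ma_representation`, the seed, and the choice $\phi = 0.7$ are all illustrative.

```python
import random

def simulate_ar1(phi, noise):
    """Run x_t = phi * x_{t-1} + w_t with the pre-sample value set to 0."""
    x, prev = [], 0.0
    for w in noise:
        prev = phi * prev + w
        x.append(prev)
    return x

def ma_representation(phi, noise, t):
    """Sum of phi**j * w_{t-j} over all noise terms available up to time t."""
    return sum(phi ** j * noise[t - j] for j in range(t + 1))

rng = random.Random(0)  # fixed seed, chosen only for reproducibility
noise = [rng.gauss(0.0, 1.0) for _ in range(50)]
phi = 0.7

x = simulate_ar1(phi, noise)
max_gap = max(abs(x[t] - ma_representation(phi, noise, t)) for t in range(len(noise)))
print(max_gap)  # differences are at the level of floating-point rounding
```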

AR(1) Mean

From the last line of Eq. (3), we conclude that the mean of an AR(1) process is

$$\begin{split}
\mathbb{E}[x_t] &= \mathbb{E}\Big[\sum_{j=0}^{\infty} \phi^j w_{t-j}\Big]\\
&=\sum_{j=0}^{\infty} \phi^j \mathbb{E}[w_{t-j}]\\
&=0.
\end{split} \tag{4}$$

AR(1) Autocovariance

The autocovariance can be derived analogously. For any stationary AR(1) process with zero mean, $\gamma(h) = \text{Cov}(x_{t+h}, x_t) = \mathbb{E}[x_{t+h}\,x_t]$. Should the mean not be zero, replace $x_t$ with $x_t-\mu_x$. We can then derive $\gamma(h)$ as

$$\begin{split}
\gamma(h) &=\text{Cov}(x_{t+h}, x_t)\\
&=\mathbb{E}\Big[\Big(\sum_{j=0}^{\infty}\phi^j w_{t+h-j}\Big)\Big(\sum_{k=0}^{\infty} \phi^k w_{t-k}\Big)\Big]\\
&=\mathbb{E} [(w_{t+h} + \ldots + \phi^h w_t + \phi^{h+1} w_{t-1} + \ldots)(w_t + \phi w_{t-1} + \ldots)].
\end{split} \tag{5}$$

Noting that the covariance for the $w_t$’s is $\text{Cov}(w_i, w_j) = \delta_{ij}\sigma_w^2$, we can line up all non-zero contributions stemming from cross-terms where $w_{t+h-j}$ in the left series matches $w_{t-k}$ in the right series. This will require $h-j=-k$, i.e. $j=h+k$, giving us

$$\begin{split}
\gamma(h) &= \sigma_w^2 \sum_{k=0}^{\infty} \phi^{h+k} \phi^k\\
&=\sigma_w^2 \phi^h \sum_{k=0}^{\infty} \phi^{2k}\\
&=\sigma_w^2 \frac{\phi^h}{1-\phi^2},
\end{split} \tag{6}$$

where we have cast the autocovariance as an infinite geometric series. Provided that $\sigma_w^2$ is finite, Eq. (6) will be finite for all values of $h$. Finally, the last line of Eq. (6) depends solely on the separation $h$, completing the proof that Eq. (2) represents a stationary process.
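The geometric-series step in Eq. (6) can be sanity-checked by comparing the closed form against a truncated version of the sum. This is only a sketch: it assumes $h \geq 0$, and the function names, the truncation length `n_terms`, and the value $\phi = 0.6$ are illustrative choices.

```python
def gamma_closed_form(h, phi, sigma2=1.0):
    """Last line of Eq. (6): sigma_w^2 * phi^h / (1 - phi^2), for h >= 0."""
    return sigma2 * phi ** h / (1 - phi ** 2)

def gamma_truncated(h, phi, sigma2=1.0, n_terms=500):
    """First line of Eq. (6), with the infinite sum cut off after n_terms terms."""
    return sigma2 * sum(phi ** (h + k) * phi ** k for k in range(n_terms))

for h in range(4):
    print(h, gamma_closed_form(h, 0.6), gamma_truncated(h, 0.6))
```

Because $|\phi| < 1$, the discarded tail of the sum is negligible and the two columns should agree to within rounding error.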

AR(1) Autocorrelation

Recall that the autocorrelation $\rho(h)$ for a stationary process is given by

$$\rho(h)\stackrel{\triangle}{=}\frac{\gamma(h)}{\gamma(0)}. \tag{7}$$

Plugging in the result from Eq. (6), we obtain:

$$\begin{split}
\rho(h) &= \frac{\sigma_w^2 \frac{\phi^h}{1-\phi^2}}{\sigma_w^2 \frac{1}{1-\phi^2}}\\
&= \phi^h
\end{split} \tag{8}$$

For $0<\phi<1$, $\rho(h)=\phi^h$ decays exponentially toward zero as $h$ increases. In the event that $\phi<0$, $\rho(h)$ also dies off in magnitude, but it alternates between positive and negative values due to serial negative correlation. These two possibilities are demonstrated in Figure 1.

Figure 1: Theoretical autocorrelation for stationary AR(1) processes with $\phi>0$ and $\phi<0$.
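As a rough empirical companion to the theoretical result $\rho(h)=\phi^h$, one can simulate an AR(1) series and compare its sample autocorrelation with the theory. The sketch below assumes Gaussian noise with unit variance and a zero starting value. The helper names, the seed, and the series length are illustrative, and the sample values will only approximate the theoretical ones.

```python
import random

def simulate_ar1(phi, n, seed=0):
    """Simulate x_t = phi * x_{t-1} + w_t with standard normal noise."""
    rng = random.Random(seed)
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x

def sample_acf(x, h):
    """Sample autocorrelation at lag h >= 0."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ch = sum((x[t + h] - mean) * (x[t] - mean) for t in range(n - h))
    return ch / c0

for phi in (0.6, -0.6):
    x = simulate_ar1(phi, n=20000)
    for h in range(1, 4):
        # sample estimate next to the theoretical value phi**h
        print(phi, h, round(sample_acf(x, h), 3), round(phi ** h, 3))
```

For the negative coefficient, the sample autocorrelation should flip sign from one lag to the next, mirroring the alternating pattern described above.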

AR(1) with Nonzero Mean

Up to this point, we’ve assumed that our AR(1) model has a mean of 0 (or that we subtracted the mean prior to our analysis), in which case $\phi$ exerts a sort of gravitational pull to bring values back to 0 by damping out previous noise. We can extend this to an AR(1) process with a nonzero mean $\mu$ by subtracting the mean from each observation in the AR model itself:

$$\begin{split}
x_t-\mu &= \phi(x_{t-1}-\mu) + w_t\\
x_t &= \phi x_{t-1} -\phi \mu + \mu + w_t\\
&= \phi x_{t-1} + (1-\phi)\mu + w_t\\
&= \alpha + \phi x_{t-1} + w_t,
\end{split} \tag{9}$$

where $\alpha \stackrel{\triangle}{=}\mu(1-\phi)$ functions the same way that an intercept term would in standard linear regression.

AR($p$) Models

It’s straightforward to generalize AR models to higher order AR($p$) models with $p\geq1$ by regressing the time series onto versions of itself of increasing lag. A general AR($p$) model is given as

$$x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \ldots + \phi_p x_{t-p} + w_t, \qquad \phi_p \neq 0. \tag{10}$$

A series with a nonzero mean is handled as

$$\begin{split}
x_t-\mu &= \phi_1 (x_{t-1}-\mu) + \phi_2 (x_{t-2}-\mu) + \ldots + \phi_p (x_{t-p}-\mu) + w_t\\
x_t &= \mu(1-\phi_1-\phi_2-\ldots-\phi_p) + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \ldots + \phi_p x_{t-p} + w_t\\
&= \alpha + \phi_1 x_{t-1} + \phi_2 x_{t-2} + \ldots + \phi_p x_{t-p} + w_t,
\end{split} \tag{11}$$

where as before the intercept is given as $\alpha \stackrel{\triangle}{=}\mu(1-\phi_1-\phi_2-\ldots-\phi_p)$.
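The relationship between the intercept $\alpha$ and the mean $\mu$ can be illustrated by switching the noise off and iterating the recursion: for a stationary model, the iterates should settle at $\mu$. The sketch below rests on that assumption. The helper names, the coefficients, and $\mu = 10$ are illustrative values rather than anything prescribed by the text.

```python
def intercept_from_mean(mu, phis):
    """alpha = mu * (1 - phi_1 - ... - phi_p)."""
    return mu * (1 - sum(phis))

def noise_free_limit(alpha, phis, n_steps=2000):
    """Iterate x_t = alpha + sum_i phi_i * x_{t-i} with w_t = 0, starting from zeros."""
    history = [0.0] * len(phis)  # most recent value first
    for _ in range(n_steps):
        new = alpha + sum(p * v for p, v in zip(phis, history))
        history = [new] + history[:-1]
    return history[0]

phis = [1.25, -0.375]  # an illustrative stationary AR(2)
alpha = intercept_from_mean(mu=10.0, phis=phis)
print(alpha)
print(noise_free_limit(alpha, phis))  # approaches mu when the model is stationary
```

Note that the intercept is generally not equal to the mean. The two coincide only when the $\phi_i$ sum to zero.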

Deriving the theoretical autocovariance and autocorrelation functions of an AR($p$) process is somewhat more involved than for an AR(1) model. We will defer the derivation until after we have covered representing AR models as infinite series of noise terms in section 4.

AR Models in Backshift Notation

Autoregressive Operator

Recall from Eq. (14) in section 4.3 that the backshift operator $\mathbb{B}$ shifts a time series back by one time step, i.e. $\mathbb{B}\,x_t = x_{t-1}$. Using this definition, we may express Eq. (2) as

$$\begin{split}
x_t &= \phi\,\mathbb{B}\,x_{t} + w_t\\
x_t - \phi\,\mathbb{B}\,x_{t} &= w_t\\
(1 - \phi\,\mathbb{B})\,x_{t} &= w_t.
\end{split} \tag{12}$$

We can expand this definition to represent AR($p$) models using backshift notation as

$$(1-\phi_1 \mathbb{B} - \phi_2 \mathbb{B}^2 -\ldots-\phi_p \mathbb{B}^p)x_t = w_t, \tag{13}$$

or in more compact form

$$\phi(\mathbb{B})x_t = w_t, \tag{14}$$

where $\phi(\mathbb{B})$ is the autoregressive operator defined as:

$$\phi(\mathbb{B}) \stackrel{\triangle}{=} 1-\phi_1 \mathbb{B} -\phi_2 \mathbb{B}^2 - \ldots - \phi_p \mathbb{B}^p. \tag{15}$$

For stationary AR models ($|\phi|<1$ for AR(1)), we can define an inverse autoregressive operator $\phi^{-1}(\mathbb{B})$ such that

$$\phi^{-1}(\mathbb{B})\phi(\mathbb{B})x_t = \phi^{-1}(\mathbb{B})w_t \tag{16}$$

or

$$x_t = \phi^{-1}(\mathbb{B})w_t. \tag{17}$$

For an AR(1) model, given that $\phi^{-1}(\mathbb{B}) = \frac{1}{1-\phi \mathbb{B}}$ and $|\phi|<1$, we can represent $\phi^{-1}(\mathbb{B})$ as an infinite geometric series

$$\phi^{-1}(\mathbb{B}) = 1 + \phi \mathbb{B} + \phi^2 \mathbb{B}^2 + \phi^3 \mathbb{B}^3 + \ldots \tag{18}$$
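One way to see why the geometric series acts as an inverse is to treat operators as polynomials in $\mathbb{B}$ and multiply them out. The sketch below assumes a truncation of the series after $n$ terms. The helper `poly_mul` and the values $\phi = 0.5$, $n = 20$ are illustrative.

```python
def poly_mul(a, b):
    """Multiply polynomials in B given as coefficient lists [c0, c1, c2, ...]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

phi, n = 0.5, 20
ar_operator = [1.0, -phi]                             # 1 - phi B
truncated_inverse = [phi ** j for j in range(n + 1)]  # 1 + phi B + ... + phi^n B^n

product = poly_mul(ar_operator, truncated_inverse)
# constant term, largest intermediate coefficient, and the leftover highest-order term
print(product[0], max(abs(c) for c in product[1:-1]), product[-1])
```

The product collapses to $1 - \phi^{n+1}\mathbb{B}^{n+1}$, and the leftover term vanishes as $n$ grows precisely because $|\phi|<1$.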

Causal AR Models

What is the purpose of representing AR processes in the form of Eq. (13) or (14)? To answer this, begin by noting that Eq. (18) implies that $x_t = w_t + \phi w_{t-1} + \phi^2 w_{t-2}+\ldots$, as derived in Eq. (3). This is an example of a causal model, in which a series can be treated as being generated by a convergent infinite series of current and past noise terms. Random walks, on the other hand, are non-causal, as the corresponding sum of noise terms does not converge: its variance increases without bound as we include more and more time steps. Series with $|\phi|>1$ are referred to as explosive, as past noise becomes arbitrarily magnified as $\phi$ is raised to increasing powers, which is exactly what happens in a physical explosion (at least until the fuel is exhausted).

Non-causal processes such as unit root and explosive processes are not stationary, making many of the tools used in time series analysis unsuitable. We can easily determine if an AR(1) process is causal by examining $\phi$, but how might we do the same for higher order AR($p$) models? Imagine if we could factor an AR model such that

$$(1-\phi_1 \mathbb{B} - \phi_2 \mathbb{B}^2 -\ldots-\phi_p \mathbb{B}^p)x_t=(1-\varphi_1^{\prime} \mathbb{B})(1-\varphi_2^{\prime} \mathbb{B})\ldots(1-\varphi_p^{\prime} \mathbb{B})x_t. \tag{19}$$

Given such a factoring, we could quickly determine if our AR($p$) model was causal (and hence stationary) simply by confirming that all $\varphi^{\prime}$'s have an absolute value less than 1. While there is no guarantee this will be possible for real $\varphi^{\prime}$'s, the fundamental theorem of algebra does ensure this is possible for complex values[2].

Let’s explore this with a concrete example. Consider the second order autoregressive, or AR(2), model defined as

$$x_t = 1.25x_{t-1} - 0.375 x_{t-2} + w_t. \tag{20}$$

Looking at Eq. (20), it’s not immediately obvious whether its $\varphi^{\prime}$'s have absolute values less than one. Let’s recast it using the autoregressive operator $\phi(\mathbb{B})$:

$$\begin{split}
x_t &= 1.25\,x_{t-1} - 0.375\, x_{t-2} + w_t\\
x_t - 1.25\,x_{t-1} + 0.375\, x_{t-2} &= w_t\\
(1-1.25\,\mathbb{B} + 0.375\,\mathbb{B}^2)x_t &= w_t.
\end{split} \tag{21}$$

Evidently, $\phi(\mathbb{B})=1-1.25\,\mathbb{B} + 0.375\,\mathbb{B}^2=1-\frac{5}{4}\,\mathbb{B} + \frac{3}{8}\,\mathbb{B}^2$. Now comes the crucial step: we replace the backshift operator with a variable, call it $z$, and solve for the zeros of the polynomial $\phi(z)$.

$$\begin{split}
1-\frac{5}{4}\,z + \frac{3}{8}\,z^2 &= 0\\
\Big(1-\frac{1}{2}\,z\Big)\Big(1-\frac{3}{4}\,z\Big) &= 0\\
z = 2 \quad &\text{or} \quad z=\frac{4}{3}.
\end{split} \tag{22}$$

Eq. (20) is stationary because the $\varphi^{\prime}$'s of $\frac{1}{2}$ and $\frac{3}{4}$ have absolute values less than one (lie inside the unit circle[3]). Equivalently, the roots of $\phi(z)$ (and consequently of $\phi(\mathbb{B})$), namely 2 and $\frac{4}{3}$, have absolute values greater than one (lie outside the unit circle).
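The arithmetic in this example can be verified exactly with rational numbers. The following sketch uses Python's `fractions` module. The helper `phi_poly` and the variable names are illustrative.

```python
from fractions import Fraction

# x_t = 1.25 x_{t-1} - 0.375 x_{t-2} + w_t
phi_1, phi_2 = Fraction(5, 4), Fraction(-3, 8)

def phi_poly(z):
    """phi(z) = 1 - phi_1 z - phi_2 z^2."""
    return 1 - phi_1 * z - phi_2 * z ** 2

roots = [Fraction(2), Fraction(4, 3)]
inverse_roots = [1 / z for z in roots]  # the varphi' values, 1/2 and 3/4

print([phi_poly(z) for z in roots])            # both evaluate to exactly 0
print([abs(v) < 1 for v in inverse_roots])     # inside the unit circle

# (1 - a z)(1 - b z) = 1 - (a + b) z + a b z^2, so a + b = phi_1 and a b = -phi_2
a, b = inverse_roots
print(a + b == phi_1, a * b == -phi_2)
```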

Sign of $\phi$ and Complex Roots

As seen above, the signal characteristic of stationary AR models is the presence of exponentially decaying autocovariance (and consequently autocorrelation) functions. For an AR($p$) process with $p\geq2$, the decay may also exhibit sinusoidal oscillations, potentially with a period greater than $p$.

Figure 2: Theoretical autocorrelation for stationary AR(2) processes with complex roots.

By the quadratic formula, an AR(2) process with associated polynomial $\phi(z)=1-\phi_1 z - \phi_2 z^2$ will be stationary only if

$$\Bigg|\frac{\phi_1\pm\sqrt{\phi_1^2+4\phi_2}}{-2\phi_2}\Bigg| > 1 \tag{23}$$

Eq. (23) can be broken into three distinct requirements:

  1. $\phi_1 + \phi_2 < 1$

  2. $\phi_2 - \phi_1 < 1$

  3. $\phi_2 > -1$

Some sources state the third requirement as $|\phi_2|<1$, though this is not strictly necessary as combining the first two requirements above already enforces $\phi_2<1$.

From Eq. (23) we see that the roots of an AR(2) model will be complex if $\phi_1^2+4\phi_2<0$, i.e. if $\phi_2<-\frac{\phi_1^2}{4}$. In Sec. 4 of this chapter we will see that AR processes with complex roots have a special property of exhibiting a “pseudo-seasonality.” These conditions are depicted in Figure 3.

Figure 3: Values of $\phi_1$ and $\phi_2$ for AR(2) process demonstrating the boundary conditions for stationarity and real/complex roots.
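The three requirements and the complex-root condition can be cross-checked against a direct evaluation of Eq. (23). The sketch below is one possible way to do so. The function names and the test points are illustrative, and `ar2_roots` assumes $\phi_2 \neq 0$.

```python
import cmath

def ar2_is_stationary(phi_1, phi_2):
    """The three requirements derived from Eq. (23)."""
    return (phi_1 + phi_2 < 1) and (phi_2 - phi_1 < 1) and (phi_2 > -1)

def ar2_has_complex_roots(phi_1, phi_2):
    """Roots are complex when the discriminant phi_1^2 + 4 phi_2 is negative."""
    return phi_1 ** 2 + 4 * phi_2 < 0

def ar2_roots(phi_1, phi_2):
    """Roots of 1 - phi_1 z - phi_2 z^2 via the quadratic formula (needs phi_2 != 0)."""
    disc = cmath.sqrt(phi_1 ** 2 + 4 * phi_2)
    return [(phi_1 + disc) / (-2 * phi_2), (phi_1 - disc) / (-2 * phi_2)]

cases = [(1.25, -0.375), (1.0, -0.5), (0.5, 0.6), (0.0, -1.2)]
for phi_1, phi_2 in cases:
    outside = all(abs(z) > 1 for z in ar2_roots(phi_1, phi_2))
    print(phi_1, phi_2, ar2_is_stationary(phi_1, phi_2), outside,
          ar2_has_complex_roots(phi_1, phi_2))
```

For each test point, the inequality-based check and the root-based check should give the same answer.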

Higher Order Causal Models

We could expand the process above for finding the roots of AR(2) models to higher order AR($p$) models, but in practice there’s no need to do this by hand. statsmodels.tsa.arima_process.ArmaProcess provides theoretical properties of AR (and more broadly ARMA) models for us. The following code demonstrates using this module both to determine stationarity writ large and to extract the roots of an AR model. Note that the sign convention follows $\phi(\mathbb{B})$ from Eq. (21), not Eq. (20).

from statsmodels.tsa.arima_process import ArmaProcess

# Coefficients of phi(B) = 1 - 1.25 B + 0.375 B^2, starting with the lag-0 term
ar2 = ArmaProcess(
    ar=[1, -1.25, 0.375],  # use phi values defined via phi(B)
    ma=None,               # pure AR process, no MA component
)
print(f"AR(2) model is stationary: {ar2.isstationary}")
print(f"Roots of AR(2) model: {ar2.arroots}")
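For readers who want to see what such a check involves, the same idea can be sketched with NumPy alone by finding the zeros of $\phi(z)$ directly. This is an illustrative stand-in, not a description of how statsmodels implements it. The helper names `ar_roots` and `is_causal` are hypothetical, and they take the coefficients $\phi_1,\ldots,\phi_p$ in the sign convention of Eq. (20).

```python
import numpy as np

def ar_roots(phis):
    """Roots of phi(z) = 1 - phi_1 z - ... - phi_p z^p."""
    # np.roots expects coefficients ordered from the highest power down to the constant.
    coeffs = [-phi for phi in reversed(phis)] + [1.0]
    return np.roots(coeffs)

def is_causal(phis):
    """True when every root of phi(z) lies strictly outside the unit circle."""
    return bool(np.all(np.abs(ar_roots(phis)) > 1))

print(sorted(ar_roots([1.25, -0.375]).real))  # roots of the AR(2) example
print(is_causal([1.25, -0.375]))
print(is_causal([0.5, 0.3, 0.3]))             # an AR(3) whose coefficients sum past 1
```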

Where do AR Processes Arise?

In economics and related disciplines, AR models are often referred to as “long-memory” models. This makes sense, as exponentially decaying autocorrelation means that a noise term, or “shock,” will take many steps to be “forgotten” (i.e. fade to statistical insignificance). Such a model is appropriate for a wide range of scenarios, ranging from climate science to economic inflation and stock market returns. In the following problem, we will explore using AR processes to get a baseline approximation to solar activity.

Footnotes
  1. Yule’s original paper (Yule, 1927) introducing autoregressive models motivated the concept by drawing an analogy to a randomly perturbed oscillatory system, in effect discretizing the differential equations governing damped harmonic oscillators (though Yule himself did not directly use differential equation terminology). Modern students coming from a data science or statistics background are generally more comfortable interpreting AR models as a form of linear regression in which the features are lagged versions of the time series itself.

  2. The fundamental theorem of algebra states that any polynomial of degree $p$ has exactly $p$ roots (provided we allow for complex roots and count repeated roots with their multiplicity). The theorem allows for a single root to appear multiple times, such as in the case of $1+2x+x^2=(1+x)^2$, which has the root $x=-1$ with multiplicity 2.

  3. The unit circle is simply the set of all complex numbers such that $|a+bi|=1$, or equivalently all complex numbers of the form $e^{i\theta},\, \theta\in[0,2\pi)$.

  4. The actual requirement is derived by using the reciprocal definition of the roots $z$ from the definition we used. Our definition using $\varphi^{\prime}$ expresses the same idea from a different angle.

References
  1. Granger, C. W. J., & Newbold, P. (1974). Spurious regressions in econometrics. Journal of Econometrics, 2(2), 111–120. https://doi.org/10.1016/0304-4076(74)90034-7
  2. Shumway, R. H., & Stoffer, D. S. (2025). Time Series Analysis and Its Applications. In Springer Texts in Statistics. Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-70584-7
  3. Yule, G. U. (1927). VII. On a method of investigating periodicities in disturbed series, with special reference to Wolfer’s sunspot numbers. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 226(636–646), 267–298.