We begin our journey into ARMA models by discussing autoregressive, or AR, models (the “AR” in ARMA). As the name implies, AR models can be thought of as the application of linear regression to time series: a series is regressed onto lagged versions of itself[1].
A random walk is not stationary due to its non-constant variance, making AR models not applicable. However, we could imagine a slightly different time series defined as
Eq. (2) is a first order autoregressive process, denoted as AR(1). We will demonstrate that Eq. (2) describes a stationary process by iterating backwards and examining the properties of the resulting iterated series:
In coming sections, we will learn that Eq. (3) is the infinite moving average, or MA(∞) representation of an AR process. For our purposes, we can think of it as a way to shed light on the behavior of an AR process.
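The equivalence between the recursive and iterated forms can be checked numerically. The following is a minimal sketch, assuming Eq. (2) is the recursion x_t = ϕ·x_{t−1} + w_t and that the series is started from x_0 = w_0; the value of `phi`, the seed, and the series length are illustrative choices, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
phi = 0.7                 # assumed AR(1) coefficient with |phi| < 1
n = 500                   # number of time steps to simulate
w = rng.normal(size=n)    # white noise terms w_t

# Recursive form: x_t = phi * x_{t-1} + w_t, started from x_0 = w_0
x = np.zeros(n)
x[0] = w[0]
for t in range(1, n):
    x[t] = phi * x[t - 1] + w[t]

# Iterated form at the final time step: x_t = sum_{j=0}^{t} phi^j * w_{t-j}
t = n - 1
weights = phi ** np.arange(t + 1)        # phi^0, phi^1, ..., phi^t
x_iterated = np.sum(weights * w[t::-1])  # w_t, w_{t-1}, ..., w_0

print(x[t], x_iterated)
```

The two printed values agree up to floating point error, since the iterated sum is just the recursion unrolled back to the first noise term.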
The autocovariance can be derived analogously. For any stationary AR(1) process with zero mean, $\gamma(h) = \mathrm{Cov}(x_{t+h}, x_t)$. Should the mean not be zero, replace $x_t$ with $x_t - \mu_x$. We can then derive $\gamma(h)$ as
Noting that the covariance of the $w_t$'s is $\mathrm{Cov}(w_i, w_j) = \delta_{ij}\sigma_w^2$, we can line up all non-zero contributions stemming from cross-terms where $w_{t+h-j}$ in the left series matches $w_{t-k}$ in the right series. This requires $h - j = -k$, i.e. $j = h + k$, giving us
where we have cast the autocovariance as an infinite geometric series. Provided that $|\phi| < 1$ and $\sigma_w^2$ is finite, the series converges and Eq. (6) is finite for all values of $h$. Finally, the last line of Eq. (6) depends solely on the separation $h$, completing the proof that Eq. (2) represents a stationary process.
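As a rough numerical check, summing the geometric series gives the closed form γ(h) = σ_w²·ϕ^h / (1 − ϕ²) for h ≥ 0. The sketch below compares it against sample autocovariances from a simulated series. The helper names `ar1_autocov` and `sample_autocov`, the parameter values, and the burn-in length are all illustrative assumptions.

```python
import numpy as np

def ar1_autocov(phi, sigma_w, h):
    """Closed-form autocovariance of a zero-mean stationary AR(1) at lag h."""
    return sigma_w**2 * phi**abs(h) / (1.0 - phi**2)

def sample_autocov(x, h):
    """Sample autocovariance of x at non-negative lag h."""
    xc = x - x.mean()
    return np.mean(xc[h:] * xc[:len(xc) - h])

rng = np.random.default_rng(1)
phi, sigma_w, n = 0.6, 1.0, 100_000   # assumed values for illustration
w = rng.normal(scale=sigma_w, size=n)

x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + w[t]
x = x[1000:]                          # discard burn-in from the x_0 = 0 start

for h in range(4):
    print(h, ar1_autocov(phi, sigma_w, h), sample_autocov(x, h))
```

The sample values only approximate the theoretical ones; the agreement tightens as the simulated series gets longer.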
In the event that ϕ<0, ρ(h) will also die off, but in an oscillating fashion, alternating between positive and negative values due to negative serial correlation. These two possibilities are demonstrated in Figure 1.
Figure 1: Theoretical autocorrelation for stationary AR(1) processes with ϕ>0 and ϕ<0.
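The two behaviors can be reproduced with a few lines of code. This sketch assumes the standard AR(1) result ρ(h) = ϕ^h for h ≥ 0; the function name `ar1_acf` and the values ±0.8 are illustrative.

```python
import numpy as np

def ar1_acf(phi, max_lag):
    """Theoretical AR(1) autocorrelation rho(h) = phi^h for h = 0..max_lag."""
    lags = np.arange(max_lag + 1)
    return phi ** lags

print(ar1_acf(0.8, 5))    # smooth exponential decay
print(ar1_acf(-0.8, 5))   # same magnitudes, alternating sign
```

Both sequences shrink in magnitude at the same rate; only the sign pattern differs.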
Up to this point, we’ve assumed that our AR(1) model has a mean of 0 (or that we subtracted the mean prior to our analysis), in which case ϕ exerts a sort of gravitational pull to bring values back to 0 by damping out previous noise. We can extend this to an AR(1) process with a nonzero mean μ by subtracting the mean from each observation in the AR model itself
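Written out, the mean-adjusted model takes the following form. This is a reconstruction that assumes the zero-mean AR(1) recursion from Eq. (2); the intercept symbol $\alpha$ follows the definition used for the AR(p) case.

$$
x_t - \mu = \phi\,(x_{t-1} - \mu) + w_t
\quad\Longleftrightarrow\quad
x_t = \alpha + \phi\,x_{t-1} + w_t,
\qquad \alpha = \mu\,(1 - \phi).
$$

Note that the intercept $\alpha$ is not the mean itself: the two coincide only when $\phi = 0$.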
It’s straightforward to generalize AR models to higher order AR(p) models with p≥1 by regressing the time series onto versions of itself of increasing lag. A general AR(p) model is given as
where, as before, the intercept is given as $\alpha \triangleq \mu(1 - \phi_1 - \phi_2 - \dots - \phi_p)$.
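The relationship between the intercept and the mean can be checked by simulation. The sketch below assumes the AR(p) model has the form x_t = α + ϕ_1·x_{t−1} + … + ϕ_p·x_{t−p} + w_t; the coefficients, mean, and series length are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
mu = 5.0                          # assumed process mean
phis = np.array([0.5, 0.3])       # assumed stationary AR(2) coefficients
alpha = mu * (1.0 - phis.sum())   # intercept implied by the mean

n, p = 50_000, len(phis)
w = rng.normal(size=n)
x = np.full(n, mu)                # start the series at its mean
for t in range(p, n):
    # x[t - p:t][::-1] is (x_{t-1}, ..., x_{t-p}), matching (phi_1, ..., phi_p)
    x[t] = alpha + phis @ x[t - p:t][::-1] + w[t]

print(f"intercept alpha = {alpha:.3f}, sample mean = {x.mean():.3f}")
```

The sample mean settles near `mu` rather than near `alpha`, which is why the intercept should not be read as the level of the series.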
Deriving the theoretical autocovariance and autocorrelation functions of an AR(p) process is somewhat more involved than for an AR(1) model. We will defer the derivation until after we have covered representing AR models as infinite series of noise terms in section 4.
Recall from Eq. (14) in section 4.3 that the backshift operator B shifts a time series back by one time step. Using this definition, we may express Eq. (2) as
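For reference, the substitution can be written out as follows. This assumes Eq. (2) is the zero-mean AR(1) recursion and uses $B x_t = x_{t-1}$.

$$
x_t = \phi\,x_{t-1} + w_t = \phi\,B\,x_t + w_t
\quad\Longrightarrow\quad
(1 - \phi B)\,x_t = w_t .
$$

Collecting every term involving $x_t$ on the left leaves a polynomial in $B$ acting on the series, with only the noise term on the right.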
What is the purpose of representing AR processes in the form of Eq. (13) or (14)? To answer this, begin by noting that Eq. (18) implies that $x_t = w_t + \phi w_{t-1} + \phi^2 w_{t-2} + \dots$, as derived in Eq. (3). This is an example of a causal model, in which a series can be treated as being generated by a convergent infinite series of current and past noise terms. Random walks, on the other hand, are non-causal: with $\phi = 1$ past noise is never damped, so the variance increases without bound as we include more and more time steps. Series with $|\phi| > 1$ are referred to as explosive, as past noise becomes arbitrarily magnified as $\phi$ is raised to increasing powers, which is exactly what happens in a physical explosion (at least until the fuel is exhausted).
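One way to see the difference between the three regimes is to add up the squared noise weights. For a series built from `n_terms` noise terms, the sum of ϕ^(2j) equals Var(x_t)/σ_w². The sketch below is illustrative only: the function name `noise_weight_energy` and the chosen values of ϕ are assumptions.

```python
import numpy as np

def noise_weight_energy(phi, n_terms):
    """Sum of phi^(2j) for j = 0..n_terms-1, i.e. Var(x_t) / sigma_w^2."""
    j = np.arange(n_terms)
    return np.sum(phi ** (2 * j))

for phi in (0.5, 1.0, 1.5):   # causal, random walk, explosive
    sums = [noise_weight_energy(phi, n) for n in (10, 50, 200)]
    print(phi, sums)
```

For ϕ = 0.5 the sum converges to 1/(1 − ϕ²) = 4/3, for ϕ = 1 it grows linearly with the number of terms, and for ϕ = 1.5 it grows geometrically.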
Non-causal processes such as unit root and explosive processes are not stationary, making many of the tools used in time series analysis unsuitable. We can easily determine if an AR(1) process is causal by examining ϕ, but how might we do the same for higher order AR(p) models? Imagine if we could factor an AR model such that
Given such a factoring, we could quickly determine if our AR(p) model was causal—and hence stationary—simply by confirming that all φ′'s have an absolute value less than 1. While there is no guarantee this will be possible for real φ′'s, the fundamental theorem of algebra does ensure this is possible for complex values[2].
Let’s explore this with a concrete example. Consider the second order autoregressive, or AR(2), model defined as
Evidently, $\phi(B) = 1 - 1.25B + 0.375B^2 = 1 - \tfrac{5}{4}B + \tfrac{3}{8}B^2$. Now comes the crucial step: we replace the backshift operator with a variable, call it $z$, and solve for the zeros of the polynomial $\phi(z)$.
Eq. (20) is stationary because the φ′'s of $\tfrac{1}{2}$ and $\tfrac{3}{4}$ have absolute values less than one (lie inside the unit circle[3]). Equivalently, the roots of $\phi(z)$ (and consequently of $\phi(B)$) of $2$ and $\tfrac{4}{3}$ have absolute values greater than one (lie outside the unit circle).
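The factoring can be verified numerically. This is a minimal sketch using `numpy.roots`, which expects coefficients ordered from the highest power down; the variable names are illustrative.

```python
import numpy as np

# phi(z) = 1 - 1.25 z + 0.375 z^2, coefficients in ascending powers of z
ar_poly = [1.0, -1.25, 0.375]

roots = np.roots(ar_poly[::-1])   # reversed: np.roots wants highest power first
inverse_roots = 1.0 / roots       # the phi' values in (1 - phi'_1 z)(1 - phi'_2 z)
is_causal = bool(np.all(np.abs(roots) > 1.0))

print("roots of phi(z):", np.sort(roots))
print("phi' values:    ", np.sort(inverse_roots))
print("causal:         ", is_causal)
```

The roots come out as 4/3 and 2, and their reciprocals as 1/2 and 3/4, matching the factoring $(1 - \tfrac{1}{2}z)(1 - \tfrac{3}{4}z)$.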
As seen above, the signal characteristic of stationary AR models is the presence of exponentially decaying autocovariance (and consequently autocorrelation) functions. For an AR(p) process with p≥2, the decay may also exhibit sinusoidal oscillations, potentially with a period greater than p.
Figure 2: Theoretical autocorrelation for stationary AR(2) processes with complex roots.
By the quadratic formula, an AR(2) process with associated polynomial ϕ(z)=1−ϕ1z−ϕ2z2 will be stationary only if
Eq. (23) can be broken into three distinct requirements:
ϕ1+ϕ2<1
ϕ2−ϕ1<1
ϕ2>−1
Some sources state the third requirement as ∣ϕ2∣<1, though this is not strictly necessary as combining the first two requirements above already enforces ϕ2<1.
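The three requirements translate directly into code. The sketch below also cross-checks them against the roots of ϕ(z) = 1 − ϕ_1·z − ϕ_2·z²; both function names are hypothetical helpers introduced for illustration.

```python
import numpy as np

def ar2_is_stationary(phi1, phi2):
    """Check the three AR(2) stationarity requirements on the coefficients."""
    return bool((phi1 + phi2 < 1) and (phi2 - phi1 < 1) and (phi2 > -1))

def ar2_is_stationary_by_roots(phi1, phi2):
    """Check that all roots of 1 - phi1*z - phi2*z^2 lie outside the unit circle."""
    roots = np.roots([-phi2, -phi1, 1.0])   # highest power first
    return bool(np.all(np.abs(roots) > 1.0))

for phi1, phi2 in [(1.25, -0.375), (0.5, 0.6), (0.0, -1.2)]:
    print(phi1, phi2,
          ar2_is_stationary(phi1, phi2),
          ar2_is_stationary_by_roots(phi1, phi2))
```

The first pair is the AR(2) example with ϕ(z) = 1 − 1.25z + 0.375z², which passes both checks. The other two pairs violate the first and third requirements respectively.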
From Eq. (23) we see that the roots of an AR(2) model will be complex if $\phi_1^2 + 4\phi_2 < 0$, i.e. if $\phi_2 < -\phi_1^2/4$. In Sec. 4 of this chapter we will see that AR processes with complex roots have a special property of exhibiting a “pseudo-seasonality.” These conditions are depicted in Figure 3.
Figure 3: Values of ϕ1 and ϕ2 for AR(2) process demonstrating the boundary conditions for stationarity and real/complex roots.
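The real/complex boundary can be checked with the discriminant of ϕ(z) = 1 − ϕ_1·z − ϕ_2·z². In this sketch the function name `ar2_root_type` and the second example pair are illustrative assumptions.

```python
import numpy as np

def ar2_root_type(phi1, phi2):
    """Classify the roots of 1 - phi1*z - phi2*z^2 via the discriminant."""
    discriminant = phi1**2 + 4.0 * phi2
    return "complex" if discriminant < 0 else "real"

for phi1, phi2 in [(1.25, -0.375), (1.0, -0.5)]:
    roots = np.roots([-phi2, -phi1, 1.0])   # highest power first
    print(phi1, phi2, ar2_root_type(phi1, phi2), roots)
```

The first pair has a positive discriminant and real roots. The second has a negative discriminant and the complex-conjugate roots 1 ± i, which lie outside the unit circle, so that process is stationary with complex roots.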
We could extend the process above for finding the roots of AR(2) models to higher order AR(p) models, but in practice there’s no need to do this by hand. statsmodels.tsa.arima_process.ArmaProcess provides theoretical properties of AR (and more broadly ARMA) models for us. The following code demonstrates using this class both to determine stationarity writ large and to extract the roots of an AR model. Note that the sign convention follows ϕ(B) from Eq. (21), not Eq. (20).
from statsmodels.tsa.arima_process import ArmaProcess

# AR lag polynomial 1 - 1.25 B + 0.375 B^2, coefficients listed from lag 0 upward
ar2 = ArmaProcess(
    ar=[1, -1.25, 0.375],  # use phi values defined via phi(B)
    ma=None,               # pure AR process, no MA component
)
print(f"AR(2) model is stationary: {ar2.isstationary}")
print(f"Roots of AR(2) model: {ar2.arroots}")
In economics and related disciplines, AR models are often referred to as “long-memory” models. This makes sense, as exponentially decaying autocorrelation means that a noise term, or “shock,” will take many steps to be “forgotten” (i.e. fade to statistical insignificance). Such a model is appropriate for a wide range of scenarios, ranging from climate science to economic inflation and stock market returns. In the following problem, we will explore using AR processes to get a baseline approximation to solar activity.
Yule’s original 1927 paper (Yule (1927)) introducing autoregressive models motivated the concept by drawing an analogy to a randomly perturbed oscillatory system, in effect discretizing the differential equations governing damped harmonic oscillators (though Yule himself did not directly use differential equation terminology). Modern students coming from a data science or statistics background are generally more comfortable interpreting AR models as a form of linear regression in which the features are lagged versions of the time series itself.
The fundamental theorem of algebra states that any polynomial of degree $p$ has exactly $p$ roots (provided we allow for complex roots). The theorem allows for a single root to appear multiple times, such as in the case of $1 + 2x + x^2 = (1 + x)^2$, which has the root $x = -1$ with multiplicity 2.
The requirement is conventionally stated in terms of the roots z of ϕ(z), which are the reciprocals of the φ′ in our factoring: those roots must lie outside the unit circle. Our definition using φ′ expresses the same idea from a different angle.
Shumway, R. H., & Stoffer, D. S. (2025). Time Series Analysis and Its Applications. In Springer Texts in Statistics. Springer Nature Switzerland. 10.1007/978-3-031-70584-7
Yule, G. U. (1927). VII. On a method of investigating periodicities in disturbed series, with special reference to Wolfer’s sunspot numbers. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 226(636–646), 267–298.