A process X(t) is strictly stationary if its finite-dimensional distributions are invariant under time shifts: (X(t₁+h), ..., X(tₙ+h)) has the same joint distribution as (X(t₁), ..., X(tₙ)) for every n, all times t₁, ..., tₙ, and all shifts h. Wide-sense (weak) stationarity requires only finite second moments, a constant mean, and a covariance function R(τ) = Cov(X(t), X(t+τ)) that depends only on the lag τ. The Wiener-Khinchin theorem connects stationary processes to their power spectral density via the Fourier transform of the autocovariance.
Stationarity captures the idea that a process's statistical character does not change over time. The strong form, strict stationarity, requires that time-shifting the entire process leaves all finite-dimensional distributions unchanged. The weaker but more practical form, wide-sense (or second-order) stationarity, requires only that the mean E[X(t)] = μ is constant and that the autocovariance Cov(X(t), X(t+τ)) = R(τ) depends only on the time lag τ, not on the absolute time t. A strictly stationary process with finite second moments is always wide-sense stationary; the converse fails in general. For Gaussian processes the two notions coincide, because the finite-dimensional distributions of a Gaussian process are determined entirely by its mean and covariance functions.
The autocovariance function R(τ) encodes the memory structure of a stationary process. It must be even (R(-τ) = R(τ)), positive semi-definite, and bounded in absolute value by its value at zero: |R(τ)| ≤ R(0) = Var(X(t)). The rate at which R(τ) decays determines how quickly the process "forgets" its past. Exponential decay R(τ) = σ²e^{-α|τ|} (the OU process) indicates a specific memory timescale 1/α, while power-law decay R(τ) ~ |τ|^{-β} with 0 < β < 1, slow enough that R is not integrable, indicates long-range dependence with no characteristic timescale. Brownian motion W(t) is not stationary (its variance grows linearly in t), but its increment process X(t) = W(t+1) - W(t) is stationary, with R(τ) = max(0, 1 - |τ|), which vanishes for |τ| ≥ 1.
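The Brownian-increment claim can be checked by simulation. The following is a minimal sketch, assuming NumPy; the function names, the grid step `dt`, the base time `t0`, and the path count are illustrative choices, not part of the text above. It simulates Brownian paths on a grid (where the Gaussian increments are exact), forms X(t) = W(t+1) - W(t), and compares the Monte Carlo estimate of Cov(X(t0), X(t0+τ)) with max(0, 1 - |τ|).

```python
import numpy as np


def increment_autocov_theory(tau):
    """Autocovariance of X(t) = W(t+1) - W(t): max(0, 1 - |tau|)."""
    return np.maximum(0.0, 1.0 - np.abs(tau))


def increment_autocov_mc(lags, t0=2.0, dt=0.05, n_paths=20000, seed=0):
    """Monte Carlo estimate of Cov(X(t0), X(t0 + tau)) for each tau in lags.

    t0, dt, n_paths and seed are illustrative defaults (assumptions).
    """
    rng = np.random.default_rng(seed)
    lags = np.asarray(lags, dtype=float)
    t_end = t0 + lags.max() + 1.0
    n_steps = int(round(t_end / dt))

    # Brownian paths on the grid 0, dt, 2*dt, ...; W(0) = 0.
    steps = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    w = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(steps, axis=1)], axis=1)

    def x(t):
        # Unit-length increment X(t) = W(t+1) - W(t), read off the grid.
        return w[:, int(round((t + 1.0) / dt))] - w[:, int(round(t / dt))]

    x0 = x(t0)
    # X has mean zero, so the mean of the product estimates the covariance.
    return np.array([np.mean(x0 * x(t0 + tau)) for tau in lags])


if __name__ == "__main__":
    lags = np.array([0.0, 0.5, 1.0, 1.5])
    print("theory:  ", increment_autocov_theory(lags))
    print("t0 = 2.0:", increment_autocov_mc(lags, t0=2.0))
    print("t0 = 0.0:", increment_autocov_mc(lags, t0=0.0, seed=1))
```

Changing `t0` should leave the estimates unchanged up to sampling noise, which is what dependence on the lag alone means in practice.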
The spectral representation connects the time domain to the frequency domain. The Wiener-Khinchin theorem states that, when R is integrable, the power spectral density S(ω) = ∫R(τ)e^{-iωτ}dτ is the Fourier transform of the autocovariance, and conversely R(τ) = (1/2π)∫S(ω)e^{iωτ}dω, with both integrals taken over the whole real line. The spectral density S(ω) ≥ 0 describes how the process's variance is distributed across frequencies; setting τ = 0 gives Var(X(t)) = R(0) = (1/2π)∫S(ω)dω. White noise has flat S(ω) = σ² (equal power at all frequencies); the OU process has Lorentzian S(ω) = 2ασ²/(α² + ω²) (low-pass filtered); a periodic process has no density in the ordinary sense, since its spectral mass is concentrated in discrete lines at its fundamental frequency and harmonics.
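The OU transform pair can be verified numerically. This is a rough sketch, assuming NumPy; the helper names, the truncation limits `tau_max` and `omega_max`, and the grid sizes are illustrative assumptions. It integrates R(τ)e^{-iωτ} with the trapezoid rule (only the cosine part survives because R is even) and compares the result with the Lorentzian, then recovers the variance from the area under S(ω).

```python
import numpy as np


def ou_autocov(tau, sigma2=1.0, alpha=2.0):
    """R(tau) = sigma^2 * exp(-alpha * |tau|)."""
    return sigma2 * np.exp(-alpha * np.abs(tau))


def ou_spectrum(omega, sigma2=1.0, alpha=2.0):
    """Lorentzian S(omega) = 2 * alpha * sigma^2 / (alpha^2 + omega^2)."""
    return 2.0 * alpha * sigma2 / (alpha**2 + omega**2)


def _trapezoid(y, dx):
    """Composite trapezoid rule on a uniform grid."""
    return dx * (np.sum(y) - 0.5 * (y[0] + y[-1]))


def spectrum_from_autocov(omega, sigma2=1.0, alpha=2.0, tau_max=20.0, n=200001):
    """Approximate S(omega) as the integral of R(tau) * cos(omega * tau).

    R is even, so the imaginary (sine) part of the transform vanishes.
    The integral is truncated at +/- tau_max (an assumption; R is ~0 there).
    """
    tau = np.linspace(-tau_max, tau_max, n)
    integrand = ou_autocov(tau, sigma2, alpha) * np.cos(omega * tau)
    return _trapezoid(integrand, tau[1] - tau[0])


def variance_from_spectrum(sigma2=1.0, alpha=2.0, omega_max=2000.0, n=400001):
    """Approximate R(0) = (1 / 2 pi) * integral of S(omega).

    The Lorentzian tails decay like 1/omega^2, so truncating at omega_max
    leaves a small negative bias of order alpha / omega_max.
    """
    omega = np.linspace(-omega_max, omega_max, n)
    s = ou_spectrum(omega, sigma2, alpha)
    return _trapezoid(s, omega[1] - omega[0]) / (2.0 * np.pi)


if __name__ == "__main__":
    for w in (0.0, 1.0, 5.0):
        print(w, spectrum_from_autocov(w), ou_spectrum(w))
    print("variance from spectrum:", variance_from_spectrum())
```

The slow 1/ω² decay of the Lorentzian is why the inverse direction needs a much wider integration window than the forward one.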
Stationarity is both a modeling assumption and a mathematical prerequisite. In time series analysis and signal processing, stationarity is typically assumed so that the autocovariance and spectrum are well-defined and estimable from data. In stochastic process theory, stationarity is a property that some diffusions reach in the long run: the Ornstein-Uhlenbeck process converges to its stationary distribution regardless of initial conditions, whereas Brownian motion never settles into one. Understanding stationarity is essential for ergodic theory (time averages of stationary ergodic processes converge to ensemble averages) and for the spectral theory of stochastic processes.
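Both the convergence of the OU process and the agreement of time and ensemble averages can be illustrated numerically. The sketch below assumes NumPy and the parameterization dX = -αX dt + √(2ασ²) dW, chosen so that the stationary variance is σ², matching R(0) = σ² above; this SDE form, the function names, and all default values are assumptions made for illustration. It uses the exact one-step transition of the OU process, so the time step introduces no discretization bias.

```python
import numpy as np


def ou_ensemble(x0, alpha=1.0, sigma2=1.0, dt=0.1, n_steps=100,
                n_paths=20000, seed=0):
    """Values at time n_steps * dt of many independent OU paths started at x0."""
    rng = np.random.default_rng(seed)
    decay = np.exp(-alpha * dt)                       # exact one-step decay
    noise_sd = np.sqrt(sigma2 * (1.0 - decay**2))     # exact one-step noise scale
    x = np.full(n_paths, float(x0))
    for _ in range(n_steps):
        x = decay * x + noise_sd * rng.standard_normal(n_paths)
    return x


def ou_path(x0, alpha=1.0, sigma2=1.0, dt=0.1, n_steps=200000, seed=1):
    """One long OU path, used to form time averages."""
    rng = np.random.default_rng(seed)
    decay = np.exp(-alpha * dt)
    noise = np.sqrt(sigma2 * (1.0 - decay**2)) * rng.standard_normal(n_steps)
    path = np.empty(n_steps + 1)
    path[0] = x0
    for k in range(n_steps):
        path[k + 1] = decay * path[k] + noise[k]
    return path


if __name__ == "__main__":
    # Ensemble view: two very different starting points, same long-run law.
    for start in (5.0, -5.0):
        x = ou_ensemble(start)
        print(f"x0 = {start:+.0f}: ensemble mean {x.mean():.3f}, var {x.var():.3f}")

    # Ergodic view: time averages along a single path.
    p = ou_path(5.0)
    print(f"time-average mean {p.mean():.3f}, second moment {(p**2).mean():.3f}")
```

With these defaults the ensemble is observed at time 10, about ten relaxation times 1/α, so the influence of the starting point has decayed by a factor of roughly e^{-10}.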