A stochastic process is ergodic if time averages converge to ensemble averages: (1/T)∫₀ᵀ f(X(t))dt → E_π[f(X)] almost surely as T → ∞, where π is the stationary distribution. Ergodicity means a single long trajectory explores the entire state space representatively — you don't need many independent samples, just one long run. For diffusions, ergodicity follows from the existence of a unique stationary distribution and appropriate recurrence conditions.
Ergodic theory connects the time behavior of a single trajectory to the statistical properties of the process's stationary distribution. The fundamental ergodic theorem for stochastic processes states: if X(t) is a stationary ergodic process with stationary distribution π, then (1/T)∫₀ᵀ f(X(t))dt → E_π[f(X)] almost surely as T → ∞, for any function f that is integrable with respect to π (E_π|f(X)| < ∞). The time average over one long path equals the ensemble average over the stationary distribution. This is the continuous-time analogue of Birkhoff's ergodic theorem for measure-preserving transformations.
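The convergence of time averages can be checked numerically. The following is a minimal sketch, assuming an Ornstein-Uhlenbeck process dX = -θ(X - m)dt + σ dW as the test case, f(x) = x², and illustrative parameter values; the function name simulate_ou and all numbers are choices made for this example. The path is deliberately started far from stationarity to show that the time average still approaches E_π[X²] = σ²/(2θ).

```python
import numpy as np

def simulate_ou(theta, mean, sigma, x0, dt, n_steps, rng):
    """Simulate an OU path on a uniform time grid using its exact Gaussian transition."""
    decay = np.exp(-theta * dt)
    noise_sd = sigma * np.sqrt((1.0 - decay**2) / (2.0 * theta))
    shocks = rng.standard_normal(n_steps)
    path = np.empty(n_steps + 1)
    path[0] = x0
    for k in range(n_steps):
        path[k + 1] = mean + (path[k] - mean) * decay + noise_sd * shocks[k]
    return path

rng = np.random.default_rng(0)
theta, mean, sigma = 1.0, 0.0, 1.0  # illustrative values, not taken from the text

# One long trajectory (T = 2000 time units), started away from the stationary law.
path = simulate_ou(theta, mean, sigma, x0=3.0, dt=0.01, n_steps=200_000, rng=rng)

# Discrete approximation of (1/T) * integral of f(X(t)) dt with f(x) = x**2.
time_avg_sq = np.mean(path**2)

# Ensemble average under the stationary law N(mean, sigma^2 / (2 theta)) with mean = 0.
ensemble_sq = sigma**2 / (2.0 * theta)

print(f"time average: {time_avg_sq:.4f}   ensemble average: {ensemble_sq:.4f}")
```

Increasing n_steps should shrink the gap between the two numbers; shortening the run makes the influence of the starting point x0 visible.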
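For diffusion processes dX = μ(X)dt + σ(X)dW, ergodicity boils down to two ingredients: the existence of a unique stationary distribution π, and positive recurrence (the process returns to compact sets in finite expected time). A stationary distribution typically exists when the drift is mean-reverting, pulling the process back from infinity strongly enough, and a strictly positive diffusion coefficient σ(x) > 0 ensures accessibility (the process can reach any state from any other state), which rules out multiple stationary distributions. One-dimensional diffusions are particularly well understood: the candidate stationary density is π(x) ∝ (1/σ²(x)) exp(2∫ˣ μ(y)/σ²(y) dy), and it can be normalized whenever this expression is integrable over the state space. If the process is also recurrent (it neither explodes nor gets absorbed at a boundary), this π is the unique stationary distribution and the process is ergodic. For the Ornstein-Uhlenbeck process dX = -θX dt + σ dW with θ > 0, the formula gives π(x) ∝ exp(-θx²/σ²), a Gaussian with mean 0 and variance σ²/(2θ).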
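The one-dimensional density formula can be evaluated numerically on a grid. The sketch below assumes the coefficients are supplied as vectorized Python functions and that the grid covers essentially all of the probability mass; the helper name stationary_density and the parameter values are hypothetical. As a sanity check it uses the drift μ(x) = -θx with constant σ, for which the result should be a Gaussian with variance σ²/(2θ).

```python
import numpy as np

def stationary_density(mu, sigma, grid):
    """Evaluate pi(x) ∝ (1/sigma(x)^2) * exp(2 * integral of mu/sigma^2) on a grid."""
    integrand = 2.0 * mu(grid) / sigma(grid) ** 2
    # Cumulative trapezoid rule for the integral inside the exponent.
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid)
    exponent = np.concatenate(([0.0], np.cumsum(steps)))
    exponent -= exponent.max()  # constant shift for numerical stability
    unnormalized = np.exp(exponent) / sigma(grid) ** 2
    # Normalize so the density integrates to one over the grid.
    mass = np.sum(0.5 * (unnormalized[1:] + unnormalized[:-1]) * np.diff(grid))
    return unnormalized / mass

theta, sigma0 = 1.5, 0.8  # illustrative values
grid = np.linspace(-5.0, 5.0, 2001)

density = stationary_density(
    mu=lambda x: -theta * x,                  # mean-reverting drift
    sigma=lambda x: np.full_like(x, sigma0),  # constant diffusion coefficient
    grid=grid,
)

# Closed-form comparison: Gaussian with mean 0 and variance sigma0^2 / (2 theta).
variance = sigma0**2 / (2.0 * theta)
gaussian = np.exp(-grid**2 / (2.0 * variance)) / np.sqrt(2.0 * np.pi * variance)

print("max abs difference from Gaussian:", np.max(np.abs(density - gaussian)))
```

For coefficients whose density has heavy tails, the grid would need to be widened, and a non-integrable expression shows up as a normalizing mass that keeps growing with the grid width.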
The distinction between stationarity and ergodicity is subtle but fundamental. A process is stationary if its statistical properties don't change over time: the joint distribution of (X(t₁), ..., X(tₙ)) is the same as that of (X(t₁+s), ..., X(tₙ+s)) for every shift s, so in particular X(t) and X(t+s) have the same distribution. A process is ergodic if, additionally, a single trajectory is representative of the whole distribution. The classic counterexample is a mixture: pick a random mean μ from {-1, +1} with equal probability, then run an OU process with that mean forever, started from its own stationary distribution. The combined process is stationary (the marginal distribution at each time is the same two-component mixture), but not ergodic: a single trajectory keeps reverting to whichever mean was chosen, so its time averages never reflect the other component. Ergodicity rules this out: almost every trajectory must eventually visit all parts of the stationary distribution, in the right proportions. Mixing, the property that the process forgets its initial condition, is stronger than ergodicity and implies it.
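The mixture counterexample is easy to reproduce in simulation. The sketch below assumes OU dynamics with a randomly chosen long-run mean of -1 or +1 and illustrative values for the reversion speed and noise level; ou_time_average is a hypothetical helper written for this example. Each run's time average settles near the mean that was drawn for it, while the ensemble mean of the mixture is 0.

```python
import numpy as np

def ou_time_average(mean, theta, sigma, dt, n_steps, rng):
    """Time average of one OU path with the given long-run mean."""
    decay = np.exp(-theta * dt)
    noise_sd = sigma * np.sqrt((1.0 - decay**2) / (2.0 * theta))
    # Start from the stationary law of this component so the process is stationary.
    x = mean + sigma / np.sqrt(2.0 * theta) * rng.standard_normal()
    total = 0.0
    for shock in rng.standard_normal(n_steps):
        x = mean + (x - mean) * decay + noise_sd * shock
        total += x
    return total / n_steps

rng = np.random.default_rng(1)

# Each trajectory draws its mean once and keeps it forever.
chosen_means = rng.choice([-1.0, 1.0], size=8)
time_averages = np.array(
    [ou_time_average(m, theta=1.0, sigma=0.5, dt=0.01, n_steps=50_000, rng=rng)
     for m in chosen_means]
)

ensemble_mean = 0.0  # E[X(t)] under the equal-weight mixture of the two components
for m, avg in zip(chosen_means, time_averages):
    print(f"chosen mean {m:+.0f}: time average {avg:+.3f} (ensemble mean {ensemble_mean:+.1f})")
```

No amount of extra simulation time moves a single run's average toward 0; only averaging across independently drawn trajectories recovers the ensemble mean.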
The practical consequence of ergodicity is enormous for Monte Carlo methods. Computing E_π[f] by drawing independent samples from π may be impractical — the distribution might be high-dimensional and analytically intractable. Instead, simulate the process X(t) for a long time and use the time average as an estimator. Ergodicity guarantees convergence; the mixing time (how quickly the process forgets its initial condition) determines the convergence rate. This is the foundation of Markov chain Monte Carlo (MCMC): design a Markov process whose stationary distribution is the target, run it, and collect time averages. Ergodic theory provides the theoretical guarantee that this procedure works.
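As a concrete instance of this recipe, here is a minimal random-walk Metropolis sketch. It assumes a one-dimensional target known only up to a normalizing constant (a standard normal is used so the answer can be checked), and the step size, chain length, and burn-in are illustrative choices rather than recommendations. The names random_walk_metropolis and log_target are hypothetical.

```python
import numpy as np

def random_walk_metropolis(log_target, x0, step, n_samples, rng):
    """Markov chain whose stationary distribution has density ∝ exp(log_target)."""
    samples = np.empty(n_samples)
    x, log_p = x0, log_target(x0)
    for i in range(n_samples):
        proposal = x + step * rng.standard_normal()  # symmetric proposal
        log_p_new = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if np.log(rng.uniform()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples[i] = x  # a rejected move repeats the current state
    return samples

def log_target(x):
    """Unnormalized log-density of N(0, 1), used here as a checkable test target."""
    return -0.5 * x**2

rng = np.random.default_rng(2)
chain = random_walk_metropolis(log_target, x0=5.0, step=2.4, n_samples=100_000, rng=rng)

burn_in = 1_000  # discard early samples that still remember the starting point x0
estimate = np.mean(chain[burn_in:] ** 2)  # time average of f(x) = x**2

print(f"MCMC estimate of E[X^2]: {estimate:.3f} (exact value 1.0)")
```

The samples are correlated, so the estimate's error is larger than it would be for the same number of independent draws; how much larger depends on how quickly the chain mixes.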