Nonparametric spectral estimation makes minimal assumptions about signal structure, relying on Fourier-based methods. Periodogram, Welch method, and multitaper method are common; each involves tradeoffs between spectral leakage (windowing), resolution (segment length), variance (averaging), and computational cost. These methods are robust but have lower resolution than parametric approaches.
You know from window functions and spectral leakage that multiplying a finite-length signal by a window in time is equivalent to convolving its spectrum with the window's Fourier transform in frequency. And from power spectral density estimation, you know the goal: estimate how signal power is distributed across frequencies. Nonparametric methods pursue this goal by computing Fourier transforms of the data directly, without assuming the signal follows any parametric model (ARMA, sinusoids, etc.). The methods are more general and robust, but navigating their tradeoffs is the core skill.
The simplest nonparametric estimator is the periodogram: take N data samples, apply a window, compute the squared magnitude of the DFT, scale by N. The periodogram has fine frequency resolution Δf = f_s/N — the longer your record, the more closely spaced your frequency bins. Its critical weakness is high variance: the periodogram is an *inconsistent* estimator, meaning its statistical fluctuations do not shrink as N grows. Each DFT bin has approximately a chi-squared distribution with 2 degrees of freedom, regardless of record length. The result looks jagged and noisy even for a signal with a perfectly clean spectrum. You cannot trust any individual bin — adjacent bins are nearly uncorrelated, so every bin represents an independent, noisy estimate. This is the fundamental tension in spectral estimation: resolution versus variance.
The Welch method attacks variance through segment-and-average. Divide the N-sample record into L overlapping segments of length M (typically 50% overlap), window each, compute a periodogram for each, and average. Averaging L periodograms reduces variance by approximately L, but each segment has only M samples, so resolution degrades: Δf = f_s/M. Choosing segment length M is the central design decision — longer segments yield higher resolution but fewer averaged segments (higher variance); shorter segments reduce variance but blur closely spaced spectral features. Overlap (typically 50%) extracts more segments without a full proportional cost in resolution — it is almost free variance reduction. The Welch method is the workhorse of practical signal analysis and is what MATLAB's `pwelch` and Python's `scipy.signal.welch` implement.
The multitaper method (Thomson, 1982) solves the resolution-variance tradeoff differently. Rather than segmenting, it applies K orthogonal Slepian tapers (discrete prolate spheroidal sequences) to the full N-sample record, computes a periodogram for each, and averages. Each Slepian taper is mathematically optimized to concentrate its energy within a specified bandwidth W — the bandwidth-time product NW controls the tradeoff. With K = 2NW − 1 tapers, you obtain K nearly independent estimates from the same full-length record, reducing variance by K without sacrificing the full-record frequency resolution. The price is that spectral features closer than W Hz are blurred together, and K and W must be chosen by the analyst. Multitaper is preferred for signals with high dynamic range — widely varying spectral amplitudes across frequency — where leakage from large peaks could swamp small nearby features. It is standard in geophysics, neuroscience, and climatology for precisely this reason.