← Graph View All Domains

A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Sparse Signal Recovery and Basis Pursuit

Research Depth 80 in the knowledge graph ☐ I know this ☆ Set as goal

1topic build on this

355prerequisites beneath it

See this on the map →

Fourier Transform: Definition and Properties→→Compressive Sensing and Sparse Recovery

Core Idea

Sparse signal recovery solves the inverse problem: given noisy measurements y = Ax + noise where x is unknown but sparse in some basis, find the sparsest x that explains y. The ℓ₀ problem (minimize count of nonzero coefficients) is combinatorial, but the convex relaxation (minimize ℓ₁ norm) provably recovers the true sparse signal under coherence and identifiability conditions. Basis Pursuit, Iterative Thresholding, and Matching Pursuit algorithms solve this efficiently. Applications include signal denoising, feature selection, source localization, and blind source separation where the number of sources is much smaller than the number of sensors.

How It's Best Learned

Create a synthetic sparse signal (few nonzero coefficients in a dictionary of atoms), corrupt with noise, and solve the ℓ₁ minimization problem: minimize ||x||₁ subject to ||Ax − y||₂ ≤ noise_level (or minimize ||Ax − y||₂² + λ||x||₁ for a Lagrangian form). Use a convex solver and recover the original sparse coefficients. Vary the noise level, dictionary coherence, and sparsity to observe when ℓ₁ recovery succeeds or fails. Compare with ℓ₀ minimization (via exhaustive search or greedy methods) to confirm that ℓ₁ is a good convex proxy.

Common Misconceptions

Sparse recovery is only relevant for signals that are literally sparse (few nonzero samples); signals that are sparse in a transformed domain (Fourier, wavelets, learned dictionaries) also benefit.
The ℓ₁ norm is chosen arbitrarily; it is the tightest convex lower bound of the ℓ₀ norm and has natural interpretations in robust statistics and information theory.
Basis pursuit always recovers the true sparse solution; recovery is guaranteed only under conditions on the dictionary (coherence, restricted isometry) and noise level — adversarial or incoherent dictionaries can fail.

Explainer

From your study of linear algebra and Fourier analysis, you know how to represent signals in different bases and transform between domains. Sparse recovery takes this further: given observations of a signal in one domain, find the sparsest representation in another domain.

The problem is concrete: you have measurements y (from a sensor or compressive sampling), a dictionary of atoms A (a matrix where each column is a possible signal component), and you want to find the sparsest x (fewest nonzero coefficients) such that Ax ≈ y. This is the sparse coding problem. If there is no noise, you solve exactly: Ax = y. If y is noisy or the system is underdetermined, you regularize with a sparsity term.

The ℓ₀ vs ℓ₁ tradeoff is the core of sparse recovery. Minimizing ||x||₀ (the count of nonzeros, "cardinality") is the true sparsity measure, but it is a combinatorial optimization problem: there are C(N,K) possible sparse supports (positions of nonzeros), and finding the best one requires checking all or using branch-and-bound, which is NP-hard. Basis Pursuit replaces ℓ₀ with its convex surrogate ℓ₁: minimize ||x||₁ subject to Ax = y (or Ax ≈ y with noise). This is a linear program, solvable by interior-point methods in polynomial time. The miracle of compressive sensing and sparse recovery is that under conditions on A (incoherence, restricted isometry), the ℓ₁ solution equals the ℓ₀ solution — you solve a hard problem via convex optimization.

Conditions for recovery are encoded in two properties: Mutual Coherence μ(A) (how much dictionary atoms overlap) and Restricted Isometry Property (how well A preserves sparse signals). Low coherence μ means atoms are nearly orthogonal and thus distinguishable in measurement space. If μ is small, you can recover any K-sparse signal where K < 1/(2μ). Structured dictionaries (Fourier, wavelets, random Gaussian) have low coherence by design.

Algorithms for sparse recovery range from convex (basis pursuit, cvx solvers) to greedy (Matching Pursuit, Orthogonal Matching Pursuit). Orthogonal Matching Pursuit (OMP) is intuitive: iteratively find the atom with the highest correlation to the current residual, add it to the support, update the residual, and repeat until the residual is small. OMP is faster than basis pursuit (no LP solver) but slightly suboptimal — it is greedy and does not account for measurement noise as carefully. Iterative Thresholding alternates between a gradient descent step (x ← x + A^T(y − Ax)) and a hard or soft thresholding operation (keep K largest or shrink by λ). Hard thresholding (keeping K largest) is computationally simple and converges if the RIP constant is small.

Applications are broad: signal denoising (represent as sparse in wavelets, shrink small coefficients), feature selection (in regression, add ℓ₁ penalty to select few features), source localization (many microphones, few active sources), radar (map K targets from M measurements), matrix completion (fill missing entries by assuming low-rank, which is sparse in singular value domain). Many machine learning problems are implicitly sparse recovery: regularized regression (LASSO, elastic net) is basis pursuit in the feature space.

The limits of sparse recovery are when the dictionary is wrong (the true signal is not sparse in any known basis), when coherence is high (atoms overlap, indistinguishable in measurements), or when noise is large. Modern extensions address these: dictionary learning (learn the atoms from data rather than use fixed dictionaries), structured sparsity (groups of atoms must be active together, e.g., image blocks), low-rank models (exploit rank structure, not just sparsity), and deep learning (learn end-to-end mappings from measurements to signals, implicitly learning both dictionary and recovery algorithm).

Practice Questions 5 questions