Sparse signal recovery solves the inverse problem: given noisy measurements y = Ax + noise where x is unknown but sparse in some basis, find the sparsest x that explains y. The ℓ₀ problem (minimize count of nonzero coefficients) is combinatorial, but the convex relaxation (minimize ℓ₁ norm) provably recovers the true sparse signal under coherence and identifiability conditions. Basis Pursuit, Iterative Thresholding, and Matching Pursuit algorithms solve this efficiently. Applications include signal denoising, feature selection, source localization, and blind source separation where the number of sources is much smaller than the number of sensors.
Create a synthetic sparse signal (few nonzero coefficients in a dictionary of atoms), corrupt with noise, and solve the ℓ₁ minimization problem: minimize ||x||₁ subject to ||Ax − y||₂ ≤ noise_level (or minimize ||Ax − y||₂² + λ||x||₁ for a Lagrangian form). Use a convex solver and recover the original sparse coefficients. Vary the noise level, dictionary coherence, and sparsity to observe when ℓ₁ recovery succeeds or fails. Compare with ℓ₀ minimization (via exhaustive search or greedy methods) to confirm that ℓ₁ is a good convex proxy.
From your study of linear algebra and Fourier analysis, you know how to represent signals in different bases and transform between domains. Sparse recovery takes this further: given observations of a signal in one domain, find the sparsest representation in another domain.
The problem is concrete: you have measurements y (from a sensor or compressive sampling), a dictionary of atoms A (a matrix where each column is a possible signal component), and you want to find the sparsest x (fewest nonzero coefficients) such that Ax ≈ y. This is the sparse coding problem. If there is no noise, you solve exactly: Ax = y. If y is noisy or the system is underdetermined, you regularize with a sparsity term.
The ℓ₀ vs ℓ₁ tradeoff is the core of sparse recovery. Minimizing ||x||₀ (the count of nonzeros, "cardinality") is the true sparsity measure, but it is a combinatorial optimization problem: there are C(N,K) possible sparse supports (positions of nonzeros), and finding the best one requires checking all or using branch-and-bound, which is NP-hard. Basis Pursuit replaces ℓ₀ with its convex surrogate ℓ₁: minimize ||x||₁ subject to Ax = y (or Ax ≈ y with noise). This is a linear program, solvable by interior-point methods in polynomial time. The miracle of compressive sensing and sparse recovery is that under conditions on A (incoherence, restricted isometry), the ℓ₁ solution equals the ℓ₀ solution — you solve a hard problem via convex optimization.
Conditions for recovery are encoded in two properties: Mutual Coherence μ(A) (how much dictionary atoms overlap) and Restricted Isometry Property (how well A preserves sparse signals). Low coherence μ means atoms are nearly orthogonal and thus distinguishable in measurement space. If μ is small, you can recover any K-sparse signal where K < 1/(2μ). Structured dictionaries (Fourier, wavelets, random Gaussian) have low coherence by design.
Algorithms for sparse recovery range from convex (basis pursuit, cvx solvers) to greedy (Matching Pursuit, Orthogonal Matching Pursuit). Orthogonal Matching Pursuit (OMP) is intuitive: iteratively find the atom with the highest correlation to the current residual, add it to the support, update the residual, and repeat until the residual is small. OMP is faster than basis pursuit (no LP solver) but slightly suboptimal — it is greedy and does not account for measurement noise as carefully. Iterative Thresholding alternates between a gradient descent step (x ← x + A^T(y − Ax)) and a hard or soft thresholding operation (keep K largest or shrink by λ). Hard thresholding (keeping K largest) is computationally simple and converges if the RIP constant is small.
Applications are broad: signal denoising (represent as sparse in wavelets, shrink small coefficients), feature selection (in regression, add ℓ₁ penalty to select few features), source localization (many microphones, few active sources), radar (map K targets from M measurements), matrix completion (fill missing entries by assuming low-rank, which is sparse in singular value domain). Many machine learning problems are implicitly sparse recovery: regularized regression (LASSO, elastic net) is basis pursuit in the feature space.
The limits of sparse recovery are when the dictionary is wrong (the true signal is not sparse in any known basis), when coherence is high (atoms overlap, indistinguishable in measurements), or when noise is large. Modern extensions address these: dictionary learning (learn the atoms from data rather than use fixed dictionaries), structured sparsity (groups of atoms must be active together, e.g., image blocks), low-rank models (exploit rank structure, not just sparsity), and deep learning (learn end-to-end mappings from measurements to signals, implicitly learning both dictionary and recovery algorithm).