A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

PCP Theorem and Hardness of Approximation

Research Depth 106 in the knowledge graph ☐ I know this ☆ Set as goal

535prerequisites beneath it

Approximation Algorithms (LP Relaxation and Primal-Dual)Hardness of Approximation +2 more→

Core Idea

The PCP (Probabilistically Checkable Proofs) theorem states that every language in NP has a proof system where a verifier reads only O(1) bits of the proof and uses O(log n) random bits, yet accepts valid proofs with probability 1 and rejects invalid proofs with probability at least 1/2. This seemingly technical statement has a revolutionary consequence: it is NP-hard to approximate MAX-3SAT beyond a ratio of 7/8 (and many other optimization problems beyond specific thresholds). Dinur's 2007 combinatorial proof via gap amplification on constraint graphs replaced the original algebraic proof and made the result more accessible. The PCP theorem is the theoretical engine behind all inapproximability results — without it, we would have no way to prove that certain approximation ratios cannot be beaten unless P = NP.

Explainer

The PCP theorem is arguably the most important result in computational complexity theory since the Cook-Levin theorem. Its statement — NP = PCP(log n, 1) — looks purely complexity-theoretic, but its consequences for algorithm design are profound and practical. It says that every NP proof can be reformatted so that a verifier, tossing O(log n) coins and reading O(1) bits, can detect cheating with constant probability. This is remarkable: you can probabilistically verify a proof of arbitrary length by looking at a constant number of bits.

The connection to approximation algorithms comes through gap-introducing reductions. The standard way to prove an optimization problem is hard to approximate is to show a reduction from SAT that maps YES instances to solutions with value at least c and NO instances to solutions with value at most s, where c/s (or s/c for maximization) is the approximation gap. Before the PCP theorem, we could only show NP-hardness of exact optimization — the gap was infinitesimally small. The PCP theorem creates a constant gap from the binary YES/NO distinction of NP, because the verifier's constant soundness error translates directly into a constant fraction of unsatisfied constraints. Hastad's seminal 1997 results, building on the PCP theorem, established tight inapproximability thresholds: MAX-3SAT cannot be approximated beyond 7/8, MAX-3LIN cannot be approximated beyond 1/2, and Set Cover cannot be approximated beyond (1-epsilon) * ln n. Each of these matches the ratio achieved by known algorithms.

Dinur's 2007 proof replaced the original algebraic proof (which relied on low-degree polynomials over finite fields, the sumcheck protocol, and composition of verifiers) with a purely combinatorial argument based on gap amplification. She starts with a constraint graph where satisfiable instances have value 1 and unsatisfiable instances have value at most 1 - 1/poly(n) — an almost invisible gap. Through O(log n) rounds of powering the constraint graph (which spreads local violations to neighbors via expander-like properties) and alphabet reduction (which keeps the constraint description compact), each round roughly doubles the fraction of violated constraints in the NO case. After logarithmically many rounds, the gap reaches a constant, completing the proof. This combinatorial approach is not just simpler — it reveals that gap amplification is the essential mechanism, and that the algebraic machinery of the original proof was one implementation of this mechanism.

The Unique Games Conjecture (UGC), proposed by Khot in 2002, extends the PCP program by asserting that a specific type of constraint satisfaction problem (unique label cover) is hard to approximate. If true, the UGC would imply that many of the best known approximation algorithms are optimal: the Goemans-Williamson ~0.878-approximation for MAX-CUT, the 2-approximation for Vertex Cover, and more generally, that semidefinite programming relaxations capture the exact approximability threshold for a wide class of problems. Whether the UGC is true remains one of the central open problems in theoretical computer science, but even as a conjecture, it has been enormously productive — generating new algorithms, new proof techniques, and a conceptual framework that unifies inapproximability results across dozens of problems.

Practice Questions 5 questions