The Cook-Levin Theorem

College Depth 73 in the knowledge graph I know this Set as goal
Unlocks 15 downstream topics
NP-complete SAT satisfiability Cook-Levin

Core Idea

The Cook-Levin theorem proves that Boolean satisfiability (SAT) is NP-complete — the first problem proven NP-complete (Cook 1971, independently Levin 1973). The proof encodes the computation of an arbitrary NTM as a propositional formula: variables represent tape cells, head positions, and states at each time step, while clauses enforce the transition rules. Since every NP problem reduces to SAT, SAT is the 'universal' hard problem in NP and the historical starting point for the entire theory of NP-completeness.

How It's Best Learned

Work through the tableau construction carefully: understand how an NTM's accepting computation of length t is encoded as a formula of size O(t²). Appreciate that the reduction itself runs in polynomial time even though the formula can be large relative to the original instance.

Common Misconceptions

Explainer

You already know that NP is the class of problems solvable by a nondeterministic Turing machine (NTM) in polynomial time, and that a polynomial-time reduction from problem A to problem B means "if we could solve B quickly, we could solve A quickly." The Cook-Levin theorem asks a sharper question: is there a single problem in NP so hard that *every* NP problem reduces to it? The answer is yes, and that problem is Boolean satisfiability (SAT) — the question of whether a propositional formula over variables x₁, …, xₙ has a truth assignment making it true.

The proof works by encoding computation as logic. Any NP problem has a polynomial-time NTM M that witnesses its solutions. Given an input w of length n, M runs in at most t(n) = nᶜ steps. The trick is to build a propositional formula φ_{M,w} — the tableau formula — whose satisfying assignments correspond exactly to accepting computation histories of M on w. The formula uses three families of variables: cell variables encoding what symbol occupies each tape cell at each time step, head variables encoding where the read/write head is, and state variables encoding M's current state. Together, they describe a t × t grid of "snapshots" of the machine's tape. Clauses then enforce: (1) the initial configuration matches w, (2) each transition follows the NTM's rules, and (3) M reaches an accepting state by step t.

The formula φ_{M,w} has size O(t²) — polynomial in n — and can be constructed in polynomial time. Here is the key insight: φ_{M,w} is satisfiable if and only if M accepts w. An accepting computation history is exactly a satisfying assignment. So the polynomial-time reduction from any NP problem to SAT is: "given an instance w, output the formula φ_{M,w}." This works for *any* NTM M, which means *every* NP problem reduces to SAT. Combined with the fact that SAT is itself in NP (guess an assignment, verify in polynomial time), SAT is NP-complete.

The theorem's historical weight goes beyond the single result. Before Cook and Levin, there was no vocabulary for saying "these hard-looking problems are all equally hard." By identifying SAT as a universal NP problem, the theorem gave complexity theory its organizing principle: to classify a new problem as NP-complete, you only need to show it's in NP and reduce a single known NP-complete problem to it. Every subsequent NP-completeness proof stands on this foundation — SAT is the ancestor of all known NP-complete problems.

One subtlety worth internalizing: the theorem does not say SAT is unsolvable, or that it requires exponential time. It says SAT is as hard as any problem in NP. Whether NP contains problems that *genuinely* require superpolynomial time — whether P ≠ NP — remains open. Cook-Levin identifies SAT as the right problem to study if you want to resolve P vs. NP; it does not resolve the question itself.

Practice Questions 5 questions

Prerequisite Chain

Counting to 10Counting to 20Understanding ZeroThe Number ZeroCounting to FiveOne-to-One CorrespondenceCombining Small Groups Within 5Addition Within 10Addition Within 20Two-Digit Addition Without RegroupingTwo-Digit Addition with RegroupingAddition Within 100Repeated Addition as MultiplicationMultiplication Facts Within 100Division as Equal SharingDivision as Grouping (Measurement Division)Division: Grouping (Repeated Subtraction) ModelDivision: Fair Sharing ModelDivision as Equal SharingDivision as GroupingBasic Division FactsDivision Facts Within 100Two-Digit by One-Digit DivisionDivision with RemaindersRemainders and Quotients in DivisionDivision Word ProblemsIntroduction to Long DivisionFactors and MultiplesPrime and Composite NumbersEquivalent FractionsRelating Fractions and DecimalsDecimal Place ValueReading and Writing DecimalsComparing and Ordering DecimalsAdding and Subtracting DecimalsMultiplying DecimalsDividing DecimalsDividing FractionsMixed Number ArithmeticOrder of OperationsInteger Order of OperationsVariable ExpressionsCombining Like TermsOne-Step EquationsTwo-Step EquationsSolving Multi-Step EquationsEquations with Variables on Both SidesLiteral EquationsSlope-Intercept FormPoint-Slope FormWriting Linear EquationsParallel and Perpendicular Line SlopesGraphing Linear EquationsPiecewise FunctionsStep FunctionsComposition of FunctionsInverse FunctionsRadical Functions and GraphsRational ExponentsExponential Functions and GraphsLogarithms IntroductionBig-O Notation and Asymptotic AnalysisBreadth-First Search (BFS)Shortest Paths in Unweighted GraphsDijkstra's Shortest Path AlgorithmAlgorithm Analysis and Big-O NotationTuring MachinesThe Church-Turing ThesisThe Halting ProblemComputability ReductionsPolynomial-Time ReductionsLower Bounds Techniques in Computational ComplexityNP-CompletenessThe Cook-Levin Theorem

Longest path: 74 steps · 415 total prerequisite topics

Prerequisites (6)

Leads To (3)