A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Chomsky Normal Form (CNF)

Graduate · Depth 88 in the knowledge graph

227 topics build on this

402 prerequisites beneath it


Core Idea

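Chomsky Normal Form (CNF) is a standardized form for CFGs in which every production is either A → BC (two variables) or A → a (one terminal). Every context-free language that does not contain ε has a CFG in CNF; a language that contains ε needs one extra rule S → ε, which many textbooks allow as long as the start symbol S never appears on a right-hand side. Converting a grammar to CNF involves eliminating ε-productions, unit productions (A → B), and useless symbols, then replacing terminals inside longer rules with fresh variables and splitting long productions so that only binary or terminal rules remain. CNF simplifies proofs about CFGs and enables the CYK algorithm, which tests membership in O(n³) time for any CFG once it has been converted. Parse trees for CNF grammars are binary: every internal node has either two variable children or a single terminal leaf.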

How It's Best Learned

Practice the four-step conversion (eliminate ε-rules, unit rules, useless symbols, then binarize) on a concrete grammar. Verify that the converted grammar generates the same language (minus ε if it was originally in the language).

Common Misconceptions

Thinking CNF conversion changes the language — it preserves the language (up to possible exclusion of ε).
Applying the steps out of order, which can reintroduce problems already eliminated.

Explainer

You already know that a context-free grammar (CFG) is a set of production rules that generate strings by substituting variables with combinations of variables and terminals. You've also seen that the same language can be generated by many different grammars — some elegant, some messy. Chomsky Normal Form is a way of rewriting any CFG into a standardized shape where every rule follows one of exactly two patterns: a variable produces two variables (`A → BC`), or a variable produces a single terminal (`A → a`). That's it. No rules with three symbols on the right, no rules that produce the empty string, no rules where one variable simply renames another.
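As a rough illustration, the two allowed rule shapes can be checked mechanically. The Python sketch below assumes a representation that is not part of the text above: a grammar is a dict mapping each variable to a list of right-hand sides, each right-hand side is a tuple of symbols, `()` stands for ε, and any symbol that is not a dict key counts as a terminal. The names `is_cnf` and `anbn_cnf` are hypothetical.

```python
def is_cnf(rules):
    """Return True if every rule has the shape A -> B C or A -> a."""
    variables = set(rules)
    for head, bodies in rules.items():
        for body in bodies:
            # A -> B C: exactly two symbols, both variables
            if len(body) == 2 and all(sym in variables for sym in body):
                continue
            # A -> a: exactly one symbol, and it is a terminal
            if len(body) == 1 and body[0] not in variables:
                continue
            return False  # epsilon rule, unit rule, long rule, or mixed rule
    return True


# Assumed example: a CNF grammar for { a^n b^n : n >= 1 }
anbn_cnf = {
    "S": [("A", "T"), ("A", "B")],
    "T": [("S", "B")],
    "A": [("a",)],
    "B": [("b",)],
}

print(is_cnf(anbn_cnf))                          # True
print(is_cnf({"S": [("a", "S", "b"), ()]}))      # False
```

The second grammar fails on both counts: `S → aSb` has three symbols and mixes terminals with a variable, and `S → ε` is an ε-production.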

Why bother with such a restrictive format? Because uniformity enables algorithms. When every rule either splits into exactly two variables or produces a single terminal, the parse tree becomes binary: every internal node has either two variable children or one terminal leaf. This rigid structure means that deriving a string of length *n* ≥ 1 takes exactly *n* - 1 branching steps and *n* terminal steps, 2*n* - 1 in total, no matter the grammar. The CYK algorithm exploits this by filling in an *n* × *n* table, checking every possible way to split every substring into two nonempty parts, and determining in O(n³) time (for a fixed grammar) whether the string belongs to the language. Without CNF, you'd need to handle rules of arbitrary length, making a clean dynamic-programming approach much harder to formulate.
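The table-filling idea can be sketched in a few lines of Python. This is a minimal illustration rather than a tuned implementation, and it assumes a grammar already in CNF, stored as a dict from each variable to a list of right-hand-side tuples. The names `cyk` and `anbn_cnf` and the example grammar are assumptions made for this sketch.

```python
def cyk(rules, start, word):
    """Decide whether `word` is generated by the CNF grammar `rules`."""
    n = len(word)
    if n == 0:
        return False  # a strict CNF grammar cannot derive the empty string
    # table[i][length] holds the variables that derive word[i : i + length]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    # Substrings of length 1: use the terminal rules A -> a
    for i, ch in enumerate(word):
        for head, bodies in rules.items():
            if (ch,) in bodies:
                table[i][1].add(head)
    # Longer substrings: try every split point and every binary rule A -> B C
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for head, bodies in rules.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            table[i][length].add(head)
    return start in table[0][n]


# Assumed example: a CNF grammar for { a^n b^n : n >= 1 }
anbn_cnf = {
    "S": [("A", "T"), ("A", "B")],
    "T": [("S", "B")],
    "A": [("a",)],
    "B": [("b",)],
}

print(cyk(anbn_cnf, "S", "aabb"))  # True
print(cyk(anbn_cnf, "S", "abab"))  # False
```

The three nested loops over substring length, start position, and split point give the O(n³) factor; the inner scan over rules adds a factor proportional to the grammar size.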

The conversion process itself is mechanical but must be done in the right order. First, you eliminate ε-productions (rules like `A → ε`) by finding every variable that can derive the empty string and, for each rule that mentions such a variable, adding copies of the rule with that variable omitted. Second, you eliminate unit productions (rules like `A → B` that just rename one variable to another) by tracing chains of renames and replacing them with direct rules. Third, you remove useless symbols: variables that can never produce a terminal string or that can never be reached from the start symbol. Finally, you binarize any remaining long rules: a rule like `A → BCDE` becomes `A → BX₁`, `X₁ → CX₂`, `X₂ → DE`, using fresh variables to break it into a chain of binary splits. If a rule of length two or more contains a terminal, as in `A → aB`, the terminal is first replaced by a fresh variable with its own rule (`A → TB`, `T → a`). The ordering matters because one step can create work for a later one: deleting a nullable `C` from `A → BC` leaves the unit production `A → B`, so ε-productions are eliminated before unit productions rather than after.
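Two of these steps are small enough to sketch directly: finding the variables that can derive the empty string, and splitting long rules. The Python below is an illustration under assumptions, not a full converter. It uses a dict-of-tuples grammar representation with `()` for ε, the function names `nullable_variables` and `binarize` are hypothetical, and the fresh names `X1`, `X2`, ... are assumed not to clash with existing variables.

```python
def nullable_variables(rules):
    """Return the set of variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:  # repeat until no new nullable variable is found
        changed = False
        for head, bodies in rules.items():
            if head in nullable:
                continue
            # A body is nullable if all of its symbols are; () is trivially so
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def binarize(rules):
    """Split every rule longer than two symbols into a chain of binary rules."""
    new_rules = {}
    counter = 0
    for head, bodies in rules.items():
        for body in bodies:
            current = head
            while len(body) > 2:
                counter += 1
                fresh = f"X{counter}"  # assumed not to be in use already
                new_rules.setdefault(current, []).append((body[0], fresh))
                current, body = fresh, body[1:]
            new_rules.setdefault(current, []).append(tuple(body))
    return new_rules


print(binarize({"A": [("B", "C", "D", "E")]}))
# {'A': [('B', 'X1')], 'X1': [('C', 'X2')], 'X2': [('D', 'E')]}

example = {"S": [("A", "B")], "A": [(), ("a",)], "B": [("A",), ("b",)]}
print(sorted(nullable_variables(example)))  # ['A', 'B', 'S']
```

In the second example `A` is nullable directly, `B` is nullable through the unit rule `B → A`, and `S` is nullable because both symbols of `S → AB` are.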

The key conceptual point is that CNF conversion never changes the language — the new grammar generates exactly the same set of strings as the original (with the minor caveat that the empty string ε is excluded from CNF grammars, since there's no way to derive it without an ε-production). This is a powerful demonstration of a recurring theme in theory of computation: different representations of the same object can have very different algorithmic properties, and finding the right normal form transforms hard problems into tractable ones.
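One way to sanity-check a hand conversion is to compare the two grammars on all strings up to some length. The Python sketch below computes, by fixed-point iteration, every string of at most `max_len` characters that each variable derives. It is a bounded test, not a proof of equivalence. The function name `language_up_to` and both example grammars are assumptions for illustration, using a dict-of-tuples representation with `()` for ε.

```python
def language_up_to(rules, start, max_len):
    """Return all strings of length <= max_len derivable from `start`."""
    lang = {var: set() for var in rules}
    changed = True
    while changed:  # iterate until no variable gains a new string
        changed = False
        for var, bodies in rules.items():
            for body in bodies:
                partial = {""}  # strings built from the symbols seen so far
                for sym in body:
                    # a variable contributes its known strings, a terminal itself
                    choices = lang[sym] if sym in rules else {sym}
                    partial = {p + c for p in partial for c in choices
                               if len(p) + len(c) <= max_len}
                    if not partial:
                        break
                new = partial - lang[var]
                if new:
                    lang[var] |= new
                    changed = True
    return lang[start]


# Assumed example: S -> aSb | ε and a hand-converted CNF grammar
original = {"S": [("a", "S", "b"), ()]}
converted = {
    "S": [("A", "T"), ("A", "B")],
    "T": [("S", "B")],
    "A": [("a",)],
    "B": [("b",)],
}

before = language_up_to(original, "S", 6)
after = language_up_to(converted, "S", 6)
print(sorted(after, key=len))   # ['ab', 'aabb', 'aaabbb']
print(before - {""} == after)   # True
```

The loop terminates because only finitely many strings of length at most `max_len` exist over a finite alphabet and each pass can only add strings. The comparison removes ε from the original language, matching the caveat above.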


Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Boolean Type and Truth Values → Comparison Operators and Boolean Tests → Logical Operators and Boolean Algebra → Boolean Algebra and Fundamental Laws → Logic Gates Fundamentals → Implementing Boolean Functions with Gates → Karnaugh Map Simplification → Combinational Circuit Design → Flip-Flops and Latches → Finite State Machines (FSMs) → Deterministic Finite Automata (DFA) → Nondeterministic Finite Automata (NFA) → Two-Way Finite Automata → NFA to DFA Conversion (Subset Construction) → DFA Properties and Minimization Algorithms → Regular Languages: Definition and Characterization → Context-Free Grammars (CFGs) → Context-Free Grammar Properties and Ambiguity → Parse Trees, Derivations, and Ambiguity in CFGs → Chomsky Normal Form (CNF)

Longest path: 89 steps · 402 total prerequisite topics

Prerequisites (2)

Context-Free Grammars (CFGs): hard
Parse Trees, Derivations, and Ambiguity in CFGs: soft

Leads To (2)

CYK Algorithm and Membership Testing: hard
Pumping Lemma for Context-Free Languages: soft