Chomsky Normal Form (CNF)

Graduate Depth 57 in the knowledge graph I know this Set as goal
Unlocks 218 downstream topics
CNF normal-form CFG CYK

Core Idea

Chomsky Normal Form (CNF) is a standardized form for CFGs in which every production is either A → BC (two variables) or A → a (one terminal). Every context-free language has a CFG in CNF. Converting a grammar to CNF involves eliminating ε-productions, unit productions (A → B), and long productions, then ensuring only binary or terminal rules remain. CNF simplifies proofs about CFGs and enables the CYK algorithm for O(n³) parsing of any CFG. Parse trees for CNF grammars are full binary trees.

How It's Best Learned

Practice the four-step conversion (eliminate ε-rules, unit rules, useless symbols, then binarize) on a concrete grammar. Verify that the converted grammar generates the same language (minus ε if it was originally in the language).

Common Misconceptions

Explainer

You already know that a context-free grammar (CFG) is a set of production rules that generate strings by substituting variables with combinations of variables and terminals. You've also seen that the same language can be generated by many different grammars — some elegant, some messy. Chomsky Normal Form is a way of rewriting any CFG into a standardized shape where every rule follows one of exactly two patterns: a variable produces two variables (`A → BC`), or a variable produces a single terminal (`A → a`). That's it. No rules with three symbols on the right, no rules that produce the empty string, no rules where one variable simply renames another.

Why bother with such a restrictive format? Because uniformity enables algorithms. When every rule either splits into exactly two branches or produces a single character, the parse tree becomes a full binary tree — every internal node has exactly two children. This rigid structure means that parsing a string of length *n* requires exactly *n* - 1 branching steps and *n* terminal steps, no matter the grammar. The CYK algorithm exploits this by filling in an *n* × *n* table, checking every possible way to split every substring into two halves, and determining in O(n³) time whether the string belongs to the language. Without CNF, you'd need to handle rules of arbitrary length, making a clean dynamic-programming approach much harder to formulate.

The conversion process itself is mechanical but must be done in the right order. First, you eliminate ε-productions (rules like `A → ε`) by finding every variable that can derive the empty string and creating new rules that account for its absence. Second, you eliminate unit productions (rules like `A → B` that just rename one variable to another) by tracing chains of renames and replacing them with direct rules. Third, you remove useless symbols — variables that can never be reached from the start symbol or that can never produce a terminal string. Finally, you binarize any remaining long rules: a rule like `A → BCDE` becomes `A → BX₁`, `X₁ → CX₂`, `X₂ → DE`, using fresh variables to break it into a chain of binary splits. Each step may create new issues that a previous step would have handled, which is why the ordering matters — eliminating ε-productions first prevents them from reappearing when you collapse unit rules.

The key conceptual point is that CNF conversion never changes the language — the new grammar generates exactly the same set of strings as the original (with the minor caveat that the empty string ε is excluded from CNF grammars, since there's no way to derive it without an ε-production). This is a powerful demonstration of a recurring theme in theory of computation: different representations of the same object can have very different algorithmic properties, and finding the right normal form transforms hard problems into tractable ones.

Practice Questions 5 questions

Prerequisite Chain

Counting to 10Counting to 20Understanding ZeroThe Number ZeroCounting to FiveOne-to-One CorrespondenceCombining Small Groups Within 5Addition Within 10Addition Within 20Two-Digit Addition Without RegroupingTwo-Digit Addition with RegroupingAddition Within 100Repeated Addition as MultiplicationMultiplication Facts Within 100Division as Equal SharingDivision as Grouping (Measurement Division)Division: Grouping (Repeated Subtraction) ModelDivision: Fair Sharing ModelDivision as Equal SharingDivision as GroupingBasic Division FactsDivision Facts Within 100Two-Digit by One-Digit DivisionDivision with RemaindersRemainders and Quotients in DivisionDivision Word ProblemsIntroduction to Long DivisionFactors and MultiplesPrime and Composite NumbersEquivalent FractionsRelating Fractions and DecimalsDecimal Place ValueReading and Writing DecimalsComparing and Ordering DecimalsAdding and Subtracting DecimalsMultiplying DecimalsDividing DecimalsDividing FractionsMixed Number ArithmeticOrder of OperationsOperators and ExpressionsArithmetic Operators and Operator PrecedenceComparison Operators and Boolean TestsLogical Operators and Boolean AlgebraBoolean Algebra and Fundamental LawsCombinational Circuit DesignFlip-Flops and LatchesFinite State Machines (FSMs)Deterministic Finite Automata (DFA)Nondeterministic Finite Automata (NFA)Two-Way Finite AutomataNFA to DFA Conversion (Subset Construction)DFA Properties and Minimization AlgorithmsRegular Languages: Definition and CharacterizationContext-Free Grammars (CFGs)Context-Free Grammar Properties and AmbiguityParse Trees, Derivations, and Ambiguity in CFGsChomsky Normal Form (CNF)

Longest path: 58 steps · 250 total prerequisite topics

Prerequisites (2)

Leads To (2)