A context-free grammar (CFG) is a 4-tuple (V, Σ, R, S) of variables, terminals, production rules, and a start variable. Each rule replaces a single variable with any string of variables and terminals. The language of a CFG is the set of terminal strings derivable from the start variable. CFGs are strictly more powerful than regular expressions — they can describe languages like {aⁿbⁿ} that no DFA can recognize. CFGs are the foundation of programming language syntax, used in parser generators and compilers.
Begin by writing grammars for arithmetic expressions (which naturally requires recursion) and palindromes. Trace leftmost and rightmost derivations step-by-step. Then attempt to characterize which languages require CFGs versus which are regular.
You have already seen that regular languages — described by regular expressions and recognized by finite automata — have a fundamental limitation: they cannot count. A DFA has no memory beyond its current state, which means it cannot verify that a string has exactly as many a's as b's. Context-free grammars (CFGs) overcome this by introducing recursive structure, and recursion is the key insight.
A CFG is a 4-tuple (V, Σ, R, S): a set of variables (non-terminals), a terminal alphabet, a set of production rules, and a start variable. Each rule has the form A → α, where A is a single variable and α is any string of variables and terminals. A derivation begins at S, repeatedly replaces a variable using some rule, and terminates when no variables remain. The language of the grammar is the set of all terminal strings reachable this way. Notice that derivations can be arbitrarily long — this is where the expressive power comes from.
The name "context-free" refers to the rule form: the left side is always a single variable, with no surrounding context constraining when the rule can fire. This is in contrast to context-sensitive grammars, where a rule A → B might only apply when A appears between specific symbols. Context-free rules are simple but powerful enough to capture nested structure, which is exactly what programming languages require — parentheses must balance, if-else blocks must be properly nested, function calls must close with matching arguments.
A crucial concept is ambiguity. A string w is ambiguous in grammar G if G admits two or more distinct parse trees for w. This is not just a theoretical curiosity: an ambiguous grammar for a programming language means a statement like 1 + 2 * 3 could be parsed as (1 + 2) * 3 or 1 + (2 * 3), giving different results. Grammar designers go to considerable effort to eliminate ambiguity, often by introducing operator precedence rules directly into the grammar structure.
CFGs sit in the middle of the Chomsky hierarchy: strictly more powerful than regular languages (they can express { aⁿbⁿ }), but strictly less powerful than context-sensitive and Turing-complete grammars (they cannot express { aⁿbⁿcⁿ }). The decision problem for CFLs — membership testing — is solvable in polynomial time (CYK algorithm runs in O(n³)), which makes CFGs practical for real compilers. Every time a compiler parses your source code, it is essentially running a CFL recognition algorithm on a carefully designed CFG.