The pumping lemma states: if a language L is regular, then there exists a constant p such that any string z in L with |z| ≥ p can be decomposed as z = uvw where |uv| ≤ p, |v| > 0, and uvⁱw ∈ L for all i ≥ 0. Proof by contradiction using this lemma establishes that a language is not regular.
Work through proofs for standard non-regular languages (e.g., {aⁿbⁿ}). Understand the adversarial game: the pumping lemma chooses p, we choose z, the adversary chooses u, v, w.
You already know that regular languages are closed under union, intersection, and complement — powerful structural properties. But which languages are *not* regular? The pumping lemma gives you a tool for proving that a language lies outside the reach of any finite automaton. It works by exploiting a fundamental limitation: a DFA has finitely many states, so on a sufficiently long input it must revisit some state, creating a loop that can be "pumped" (repeated or removed).
Here is the formal statement: if L is regular, there exists a pumping length p such that any string z in L with |z| ≥ p can be split into three pieces z = uvw satisfying three conditions: (1) |uv| ≤ p, so the loop occurs early in the string, (2) |v| > 0, so the loop is non-empty, and (3) uvⁱw ∈ L for every i ≥ 0, meaning you can repeat the loop portion any number of times (including zero) and the result is still in L. The intuition is direct: the loop in the DFA's computation can be traversed any number of times without leaving the language.
To prove a language is *not* regular, you use the contrapositive: assume L is regular, then show that no matter what p is, you can find a string z in L of length at least p such that *every* valid decomposition z = uvw (obeying |uv| ≤ p and |v| > 0) fails — that is, some pumped string uvⁱw is not in L. This has the structure of an adversarial game. The "adversary" picks p and the decomposition; you pick z and the pumping count i. For example, to show L = {aⁿbⁿ | n ≥ 0} is not regular, choose z = aᵖbᵖ. Since |uv| ≤ p, the substring v consists entirely of a's. Pumping up (i = 2) gives a string with more a's than b's, which is not in L — contradiction.
A critical subtlety: the pumping lemma is a *necessary* condition for regularity, not a sufficient one. Passing the pumping lemma does not prove a language is regular — some non-regular languages happen to satisfy the pumping conditions. To prove regularity, you must construct an automaton or regular expression. The pumping lemma is a one-directional tool: it can only show that a language is *not* regular. Despite this limitation, it is the standard first technique for establishing non-regularity and a gateway to understanding the hierarchy of language classes that lies ahead.