Good compilers do not stop on syntax errors; they recover and attempt to parse the rest of the file. Recovery strategies include token deletion, insertion, replacement, and panic mode. Effective recovery requires careful synchronization point selection.
Implement error recovery in a parser and test with intentionally malformed files. Study how real compilers recover.
Perfect error recovery is possible (recovery is inherently heuristic). Simpler recovery is always worse (sometimes it is better for clarity).
A compiler that stops at the first syntax error is nearly useless in practice. A programmer with a 10,000-line file containing three typos should not have to fix one, recompile, fix the next, recompile again, and so on. Error recovery allows the parser to report the first error, skip past the damage, resynchronize with the input, and continue parsing to find additional errors in a single pass. The goal is not to guess what the programmer meant — it is to minimize the cascade of spurious errors that follow from a single mistake.
The simplest and most widely used strategy is panic mode recovery. When the parser detects an error, it discards input tokens until it finds a synchronization token — typically a semicolon, closing brace, or keyword that reliably marks the start of a new statement or declaration. The parser then resets its state to one that can accept that token and resumes normal parsing. From your knowledge of recursive descent and LALR parsing, you can see why this works: these synchronization points correspond to places where the grammar has well-defined entry points. A semicolon ends a statement, so the parser can safely begin looking for the next statement.
More sophisticated strategies attempt finer-grained recovery. Token insertion assumes a token was accidentally omitted and inserts it (for example, inserting a missing semicolon). Token deletion assumes an extra token was typed and skips it. Token replacement assumes one token was mistyped as another. These phrase-level repairs can produce better error messages — "expected `;` before `}`" is more helpful than "unexpected `}`" — but they risk cascading errors if the repair is wrong. A misguided insertion can push the parser into a state where everything that follows looks wrong, generating dozens of meaningless error messages from a single mistake.
The art of error recovery lies in choosing synchronization points and repair strategies that minimize cascading. Practical compilers often combine strategies: attempt a local repair first (insert or delete a single token), and if that fails, fall back to panic mode. Some parsers track an error count and suppress error messages for a few tokens after each recovery, since errors reported immediately after a recovery are likely spurious. The key insight is that error recovery is inherently heuristic — there is no algorithm that can always determine what the programmer intended. The measure of quality is pragmatic: does the compiler report the real errors and suppress the noise?