A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Peephole Optimization

Graduate · Depth 95 in the knowledge graph

11 topics build on this

528 prerequisites beneath it

Core Idea

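Peephole optimization examines small windows of code to replace inefficient instruction sequences with faster equivalents. For example, a load that immediately follows a store to the same address becomes a register move, and chains of consecutive jumps are collapsed into a single jump. The technique does not depend on the source language, although individual rules are usually written for a specific IR or target instruction set, and it typically serves as a final polish pass in code generation.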

Explainer

After a compiler generates code — whether intermediate representation or actual machine instructions — the result is often locally wasteful. Earlier compilation phases focus on correctness and handle one construct at a time, which means they produce sequences that are correct but clumsy when viewed together. Peephole optimization is the clean-up crew: it slides a small window (the "peephole," typically 2–5 instructions wide) across the generated code and applies pattern-matching rules to replace inefficient sequences with better ones. You already know about basic blocks from your prerequisite work — peephole optimization typically operates within a single basic block, making it a purely local transformation.

The classic example is eliminating a redundant load after a store. Suppose the code generator produces `STORE R1, [addr]` followed immediately by `LOAD R2, [addr]`. The peephole optimizer recognizes that the value just stored is still in R1, so it keeps the store and replaces the load with `MOV R2, R1`, eliminating an unnecessary memory read. (The store itself has to stay unless the optimizer can show the location is never read again.) Another common pattern is jump chaining: if instruction A jumps to label L1, and L1 contains nothing but a jump to L2, the optimizer rewrites A to jump directly to L2. Other patterns include deleting `x = x + 0` and `x = x * 1` for integer `x`, and strength-reducing the integer multiplication `x * 2` to `x << 1`.
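The two sequence-level patterns described above can be sketched in a few lines of Python. This is a minimal illustration, not a production pass: the tuple-based instruction format (`("STORE", reg, addr)`, `("LOAD", reg, addr)`, `("MOV", dst, src)`, `("JMP", label)`, `("LABEL", name)`) and both function names are assumptions made up for the example. The sketch keeps the store and rewrites only the load, on the assumption that the stored value may be read again later.

```python
# Hypothetical toy instruction format: plain tuples, opcode first.

def forward_store_to_load(code):
    """Replace a LOAD that directly follows a STORE to the same address."""
    out = []
    for instr in code:
        prev = out[-1] if out else None
        if (prev is not None
                and prev[0] == "STORE" and instr[0] == "LOAD"
                and prev[2] == instr[2]):
            src, dst = prev[1], instr[1]
            # The STORE stays in place; only the memory read is replaced.
            if src != dst:
                out.append(("MOV", dst, src))
            # If src == dst the value is already in the register: emit nothing.
        else:
            out.append(instr)
    return out


def thread_jumps(code):
    """Retarget jumps whose destination label is followed directly by a JMP."""
    # Record every label whose very next instruction is an unconditional jump.
    forwards = {}
    for i, instr in enumerate(code[:-1]):
        nxt = code[i + 1]
        if instr[0] == "LABEL" and nxt[0] == "JMP":
            forwards[instr[1]] = nxt[1]

    def resolve(label):
        # Follow the chain, stopping if it loops back on itself.
        seen = set()
        while label in forwards and label not in seen:
            seen.add(label)
            label = forwards[label]
        return label

    return [("JMP", resolve(i[1])) if i[0] == "JMP" else i for i in code]


example = [
    ("STORE", "R1", "x"),
    ("LOAD", "R2", "x"),
    ("JMP", "L1"),
    ("LABEL", "L1"),
    ("JMP", "L2"),
    ("LABEL", "L2"),
    ("RET",),
]
optimized = thread_jumps(forward_store_to_load(example))
```

After both passes, the load has become `("MOV", "R2", "R1")` and the first jump targets `L2` directly. A real optimizer would also have to check that the address is not volatile and that no jump lands between the store and the load.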

What makes peephole optimization elegant is its simplicity. Each rule is a small, self-contained pattern match: "if you see this sequence, replace it with that sequence." The rules don't need to understand the program's overall structure, data flow, or control flow — they just match local instruction patterns. This means peephole optimizers are easy to implement, easy to verify for correctness, and easy to extend with new rules. They compose well with other optimization passes too: running peephole optimization after other transformations often catches inefficiencies that those transformations introduced.
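One way to picture "if you see this, replace it with that" is a list of small rule functions, each of which either returns a replacement or declines. The sketch below assumes a hypothetical three-address format `(op, dst, lhs, rhs)` over integers; the rule names and the `apply_rules` driver are invented for illustration. Adding a rule means writing one more function and appending it to the list.

```python
# Each rule takes one instruction and returns a list of replacement
# instructions (possibly empty), or None when it does not apply.
# Instruction format is an assumption: (op, dst, lhs, rhs), integers only.

def drop_add_zero(ins):
    op, dst, lhs, rhs = ins
    if op == "add" and dst == lhs and rhs == 0:
        return []                      # x = x + 0 has no effect
    return None


def drop_mul_one(ins):
    op, dst, lhs, rhs = ins
    if op == "mul" and dst == lhs and rhs == 1:
        return []                      # x = x * 1 has no effect
    return None


def mul_two_to_shift(ins):
    op, dst, lhs, rhs = ins
    if op == "mul" and rhs == 2:
        return [("shl", dst, lhs, 1)]  # integer x * 2 becomes x << 1
    return None


RULES = [drop_add_zero, drop_mul_one, mul_two_to_shift]


def apply_rules(code, rules=RULES):
    """Apply the first matching rule to each instruction; keep the rest."""
    out = []
    for ins in code:
        for rule in rules:
            replacement = rule(ins)
            if replacement is not None:
                out.extend(replacement)
                break
        else:
            out.append(ins)            # no rule matched
    return out


program = [
    ("add", "x", "x", 0),
    ("mul", "y", "y", 1),
    ("mul", "z", "x", 2),
    ("add", "w", "x", 1),
]
result = apply_rules(program)
```

Here the window is a single instruction, which keeps each rule trivially checkable in isolation. Multi-instruction rules follow the same shape but take a slice of the code instead of one tuple.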

Despite its simplicity, peephole optimization can be surprisingly effective. It typically runs as one of the last passes in the compilation pipeline, after instruction selection and register allocation. Those earlier phases sometimes introduce awkward instruction sequences: a register allocator might insert a spill and reload that turns out to be unnecessary, or instruction selection might produce a two-instruction sequence where a single specialized instruction exists. The peephole pass catches these cases cheaply. In practice, compilers often run peephole optimization iteratively, since replacing one pattern can expose new opportunities. Deleting a no-op that sat between a store and a load of the same address, for instance, makes the two adjacent, and the next pass can then replace the load.
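The iterative behavior can be sketched as a loop that reruns every pass until the code stops changing. Everything here is a hypothetical toy: the tuple instruction format, the two passes, and the `max_rounds` safety cap are assumptions chosen for the example, and the no-op is modeled as `("ADD", r, r, 0)` on a machine where that instruction has no side effects.

```python
def forward_store_to_load(code):
    """Turn LOAD dst,[a] right after STORE src,[a] into MOV dst,src."""
    out = []
    for ins in code:
        if (out and out[-1][0] == "STORE" and ins[0] == "LOAD"
                and out[-1][2] == ins[2] and out[-1][1] != ins[1]):
            out.append(("MOV", ins[1], out[-1][1]))
        else:
            out.append(ins)  # includes the src == dst case, left alone here
    return out


def drop_add_zero(code):
    """Remove ("ADD", r, r, 0), assumed to leave all state unchanged."""
    return [i for i in code
            if not (i[0] == "ADD" and i[1] == i[2] and i[3] == 0)]


def optimize(code, passes=(forward_store_to_load, drop_add_zero), max_rounds=10):
    """Rerun all passes until a full round changes nothing."""
    for _ in range(max_rounds):
        new_code = code
        for run_pass in passes:
            new_code = run_pass(new_code)
        if new_code == code:   # fixed point reached
            break
        code = new_code
    return code


before = [
    ("STORE", "R1", "a"),
    ("ADD", "R3", "R3", 0),   # no-op separating the store from the load
    ("LOAD", "R2", "a"),
]
after = optimize(before)
```

In the first round the store and load are not adjacent, so only the no-op is removed; the second round then rewrites the load to `("MOV", "R2", "R1")`. The round cap is a defensive choice for the sketch, guarding against rule sets that could rewrite each other's output indefinitely.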

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Boolean Type and Truth Values → Comparison Operators and Boolean Tests → Logical Operators and Boolean Algebra → Boolean Algebra and Fundamental Laws → Logic Gates Fundamentals → Implementing Boolean Functions with Gates → Karnaugh Map Simplification → Combinational Circuit Design → Flip-Flops and Latches → Finite State Machines (FSMs) → Deterministic Finite Automata (DFA) → Nondeterministic Finite Automata (NFA) → Two-Way Finite Automata → NFA to DFA Conversion (Subset Construction) → DFA Properties and Minimization Algorithms → Regular Languages: Definition and Characterization → Context-Free Grammars (CFGs) → Context-Free Grammar Properties and Ambiguity → Parse Trees, Derivations, and Ambiguity in CFGs → Context-Free Grammars in Compiler Design → Abstract Syntax Trees (ASTs) → Symbol Tables and Scope Resolution → Semantic Analysis Phase → Intermediate Code Representation → Control Flow Graphs → Basic Block Analysis → Peephole Optimization

Longest path: 96 steps · 528 total prerequisite topics

Prerequisites (2)

Code Generation from IR (hard) · Basic Block Analysis (hard)

Leads To (2)

Code Optimization Fundamentals (soft) · Procedure Inlining Optimization (soft)