A compiler's code generator produces the instruction sequence: STORE R1, [addr] followed immediately by LOAD R2, [addr]. What does a peephole optimizer do with this, and why?
A. It removes both instructions because the value in R1 must still be needed later and the load is forward-looking
B. It replaces the LOAD with MOV R2, R1, because the value just stored to [addr] is already in R1, eliminating the redundant memory read
C. It reorders the instructions so the LOAD precedes the STORE to improve cache performance
D. It does nothing because peephole optimization only handles jump instructions, not memory operations
This is the classic redundant-load elimination pattern, also known as store-to-load forwarding. The peephole optimizer sees that immediately after storing R1 to [addr], the code loads from [addr] into R2. Since the value just written is still in R1, the load is unnecessary: R2 can simply receive the value directly from R1. The LOAD is replaced with MOV R2, R1, eliminating a memory read. The STORE generally has to stay, because later code may still read [addr]; it can be dropped as well only when the location is known to be dead afterward, such as a spill slot that is never read again. The rewrite also assumes [addr] is ordinary memory rather than a volatile or memory-mapped location. This pattern arises naturally when a compiler generates code for each construct independently: the store comes from one context (writing a variable) and the load comes from another (reading it for the next use), and only local inspection reveals the redundancy.
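The rule can be sketched in a few lines of Python. This is a minimal illustration, not a real compiler's implementation: the tuple encoding `(opcode, operand1, operand2)` and the function name `forward_store_to_load` are assumptions made for the example. The sketch takes the conservative route and keeps the STORE, since deleting it is only safe when nothing reads the address later.

```python
from typing import List, Tuple

# Hypothetical instruction encoding: (opcode, operand1, operand2).
# STORE R1, [addr] -> ("STORE", "R1", "[addr]")
# LOAD  R2, [addr] -> ("LOAD",  "R2", "[addr]")
# MOV   R2, R1     -> ("MOV",   "R2", "R1")
Instr = Tuple[str, str, str]

def forward_store_to_load(code: List[Instr]) -> List[Instr]:
    """Turn a LOAD that directly follows a STORE to the same address
    into a register move (or drop it if it reloads the same register)."""
    out: List[Instr] = []
    for ins in code:
        prev = out[-1] if out else None
        if (prev is not None
                and prev[0] == "STORE" and ins[0] == "LOAD"
                and prev[2] == ins[2]):            # same address operand
            src, dst = prev[1], ins[1]
            if src != dst:
                out.append(("MOV", dst, src))      # value is still in src
            # If src == dst, the register already holds the value: drop the LOAD.
            continue
        out.append(ins)
    return out

program = [("STORE", "R1", "[addr]"), ("LOAD", "R2", "[addr]")]
optimized = forward_store_to_load(program)
# optimized == [("STORE", "R1", "[addr]"), ("MOV", "R2", "R1")]
```

The sketch compares address operands as plain strings, which only works when both instructions spell the address identically; a real pass would compare base register and offset and would skip volatile accesses.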
Question 2 Multiple Choice
Why do compilers often run peephole optimization iteratively (multiple passes) rather than just once?
A. Each pass reduces instruction count, which shrinks the code and requires re-running the pass to handle the now-smaller windows correctly
B. Applying one peephole rule can bring two previously non-adjacent instructions into adjacency, exposing new pattern-match opportunities that the first pass could not have seen
C. Iterative passes compensate for missed patterns caused by the fixed window size being too small on the first pass
D. Peephole optimization is non-deterministic; multiple passes increase the probability of finding the globally optimal sequence
Peephole rules compose: applying one substitution can create new opportunities for other rules. For example, deleting a jump to the immediately following instruction (and its label, if nothing else references it) can place two formerly non-adjacent loads from the same address next to each other, which a redundant-load rule can then eliminate. The second pass sees patterns that the first pass could not, because the first pass changed the instruction arrangement. Running until no further changes occur (a fixpoint) ensures that every pattern the rule set can match has been applied. This composability is one of peephole optimization's practical strengths.
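The effect of iterating can be seen in a small sketch. Everything here is hypothetical: the tuple encoding, the two rules, and the helper names `one_pass` and `optimize` are assumptions chosen for illustration, and the target is assumed to have no condition flags, so that `ADD r, 0` is a true no-op. Removing the no-op in the first pass makes a store and a reload adjacent, which only the second pass can match.

```python
from typing import Callable, List, Optional, Tuple

Instr = Tuple[str, str, str]   # hypothetical (opcode, operand1, operand2)
# A rule looks at code starting at index i and returns
# (instructions consumed, replacement list), or None if it does not match.
Rule = Callable[[List[Instr], int], Optional[Tuple[int, List[Instr]]]]

def drop_reload_after_store(code: List[Instr], i: int):
    """STORE Rx,[a]; LOAD Rx,[a]  ->  STORE Rx,[a]"""
    if i + 1 < len(code):
        a, b = code[i], code[i + 1]
        if a[0] == "STORE" and b[0] == "LOAD" and a[1:] == b[1:]:
            return 2, [a]
    return None

def drop_add_zero(code: List[Instr], i: int):
    """ADD Rx, 0  ->  (nothing); assumes the ISA has no condition flags."""
    op, _, src = code[i]
    if op == "ADD" and src == "0":
        return 1, []
    return None

RULES: List[Rule] = [drop_reload_after_store, drop_add_zero]

def one_pass(code: List[Instr]) -> List[Instr]:
    """Scan left to right once, applying the first rule that matches."""
    out: List[Instr] = []
    i = 0
    while i < len(code):
        for rule in RULES:
            hit = rule(code, i)
            if hit is not None:
                consumed, replacement = hit
                out.extend(replacement)
                i += consumed
                break
        else:                       # no rule matched at position i
            out.append(code[i])
            i += 1
    return out

def optimize(code: List[Instr]) -> Tuple[List[Instr], int]:
    """Repeat one_pass until a pass changes nothing (a fixpoint)."""
    passes = 0
    while True:
        new_code = one_pass(code)
        passes += 1
        if new_code == code:
            return code, passes
        code = new_code

program = [("STORE", "R1", "[a]"), ("ADD", "R3", "0"), ("LOAD", "R1", "[a]")]
after_first = one_pass(program)      # the ADD is gone; STORE and LOAD now touch
final, passes = optimize(program)    # the second pass removes the reload
```

In this run the third pass makes no change, which is how the driver detects the fixpoint. Termination relies on every rule strictly shrinking or simplifying the code; a rule set containing two rules that undo each other would loop forever.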
Question 3 True / False
Peephole optimization is a purely local transformation: it can improve instruction sequences within a small window without needing to analyze the program's data flow, control flow, or overall structure.
T. True
F. False
Answer: True
This is what makes peephole optimization both simple and broadly applicable. Each rule is a self-contained pattern match over 2–5 adjacent instructions: 'if you see this sequence, replace it with that sequence.' No global program analysis is needed, no data-flow equations are solved, no call graphs are inspected. The optimizer does not need to understand what the program computes. This simplicity means the optimizer is easy to implement and verify, easy to extend with new rules, and easy to apply regardless of source language, target architecture, or which optimization phases ran earlier. It is the quintessential 'local polish' pass.
Question 4 True / False
Peephole optimization is designed to run early in the compilation pipeline, before register allocation, because it needs to work on high-level intermediate representations.
T. True
F. False
Answer: False
Peephole optimization typically runs late in the compilation pipeline, after instruction selection and register allocation. The reason is that earlier phases sometimes introduce awkward instruction sequences — a register allocator might insert a spill (store to memory) and reload (load back from memory) that turns out to be unnecessary, or instruction selection might produce a two-instruction idiom where a single specialized machine instruction exists. The peephole pass is positioned to catch exactly these late-stage inefficiencies. Running it earlier (on high-level IR) would miss the target-specific patterns that instruction selection and register allocation introduce.
Question 5 Short Answer
What properties of peephole optimization make it well-suited as a 'final polish' pass in a compiler, as opposed to an earlier, more global optimization phase?
Model answer: Peephole optimization is suited for a final pass because: (1) it is purely local: it only needs to examine 2–5 adjacent instructions, requiring no global analysis that might be expensive or invalidated by later passes; (2) it operates on the final instruction representation (machine code or near-machine IR), where it can apply target-specific rules like strength reduction (multiply → shift) or specialized instruction folding; (3) it catches inefficiencies introduced by earlier phases, such as register allocation spills and instruction selection idioms, that only become visible at the instruction level; (4) it is safe to apply after all other optimizations, since it only makes local substitutions, and each rule is written so that its replacement computes the same result as the sequence it matches.
Earlier optimization phases like inlining, loop optimizations, or register allocation make large structural changes to the program and benefit from global program information. Peephole optimization makes tiny local substitutions and needs no such information. This separation of concerns is intentional: global passes do the heavy lifting, and peephole optimization mops up the local inefficiencies they leave behind. Its simplicity also means it can be extended easily — adding a new target architecture pattern is just adding a new rule to the pattern-match table.
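A rule table of this kind might look like the following sketch. The instruction encoding `(opcode, register, immediate)`, the `RULES` list, and both rule functions are hypothetical names invented for the example; the window here is a single instruction to keep it short. The first rule is the multiply-to-shift strength reduction mentioned above, and the second shows that extending the optimizer amounts to appending one more function.

```python
from typing import Callable, List, Optional, Tuple

Instr = Tuple[str, str, int]   # hypothetical (opcode, register, immediate)
# A rule returns a replacement list, or None if it does not match.
Rule = Callable[[Instr], Optional[List[Instr]]]

def mul_pow2_to_shl(ins: Instr) -> Optional[List[Instr]]:
    """MUL r, 2**k  ->  SHL r, k  (strength reduction)."""
    op, reg, imm = ins
    if op == "MUL" and imm > 0 and imm & (imm - 1) == 0:   # power of two
        return [("SHL", reg, imm.bit_length() - 1)]
    return None

RULES: List[Rule] = [mul_pow2_to_shl]      # the pattern-match table

def peephole(code: List[Instr]) -> List[Instr]:
    """Apply the first matching rule to each instruction."""
    out: List[Instr] = []
    for ins in code:
        for rule in RULES:
            replacement = rule(ins)
            if replacement is not None:
                out.extend(replacement)
                break
        else:                               # no rule matched
            out.append(ins)
    return out

before = peephole([("MUL", "R1", 8), ("MUL", "R2", 6)])
# 8 is 2**3, so the first MUL becomes a shift; 6 is left alone.

# Extending the optimizer: write one more rule and add it to the table.
def drop_shift_by_zero(ins: Instr) -> Optional[List[Instr]]:
    """SHL r, 0  ->  (nothing); assumes shifts do not set flags."""
    op, _, imm = ins
    if op == "SHL" and imm == 0:
        return []
    return None

RULES.append(drop_shift_by_zero)
after = peephole([("SHL", "R3", 0), ("MUL", "R1", 8)])
```

Keeping rules as independent functions in a list means the driver never changes when a rule is added, which matches the "just add a rule" property described above.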