A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Instruction Selection Techniques

Graduate · Depth 105 in the knowledge graph

551 prerequisites beneath it

Core Idea

Instruction selection translates intermediate code into target machine instructions. One IR operation may correspond to many possible machine instructions, each with different costs and constraints. Pattern matching, often combined with dynamic programming, finds low-cost instruction sequences.

How It's Best Learned

Implement pattern-based instruction selection for a real ISA subset. Write patterns as tree rules and test on realistic code.

Explainer

After the compiler's front end and middle end have parsed, type-checked, and optimized the program, the code generation phase must translate the compiler's intermediate representation into actual machine instructions. You already know from studying code generation that this involves mapping IR operations to target architecture instructions. But this mapping is not one-to-one: a single IR operation like "add a variable to a constant" might be implementable by several different machine instructions, each with different costs, register constraints, and addressing modes. Instruction selection is the process of choosing which machine instructions to emit, and choosing well can significantly affect the speed and size of the generated code.

The simplest approach is macro expansion: each IR instruction maps to a fixed template of machine instructions. An IR add becomes a machine ADD, an IR load becomes a machine LOAD, and so on. This is easy to implement but tends to produce poor code, because each IR instruction is expanded in isolation and the selector cannot exploit complex instructions that combine multiple operations. For example, some architectures have a "load-and-add" form that reads a value from memory and adds it to a register in one instruction; on x86, ADD can take a memory operand as its source. Macro expansion would emit a separate load followed by a separate add, missing the combined instruction, which is usually more compact and often faster.
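To make the limitation concrete, here is a minimal sketch of macro expansion in Python. The IR shape (`IRInstr`), the template table, and the assembly syntax are all hypothetical and chosen only for illustration; they do not correspond to any particular compiler or ISA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRInstr:
    op: str       # IR opcode, e.g. "load" or "add"
    dst: str      # destination temporary
    srcs: tuple   # source operands


# Hypothetical templates: exactly one fixed machine sequence per IR opcode.
TEMPLATES = {
    "load": "LOAD {dst}, [{s0}]",
    "add": "ADD {dst}, {s0}, {s1}",
}


def macro_expand(ir):
    """Expand each IR instruction in isolation, with no look at its neighbours."""
    out = []
    for ins in ir:
        fields = {f"s{i}": s for i, s in enumerate(ins.srcs)}
        out.append(TEMPLATES[ins.op].format(dst=ins.dst, **fields))
    return out


# t1 = load p ; t2 = r0 + t1
ir = [
    IRInstr("load", "t1", ("p",)),
    IRInstr("add", "t2", ("r0", "t1")),
]
asm = macro_expand(ir)
for line in asm:
    print(line)
```

Because the expander never looks at more than one IR instruction at a time, it always emits two machine instructions here, even on a target where a single add with a memory operand would do.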

Tree pattern matching is the standard technique for better instruction selection. The compiler represents each IR expression as a tree: for example, an addition node with two children, one of which might be a memory load. Machine instructions are described as tree patterns: each pattern covers a subtree of the IR and specifies the machine instruction that implements it. A pattern for "load-and-add" covers an add node whose right child is a load node. The instruction selector looks for a set of non-overlapping patterns that tiles the entire IR tree with minimum total cost. This is essentially a covering problem: which combination of patterns covers every node in the tree at the lowest cost?
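The idea of a pattern covering part of an IR tree can be sketched as follows. The `Node` type, the `REG` wildcard leaf, and the three patterns are assumptions made for this example; real selectors use richer pattern languages with operand types and register classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    op: str
    kids: tuple = ()


# Pattern leaf (hypothetical convention): matches any subtree whose result
# will be available in a register.
REG = Node("reg")


def match(pat, node):
    """Return the IR subtrees the pattern leaves uncovered, or None on mismatch."""
    if pat.op == "reg":
        return [node]
    if pat.op != node.op or len(pat.kids) != len(node.kids):
        return None
    uncovered = []
    for p, n in zip(pat.kids, node.kids):
        sub = match(p, n)
        if sub is None:
            return None
        uncovered.extend(sub)
    return uncovered


# Three illustrative patterns, written as trees.
ADD = Node("add", (REG, REG))
LOAD = Node("load", (REG,))
ADD_MEM = Node("add", (REG, Node("load", (REG,))))  # "load-and-add"

# IR tree for: a + load(addr), where "temp" marks a value already in a register.
ir = Node("add", (Node("temp"), Node("load", (Node("temp"),))))

covered_by_add_mem = match(ADD_MEM, ir)  # covers add and load; two temps remain
covered_by_add = match(ADD, ir)          # covers only add; the load subtree remains
covered_by_load = match(LOAD, ir)        # None: the root of ir is not a load
```

Each successful match returns the subtrees that still need their own tiles, which is exactly the information a tiling algorithm needs in order to recurse.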

For tree-shaped IR, dynamic programming solves this optimally. The algorithm works bottom-up: at each node, it considers every pattern whose root matches that node, computes the cost as the pattern's own cost plus the optimal costs of the subtrees hanging below the pattern's leaves (the parts the pattern does not cover), and selects the minimum. For a fixed pattern set, this produces an optimal tiling in time linear in the tree size. When the IR is a DAG (directed acyclic graph) rather than a tree, because common subexpressions share nodes, finding an optimal selection becomes NP-hard in general, but heuristics such as decomposing the DAG into trees or using greedy selection work well in practice. The quality of instruction selection also depends heavily on having a comprehensive set of patterns that exploit the target architecture's instruction set, which is why compiler backends for complex architectures like x86 contain thousands of selection rules.
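The dynamic programming step for trees can be sketched in a few lines. This is a memoized recursive version, which computes the same per-node minimum as the bottom-up formulation. The node type, the pattern set, and especially the costs are hypothetical values picked for illustration, not measurements from any real machine.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Node:
    op: str
    kids: tuple = ()


REG = Node("reg")  # pattern leaf: any subtree whose value ends up in a register

# (name, pattern tree, cost). Costs are made up for this example.
PATTERNS = [
    ("TEMP", Node("temp"), 0),  # value already in a register, emits nothing
    ("LOAD", Node("load", (REG,)), 3),
    ("ADD", Node("add", (REG, REG)), 1),
    ("ADD_MEM", Node("add", (REG, Node("load", (REG,)))), 3),
]


def match(pat, node):
    """Return the subtrees left uncovered by the pattern, or None on mismatch."""
    if pat.op == "reg":
        return [node]
    if pat.op != node.op or len(pat.kids) != len(node.kids):
        return None
    uncovered = []
    for p, n in zip(pat.kids, node.kids):
        sub = match(p, n)
        if sub is None:
            return None
        uncovered.extend(sub)
    return uncovered


@lru_cache(maxsize=None)
def best(node):
    """Return (minimum cost, instruction names in emission order) for node."""
    candidates = []
    for name, tree, cost in PATTERNS:
        uncovered = match(tree, node)
        if uncovered is None:
            continue
        total, code = cost, ()
        for sub in uncovered:
            sub_cost, sub_code = best(sub)  # optimal tiling of each uncovered subtree
            total += sub_cost
            code += sub_code
        # Zero-cost patterns stand for "nothing to emit".
        candidates.append((total, code + ((name,) if cost else ())))
    if not candidates:
        raise ValueError(f"no pattern matches node {node.op!r}")
    return min(candidates)


temp = Node("temp")
right_load = Node("add", (temp, Node("load", (temp,))))
left_load = Node("add", (Node("load", (temp,)), temp))

print(best(right_load))  # (3, ('ADD_MEM',))
print(best(left_load))   # (4, ('LOAD', 'ADD'))
```

With these costs the combined pattern wins when the load is the right child, and the selector falls back to a separate LOAD and ADD when it is the left child, since the sketch has no pattern (or commutativity rule) for that shape.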

Practice Questions (5)

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Boolean Type and Truth Values → Comparison Operators and Boolean Tests → Logical Operators and Boolean Algebra → Boolean Algebra and Fundamental Laws → Logic Gates Fundamentals → Implementing Boolean Functions with Gates → Karnaugh Map Simplification → Combinational Circuit Design → Flip-Flops and Latches → Finite State Machines (FSMs) → Deterministic Finite Automata (DFA) → Nondeterministic Finite Automata (NFA) → Two-Way Finite Automata → NFA to DFA Conversion (Subset Construction) → DFA Properties and Minimization Algorithms → Regular Languages: Definition and Characterization → Context-Free Grammars (CFGs) → Context-Free Grammar Properties and Ambiguity → Parse Trees, Derivations, and Ambiguity in CFGs → Context-Free Grammars in Compiler Design → Abstract Syntax Trees (ASTs) → Symbol Tables and Scope Resolution → Semantic Analysis Phase → Intermediate Code Representation → Control Flow Graphs → Fixpoint Computation and Iteration → Dataflow Analysis → Reaching Definitions Analysis → Common Subexpression Elimination (CSE) → Dead Code Elimination → Code Optimization Fundamentals → Vectorization and SIMD Code Generation → Loop Invariant Code Motion (LICM) → Loop Unrolling → Loop Detection and Analysis → Array Subscript Optimization → Instruction Selection Techniques

Longest path: 106 steps · 551 total prerequisite topics

Prerequisites (3)

Code Generation from IR (hard) · Procedure Inlining Optimization (soft) · Array Subscript Optimization (soft)

Leads To (0)

No topics depend on this one yet.