A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Query Execution Plans and EXPLAIN Analysis

College · Depth 89 in the knowledge graph

491 prerequisites beneath it

Core Idea

The EXPLAIN statement displays the execution plan the optimizer chose, showing each operation (Seq Scan, Index Scan, Join) with its estimated cost and row count; EXPLAIN ANALYZE also runs the query and adds actual timing and row counts. Analyzing this output reveals whether the optimizer made good decisions and identifies bottlenecks such as full table scans or inefficient joins. Large discrepancies between estimated and actual row counts usually point to stale or inadequate statistics. Understanding plan interpretation is essential for query tuning.

Explainer

You already know that query optimizers consider multiple execution plans and pick the cheapest one based on cost estimates. The EXPLAIN statement lets you see the plan the optimizer actually chose; it is your window into the database engine's decision-making. In PostgreSQL, prefixing a query with `EXPLAIN` prints a tree of operations without running the query. Using `EXPLAIN ANALYZE` instead actually executes the query and reports real timing and row counts alongside the estimates, so you can compare what the optimizer predicted with what actually happened.

The output is a plan tree read from the inside out: the most deeply indented nodes run first and feed their rows to the parent node above them. Each node represents an operation: a Seq Scan (reading every row in a table), an Index Scan (jumping directly to matching rows via an index), a Nested Loop or Hash Join (combining two tables), or a Sort (ordering results). Each node shows an estimated cost as a startup..total pair (in arbitrary units combining I/O and CPU), the estimated number of rows it expects to produce, and the average width of each row in bytes. When you run EXPLAIN ANALYZE, you also see actual time (in milliseconds), actual rows, and loops, the number of times the node was executed; when loops is greater than 1, the actual time and rows are per-execution averages. The gap between estimated and actual rows is often the single most diagnostic number in the output.
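To make the fields on a node line concrete, here is a minimal Python sketch that pulls them out of one line of PostgreSQL's text-format output. The sample line, the table name `orders`, and every number in it are invented for illustration, and `parse_node` is a hypothetical helper rather than part of any library.

```python
import re

# Hypothetical node line in PostgreSQL's text format (EXPLAIN ANALYZE).
# The table name and all numbers are invented for illustration.
LINE = ("Seq Scan on orders  (cost=0.00..1834.00 rows=10 width=97) "
        "(actual time=0.015..12.412 rows=100000 loops=1)")

NUM = r"\d+(?:\.\d+)?"  # integer or decimal number

NODE_RE = re.compile(
    rf"^\s*(?:->\s*)?(?P<node>.+?)\s+"
    rf"\(cost=(?P<startup_cost>{NUM})\.\.(?P<total_cost>{NUM}) "
    rf"rows=(?P<est_rows>{NUM}) width=(?P<width>{NUM})\)"
    # The "actual" group is only present when the plan came from EXPLAIN ANALYZE.
    rf"(?: \(actual time=(?P<startup_ms>{NUM})\.\.(?P<total_ms>{NUM}) "
    rf"rows=(?P<actual_rows>{NUM}) loops=(?P<loops>{NUM})\))?"
)


def parse_node(line):
    """Return the fields of one plan node line as a dict, or None if no match."""
    match = NODE_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    parsed = {"node": fields.pop("node")}
    for key, value in fields.items():
        if value is not None:  # skip the "actual" fields for plain EXPLAIN
            parsed[key] = float(value)
    return parsed


if __name__ == "__main__":
    for key, value in parse_node(LINE).items():
        print(f"{key:>13}: {value}")
```

For anything beyond a quick look, the structured output of `EXPLAIN (FORMAT JSON)` is a sturdier thing to parse than the text format.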

When estimated rows are close to actual rows, the optimizer is making informed decisions and the plan is likely reasonable. When they diverge sharply (the optimizer expected 10 rows but got 100,000), the plan was chosen on bad information and is likely suboptimal. This often happens because the optimizer relies on table statistics (such as histograms of column value distributions), and those statistics can go stale after bulk inserts or deletes. Running `ANALYZE` on the table refreshes them. A common pattern: a slow query shows a Nested Loop join where the optimizer expected a tiny inner table, but the actual row count is enormous. A Hash Join or Merge Join would usually be far better, and refreshing statistics often causes the optimizer to make that switch on its own.
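The estimated-versus-actual comparison can be reduced to a single ratio. The sketch below is one way to compute it; the function names and the 10x threshold are assumptions chosen for illustration, not values PostgreSQL defines.

```python
def misestimate_factor(estimated_rows, actual_rows):
    """Return how many times off the row estimate was (always >= 1.0)."""
    # Clamp both sides to 1 so a zero-row estimate or result
    # does not cause a division by zero.
    est = max(estimated_rows, 1)
    act = max(actual_rows, 1)
    return max(est, act) / min(est, act)


def needs_attention(estimated_rows, actual_rows, threshold=10.0):
    """Flag a node whose estimate is off by at least `threshold` times.

    The default threshold of 10 is an arbitrary illustrative choice.
    """
    return misestimate_factor(estimated_rows, actual_rows) >= threshold


if __name__ == "__main__":
    # The example from the text: 10 rows expected, 100,000 returned.
    print(misestimate_factor(10, 100_000))
    print(needs_attention(10, 100_000))
    print(needs_attention(95, 100))
```

Taking the larger value over the smaller makes the factor symmetric, so an overestimate and an underestimate of the same size score the same.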

Reading EXPLAIN output is a skill built through repetition. Start with simple single-table queries: is it doing a Seq Scan when an index exists? That might mean the table is small enough, or the query returns a large enough share of it, that a sequential scan is genuinely cheaper, or it might mean the WHERE clause doesn't match any index. Then move to joins: check the join algorithm (Nested Loop is fine for small inner tables, Hash Join for larger ones, Merge Join for pre-sorted data) and verify the join order makes sense. Finally, look for sort operations that spill to disk: a `Sort Method: external merge` line means the sort exceeded `work_mem` and wrote temporary files to disk, which is typically much slower than sorting in memory.
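Spotting a spilled sort can be automated with a simple text scan. The sketch below looks for `Sort Method` lines in text-format EXPLAIN ANALYZE output and reports those that used disk. The sample plan, its table and column names, and its numbers are invented, and `sort_spills` is a hypothetical helper.

```python
import re

# Hypothetical EXPLAIN ANALYZE output; names and numbers are invented.
SAMPLE_PLAN = """\
Sort  (cost=13893.82..14143.82 rows=100000 width=97) (actual time=180.412..231.902 rows=100000 loops=1)
  Sort Key: created_at
  Sort Method: external merge  Disk: 10240kB
  ->  Seq Scan on orders  (cost=0.00..1834.00 rows=100000 width=97) (actual time=0.010..9.120 rows=100000 loops=1)
"""

# Matches lines such as "Sort Method: quicksort  Memory: 25kB"
# and "Sort Method: external merge  Disk: 10240kB".
SORT_RE = re.compile(
    r"Sort Method: (?P<method>.+?)\s+(?P<where>Memory|Disk): (?P<kb>\d+)kB"
)


def sort_spills(plan_text):
    """Return (method, kilobytes) for every sort that spilled to disk."""
    spills = []
    for match in SORT_RE.finditer(plan_text):
        if match.group("where") == "Disk":
            spills.append((match.group("method"), int(match.group("kb"))))
    return spills


if __name__ == "__main__":
    for method, kb in sort_spills(SAMPLE_PLAN):
        print(f"{method} spilled {kb} kB to disk")
```

The reported disk size gives a rough idea of how far `work_mem` would need to grow for the sort to stay in memory, though an in-memory sort generally needs more space than its on-disk form.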

The practical workflow is: identify a slow query, run EXPLAIN ANALYZE, find the node with the highest actual time or the biggest estimated-vs-actual row mismatch, then address that node, whether by adding an index, rewriting the query, updating statistics, or increasing work_mem. Plain EXPLAIN does not execute the query or change anything; it only reveals what the database plans to do. EXPLAIN ANALYZE does execute the statement, so wrap an INSERT, UPDATE, or DELETE in a transaction and roll it back if you do not want its effects kept. That visibility is what transforms query tuning from guesswork into engineering.
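The "find the node to fix" step can be sketched against the structured output of `EXPLAIN (ANALYZE, FORMAT JSON)`. The plan below is a hand-written, trimmed-down example with invented tables and numbers, and the helper names are hypothetical. Because a parent's reported time includes its children, the sketch subtracts child time to approximate each node's own time; that approximation ignores complications such as parallel workers.

```python
import json

# Hand-written plan in the shape of EXPLAIN (ANALYZE, FORMAT JSON) output.
# Tables and numbers are invented; real output has many more keys.
PLAN_JSON = """
[{"Plan": {
    "Node Type": "Nested Loop",
    "Plan Rows": 10, "Actual Rows": 100000,
    "Actual Total Time": 950.0, "Actual Loops": 1,
    "Plans": [
      {"Node Type": "Seq Scan", "Relation Name": "customers",
       "Plan Rows": 10, "Actual Rows": 1000,
       "Actual Total Time": 5.0, "Actual Loops": 1},
      {"Node Type": "Index Scan", "Relation Name": "orders",
       "Plan Rows": 1, "Actual Rows": 100,
       "Actual Total Time": 0.9, "Actual Loops": 1000}
    ]}}]
"""


def inclusive_ms(node):
    """Time in this node and its children, summed over all loops."""
    # "Actual Total Time" is a per-loop average, so multiply by loops.
    return node["Actual Total Time"] * node["Actual Loops"]


def self_ms(node):
    """Approximate time spent in this node alone, excluding children."""
    children = sum(inclusive_ms(child) for child in node.get("Plans", []))
    return inclusive_ms(node) - children


def walk(node):
    """Yield a node and all of its descendants."""
    yield node
    for child in node.get("Plans", []):
        yield from walk(child)


def slowest_node(plan_json):
    """Return the node with the largest approximate self time."""
    root = json.loads(plan_json)[0]["Plan"]
    return max(walk(root), key=self_ms)


if __name__ == "__main__":
    node = slowest_node(PLAN_JSON)
    print(node["Node Type"], node.get("Relation Name", ""), round(self_ms(node), 1))
```

In this invented plan the inner Index Scan runs 1,000 times, so its small per-loop time adds up to most of the query. That is the Nested Loop pattern described earlier.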

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Introduction to Exponents → Order of Operations → Integer Order of Operations → Variable Expressions → The Distributive Property → Variables and Expressions Review → Introduction to Polynomials → Adding and Subtracting Polynomials → Multiplying Polynomials → Factorial → Permutations → Combinations → Counting Principles: Addition and Multiplication Rules → Introduction to Graph Theory → Propositional Logic Foundations → Logical Equivalences → 
Boolean Algebra → Boolean Type and Truth Values → Comparison Operators and Boolean Tests → Logical Operators and Boolean Algebra → Conditional Statements → Defining and Calling Functions → Functions: Decomposing Problems → Function Parameters and Argument Passing → Return Values → Variable Scope → Introduction to Classes → Objects and Instances → Methods and Attributes → Algorithm Design Basics → Asymptotic Notation: Big-O, Big-Omega, Big-Theta → Big-O Notation and Complexity Analysis → Time and Space Complexity → Binary Search → Binary Search Trees → B-Tree Indexes → Query Optimization → Query Execution Plans and EXPLAIN Analysis

Longest path: 90 steps · 491 total prerequisite topics

Prerequisites (1)

Query Optimization (hard)

Leads To (0)

No topics depend on this one yet.