Questions: Query Execution Plans and EXPLAIN Analysis
5 questions to test your understanding
Score: 0 / 5
Question 1 Multiple Choice
EXPLAIN ANALYZE shows the optimizer estimated 10 rows for a table scan, but the actual count was 85,000. What is the most likely root cause and fix?
AThe query is missing an index; add one on the filtered column
BThe table statistics are stale; run ANALYZE to refresh them
CThe work_mem setting is too low; increase it to avoid disk spills
DThe join algorithm chosen is incorrect for this query size
A large discrepancy between estimated and actual rows is the signature of stale table statistics. The optimizer uses histograms of column value distributions to estimate cardinality; when those are out of date after a bulk load, estimates can be wildly wrong, causing the optimizer to choose suboptimal join orders and access methods. Running ANALYZE refreshes the statistics, often fixing the plan without any other changes. An index (option A) might also help, but the root cause is that the optimizer doesn't know how many rows actually exist.
Question 2 Multiple Choice
A developer wants to inspect whether a query will use an index without actually executing the query. Which command should they use?
AEXPLAIN ANALYZE — it shows the actual execution plan with real timing
BEXPLAIN — it shows the optimizer's plan without executing the query
CEXPLAIN VERBOSE — it shows extra detail including actual row counts
DEXPLAIN BUFFERS — it shows cache hit statistics for the query
EXPLAIN alone shows the optimizer's chosen plan — whether it uses a Seq Scan or Index Scan — without executing the query. EXPLAIN ANALYZE (option A) actually runs the query and returns real timing alongside estimates, which is valuable for diagnosing row count discrepancies, but it executes the query — undesirable for very slow queries or DML statements. EXPLAIN VERBOSE and BUFFERS are modifiers that still require ANALYZE to get actual rows.
Question 3 True / False
A Seq Scan in EXPLAIN output usually indicates a missing index and should be replaced with an Index Scan.
TTrue
FFalse
Answer: False
Seq Scans are often the correct choice. For small tables, a full scan is cheaper than the overhead of index navigation. For queries that return a large fraction of rows, sequential I/O is more efficient than random index reads. A Seq Scan only suggests a problem when there is a highly selective filter on a large table that should be indexed. EXPLAIN output must be read in context — the question is always 'is this plan appropriate?' not 'is there a Seq Scan?'
Question 4 True / False
EXPLAIN ANALYZE can safely be used to diagnose slow INSERT or DELETE statements without any side effects.
TTrue
FFalse
Answer: False
EXPLAIN ANALYZE actually executes the query, including DML statements. Running it on an INSERT or DELETE will insert or delete the rows. The standard workaround is to wrap the statement in a transaction and roll it back: BEGIN; EXPLAIN ANALYZE INSERT ...; ROLLBACK; Developers who don't know this have accidentally run destructive operations on production data while 'just checking the plan' — a critical practical point.
Question 5 Short Answer
In EXPLAIN output, what does a large discrepancy between estimated rows and actual rows tell you, and why does it matter for query performance?
Think about your answer, then reveal below.
Model answer: A large estimated-vs-actual row discrepancy means the optimizer's table statistics are stale or inaccurate. It matters because the optimizer uses row count estimates to choose join order, join algorithm, and access method — if estimates are wrong, it may choose a Nested Loop join expecting a tiny inner table but encounter millions of rows, causing catastrophically slow performance. Refreshing statistics with ANALYZE often corrects the plan automatically.
The estimated row count is the foundation of the cost model. Every cost calculation in the plan tree depends on how many rows each operation processes. A 10x or 100x estimation error compounds through the plan tree, causing systematically wrong decisions. This is why the row estimate discrepancy — not the absolute query time — is the first thing to diagnose when a plan looks wrong.