Questions: Query Cardinality and Selectivity Estimation

5 questions to test your understanding

Score: 0 / 5
Question 1 Multiple Choice

A query optimizer estimates that WHERE age > 30 AND city = 'Denver' will return 3,000 rows (using independence: selectivity 0.6 × 0.05 on a 100,000-row table). The actual result is 200 rows because age and city are correlated. What plan failure is most likely?

AThe optimizer will generate a syntax error because correlated columns require special syntax
BThe optimizer will skip the city filter entirely to reduce computation
CThe optimizer will choose a join algorithm and scan strategy suited for 3,000 rows — likely a hash join or full scan — when a nested-loop join on the actual 200-row result would be far faster
DThe optimizer will correctly detect the correlation and rewrite the query
Question 2 Multiple Choice

A column has 200 distinct values uniformly distributed across a 1,000,000-row table. What selectivity does the optimizer assign to an equality condition on this column?

A0.5 — equality conditions match about half the rows on average
B0.005 — selectivity equals 1/NDV (number of distinct values) = 1/200
C1.0 — every row matches some value, so selectivity is 1
DUnknown — selectivity cannot be calculated without a histogram
Question 3 True / False

When estimating the combined selectivity of multiple filter conditions (e.g., WHERE a = 1 AND b = 2), databases multiply the individual selectivities together. This is generally accurate.

TTrue
FFalse
Question 4 True / False

A single large cardinality estimation error at an early stage of a query execution plan can cause dramatically poor performance for the entire query, including join strategy and index selection decisions downstream.

TTrue
FFalse
Question 5 Short Answer

When diagnosing an unexpectedly slow SQL query, why is comparing 'estimated rows' to 'actual rows' in the query execution plan often the fastest path to finding the root cause?

Think about your answer, then reveal below.