Questions — Table Statistics, Histograms, and Column Statistics

Question 1 Multiple Choice

A 'city' column in a 1,000,000-row table has 500 distinct values, and 'New York' appears in 40% of the rows. The optimizer has no histogram for the column and assumes a uniform distribution across distinct values. How will it estimate the row count for WHERE city = 'New York'?

A. 400,000 rows, because it recognizes 'New York' as the dominant value

B. 2,000 rows, from dividing total rows by the number of distinct values

C. 1,000,000 rows, because it defaults to full table scan estimates

D. 0 rows, because no statistics are available without an explicit ANALYZE

Question 2 Multiple Choice

For a highly skewed column where a few values account for most rows, an equi-depth histogram provides better selectivity estimates than an equi-width histogram. Why?

A. Equi-depth uses less memory, leaving room for more precise per-value statistics

B. Equi-depth adjusts bucket boundaries so popular ranges get narrower buckets with more precise estimates, concentrating precision where data is dense

C. Equi-depth captures the exact frequencies of the most common values, eliminating estimation error for those values

D. Equi-depth requires no maintenance after data changes, while equi-width must be rebuilt after every insert

Question 3 True / False

When a query plan suddenly degrades after a large bulk insert, stale statistics are one of the first things worth checking.

T. True

F. False

Question 4 True / False

Collecting exact column statistics in a database requires only a brief metadata lookup, because the system tracks distributions automatically without scanning the table.

T. True

F. False

Question 5 Short Answer

Why do databases use sampling rather than full-table scans to build statistics, and what is the key operational risk of this approach?

