A researcher wants to estimate average daily screen time for teenagers nationally. She posts a survey link on social media and gets 50,000 responses. A colleague runs a simple random sample of 400 teenagers from a national registry. Whose result should you trust more?
A. The 50,000-response survey: larger samples are always more accurate
B. The 400-person random sample: randomization, not size, determines validity
C. They are equally valid: both capture real responses from real teenagers
D. The 50,000 survey: self-selection introduces healthy diversity of perspectives
Answer: B
The 50,000-response survey is a voluntary response sample: only people who saw the link and were motivated enough to respond are included, which likely over-represents heavy social media users and therefore teenagers with high screen time. Collecting more responses does not reduce this bias; it only yields a more confident wrong answer. The 400-person SRS gives every teenager in the registry an equal chance of selection, so its estimate is unbiased and its sampling error can be quantified. This is the core lesson of the 1936 Literary Digest poll, which polled millions yet wrongly predicted the election outcome.
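The comparison can be made concrete with a small simulation. This is a minimal sketch, not a model of real survey data: the population distribution, the response weights, and the function name `simulate` are all invented for illustration, and the voluntary sample is drawn with replacement as a simplification.

```python
import random
import statistics


def simulate(seed=0, population_size=200_000):
    """Compare a large self-selected sample with a small SRS (hypothetical data)."""
    rng = random.Random(seed)

    # Assumed population: daily screen time in hours, clipped to [0, 16].
    population = [
        min(16.0, max(0.0, rng.gauss(5.0, 2.0))) for _ in range(population_size)
    ]
    true_mean = statistics.fmean(population)

    # Assumed response mechanism: the chance of answering the social media
    # survey grows with screen time. Drawn with replacement for simplicity.
    weights = [0.05 + hours / 16.0 for hours in population]
    voluntary = rng.choices(population, weights=weights, k=50_000)

    # Simple random sample: every teenager is equally likely to be chosen.
    srs = rng.sample(population, 400)

    return true_mean, statistics.fmean(voluntary), statistics.fmean(srs)


if __name__ == "__main__":
    true_mean, voluntary_mean, srs_mean = simulate()
    print(f"true mean:            {true_mean:.2f}")
    print(f"voluntary (n=50,000): {voluntary_mean:.2f}")
    print(f"SRS (n=400):          {srs_mean:.2f}")
```

Under these assumptions the voluntary sample overshoots the true mean by well over half an hour regardless of its size, while the SRS typically lands within a few tenths of an hour.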
Question 2 Multiple Choice
A researcher studying income by region divides the US into four geographic quadrants and draws a separate random sample from each. A second researcher randomly selects 50 city blocks nationwide and surveys every household in each selected block. Which methods are these, respectively?
A. Stratified sampling; cluster sampling
B. Cluster sampling; stratified sampling
C. Systematic sampling; simple random sampling
D. Stratified sampling; systematic sampling
Answer: A
Stratified sampling divides the population into mutually exclusive subgroups (strata) and draws a separate random sample from each; here, the strata are the four geographic quadrants. Cluster sampling divides the population into clusters, randomly selects entire clusters, and studies every individual within them; here, the clusters are the 50 city blocks, with all households surveyed in each selected block. The key distinction: in stratified sampling you sample *within* every stratum; in cluster sampling you select entire clusters and skip all others.
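The two designs differ only in where the randomness is applied, which a short sketch can show. The sampling frame below (region codes, block names, household IDs) and the helper names `stratified_sample` and `cluster_sample` are hypothetical.

```python
import random


def stratified_sample(strata, per_stratum, rng):
    """Draw a separate simple random sample from every stratum."""
    return {name: rng.sample(units, per_stratum) for name, units in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep every unit inside each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


rng = random.Random(42)

# Hypothetical frame: households grouped by region (strata) and by block (clusters).
regions = {r: [f"{r}-hh{i}" for i in range(1000)] for r in ("NE", "SE", "NW", "SW")}
blocks = {f"block{b}": [f"block{b}-hh{i}" for i in range(20)] for b in range(500)}

strat = stratified_sample(regions, per_stratum=25, rng=rng)  # some units from every region
clust = cluster_sample(blocks, n_clusters=50, rng=rng)       # all units from 50 blocks
```

Every region appears in `strat` with only part of its households, whereas `clust` contains 50 of the 500 blocks, each with all of its households.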
Question 3 True / False
A sufficiently large convenience sample will eventually eliminate sampling bias.
T. True
F. False
Answer: False
Bias from non-probability sampling cannot be corrected by increasing sample size. A larger convenience or voluntary-response sample just produces a more statistically precise estimate of the wrong quantity. The mechanism of bias (systematic over- or under-representation of certain groups) is unaffected by n. Only a probability-based sampling method, in which every member of the population has a known, nonzero chance of selection, removes this selection bias; nonresponse and measurement error can still bias a probability sample.
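One way to see why n does not help is the standard decomposition of mean squared error into variance and squared bias. The sketch below assumes, for illustration only, that the responses behave like independent draws from a self-selected subgroup with mean \mu_S and variance \sigma_S^2, while \mu is the true population mean and \hat{\mu}_n is the sample mean of n responses; none of these symbols appear in the original.

```latex
\mathrm{MSE}(\hat{\mu}_n)
  = \mathbb{E}\big[(\hat{\mu}_n - \mu)^2\big]
  = \underbrace{\frac{\sigma_S^2}{n}}_{\text{variance}}
  + \underbrace{(\mu_S - \mu)^2}_{\text{bias}^2}
  \;\xrightarrow[n \to \infty]{}\; (\mu_S - \mu)^2
```

The variance term shrinks as n grows, but the bias term does not depend on n, so the error never falls below the squared bias.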
Question 4 True / False
Simple random sampling is the theoretical gold standard for inference because every individual in the population has an equal probability of being selected.
T. True
F. False
Answer: True
This is what makes SRS the foundation of sampling theory. Equal probability of selection (more precisely, equal probability for every possible sample of size n) guarantees that standard estimators such as the sample mean and sample proportion are unbiased: averaged over all possible samples, they equal the corresponding population parameter. This does not hold for every statistic; the sample maximum, for example, tends to fall below the population maximum. The other probability sampling methods (stratified, cluster, systematic) give up some of this simplicity for practical benefits such as reduced variance or lower cost.
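Unbiasedness under SRS can be checked exactly on a toy population by listing every possible sample. The five population values below are made up for illustration; exact fractions are used so no rounding is involved.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 3, 5, 8, 12]  # hypothetical values
n = 3

# Under SRS, each of the C(5, 3) = 10 possible samples is equally likely.
samples = list(combinations(population, n))

pop_mean = Fraction(sum(population), len(population))
mean_of_sample_means = sum(Fraction(sum(s), n) for s in samples) / len(samples)

pop_max = max(population)
mean_of_sample_maxes = Fraction(sum(max(s) for s in samples), len(samples))

print(pop_mean, mean_of_sample_means)  # 6 6
print(pop_max, mean_of_sample_maxes)   # 12 101/10
```

The average of the sample means equals the population mean exactly. The same enumeration shows that unbiasedness belongs to particular estimators such as the mean: the sample maximum averages 10.1, below the population maximum of 12.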
Question 5 Short Answer
Why is a large biased sample potentially worse than a small random one — not just less accurate, but actively worse?
Model answer: A large biased sample produces a precise estimate of the wrong quantity, giving false confidence in an incorrect conclusion. The statistical precision (narrow confidence interval) makes the wrong answer look reliable, so decision-makers are more likely to act on it. A small random sample is honest about its uncertainty — its wide confidence interval signals that we don't know much. The biased sample is misleading in a way the small random sample is not.
Precision and accuracy are distinct properties. Precision refers to how tightly clustered repeated estimates are; accuracy refers to whether they are centered on the truth. A large biased sample can be highly precise (low variance because n is large) but systematically inaccurate. The danger is that standard error calculations assume random sampling; applied to a biased sample, they give artificially narrow intervals that do not account for the bias at all. You end up confidently wrong rather than usefully uncertain.
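The effect on confidence intervals can be sketched by applying the textbook interval formula to a self-selected sample of increasing size. The population, the selection rule (only values above 4 hours ever respond), and the helper `naive_ci` are assumptions made for illustration.

```python
import math
import random
import statistics


def naive_ci(sample, z=1.96):
    """95% interval from mean +/- z * s / sqrt(n).

    The formula is only justified for a random sample; here it is
    deliberately applied to a biased one.
    """
    mean = statistics.fmean(sample)
    half_width = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - half_width, mean + half_width


rng = random.Random(1)

# Hypothetical population of daily screen-time values (hours).
population = [rng.gauss(5.0, 2.0) for _ in range(100_000)]
true_mean = statistics.fmean(population)

# Assumed selection mechanism: only heavier users ever respond.
responders = [x for x in population if x > 4.0]

results = {}
for n in (100, 1_000, 10_000):
    low, high = naive_ci(rng.sample(responders, n))
    results[n] = (low, high, low <= true_mean <= high)
    print(f"n={n:>6}: [{low:.2f}, {high:.2f}]  covers truth: {results[n][2]}")
```

As n grows the interval narrows around the responders' mean rather than the population mean, so it looks more trustworthy while continuing to miss the true value.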