A polling company calls 100,000 people using phone book listings to predict an election. A competitor polls 1,000 people selected by random digit dialing. Which poll is likely more accurate?
A. The 100,000-person poll, because larger samples always produce more accurate estimates
B. The 1,000-person poll, because random sampling eliminates the systematic bias that a phone-book list introduces
C. They are equally accurate, because sample size is the only thing that matters for accuracy
D. The 100,000-person poll, because once a sample is large enough, any sampling method works
Answer: B
This mirrors the famous 1936 Literary Digest poll, which collected about 2.4 million responses yet still predicted the wrong winner because its mailing lists, drawn largely from telephone directories and automobile registrations, over-represented wealthier, Republican-leaning households. A biased sampling frame produces systematically wrong estimates no matter how large the sample; you are just measuring the bias more precisely. Simple random sampling gives every member of the population an equal chance of selection, so the sample's composition tends to mirror the population's. Size amplifies the quality of the method; it cannot fix a flawed one.
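To see the effect in numbers, here is a minimal simulation sketch. Everything in it is a made-up assumption for illustration: the population size, the 30% "listed" share, and the 40% versus 60% support rates are not taken from any real poll, and `simulate_polls` is a hypothetical helper name. By construction the true support is about 0.54, while the listed subgroup sits near 0.40.

```python
import random

def simulate_polls(seed=0):
    """Compare a large poll from a biased frame with a small random poll."""
    rng = random.Random(seed)
    # Hypothetical population of 500,000 voters: 30% are "listed" (reachable
    # through the biased frame) and support candidate X at 40%; the
    # remaining 70% support X at 60%. All figures are assumptions.
    population = []
    for _ in range(500_000):
        listed = rng.random() < 0.30
        p_support = 0.40 if listed else 0.60
        population.append((listed, rng.random() < p_support))
    all_votes = [vote for _, vote in population]
    true_share = sum(all_votes) / len(all_votes)

    # Large poll: 100,000 respondents, but only from the listed subgroup.
    listed_votes = [vote for listed, vote in population if listed]
    big_biased = rng.sample(listed_votes, 100_000)

    # Small poll: 1,000 respondents drawn at random from everyone.
    small_random = rng.sample(all_votes, 1_000)

    return (
        true_share,
        sum(big_biased) / len(big_biased),
        sum(small_random) / len(small_random),
    )

if __name__ == "__main__":
    true_share, biased_est, random_est = simulate_polls()
    print(f"true support:            {true_share:.3f}")
    print(f"100,000 from biased list: {biased_est:.3f}")
    print(f"1,000 at random:          {random_est:.3f}")
```

Under these assumptions the large poll lands close to the listed subgroup's support rate rather than the population's, while the small random poll misses only by ordinary sampling error.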
Question 2 Multiple Choice
A researcher estimates the average American's height by surveying only members of an NBA fan forum. What is the primary flaw in this approach?
A. Sampling error: the natural variation that occurs between any sample and the population
B. Selection bias: systematically over-representing a non-representative subgroup of the population
C. Confounding: a third variable is interfering with the height measurements
D. Random error: unpredictable fluctuations in the measurement instrument
Answer: B
Selection bias occurs when the mechanism used to select the sample systematically excludes or over-represents certain groups. NBA fans skew male and younger, and people drawn to basketball may themselves be taller than average. The sample is not drawn from the full population; it is drawn from a self-selected subset. This is distinct from sampling error, which is the unavoidable random variation present even in a perfect random sample. Sampling error shrinks as n grows; selection bias does not, because it is baked into the sampling method.
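The contrast between the two kinds of error can be sketched with a small simulation. The numbers are assumptions chosen only for illustration: a population mean height of 170 cm with a standard deviation of 10 cm, and a forum subgroup whose mean is shifted up by 5 cm. `mean_error_by_size` is a hypothetical helper, not part of any library.

```python
import random
import statistics

def mean_error_by_size(sizes=(100, 10_000), offset_cm=5.0, seed=1):
    """Return {n: (random_error_cm, biased_error_cm)} for each sample size."""
    rng = random.Random(seed)
    mu, sigma = 170.0, 10.0  # assumed population mean and SD of height, in cm
    results = {}
    for n in sizes:
        # Random sample: draws come from the full population distribution.
        random_sample = [rng.gauss(mu, sigma) for _ in range(n)]
        # Biased sample: draws come from a subgroup whose mean is shifted up.
        biased_sample = [rng.gauss(mu + offset_cm, sigma) for _ in range(n)]
        results[n] = (
            abs(statistics.fmean(random_sample) - mu),
            abs(statistics.fmean(biased_sample) - mu),
        )
    return results

if __name__ == "__main__":
    for n, (random_err, biased_err) in mean_error_by_size().items():
        print(f"n={n:>6}: random error {random_err:.2f} cm, "
              f"biased error {biased_err:.2f} cm")
```

As n increases, the error of the random sample should fall toward zero, while the error of the biased sample should settle near the assumed 5 cm offset instead of shrinking.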
Question 3 True / False
A sample of 10,000 people is generally more representative of a population than a sample of 500 people.
True
False
Answer: False
Sample size does not guarantee representativeness — the sampling method does. A 10,000-person sample drawn from phone book listings in one city will be less representative of the national population than a 500-person nationally stratified random sample. The Literary Digest's 2.4-million-person sample failed to predict FDR's landslide because of systematic selection bias. Bigger is better only when the sampling method is sound; more observations from a biased sample just give you a more precise estimate of the wrong thing.
Question 4 True / False
Even a perfectly random sample will not exactly match the population — some difference between the sample statistic and the population parameter is always expected.
True
False
Answer: True
This is sampling error (or sampling variability): the unavoidable random fluctuation between a sample and the population it was drawn from. Even with perfect random sampling, you are observing a subset, and chance determines which individuals are selected. The sample mean x̄ will rarely equal the population mean μ exactly. This is why statistical inference exists: to quantify how much x̄ might differ from μ by chance, using tools like standard errors and confidence intervals. Sampling error shrinks as sample size grows, but it reaches zero only when the sample is the entire population (a census).
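As a worked illustration of how sampling error shrinks, here is the standard error of the mean under simple random sampling from a population much larger than the sample. The value σ = 10 is an assumed figure chosen for easy arithmetic, not a number from the quiz.

```latex
\begin{align*}
\mathrm{SE}(\bar{x}) &= \frac{\sigma}{\sqrt{n}} \\
n = 100:\quad \mathrm{SE}(\bar{x}) &= \frac{10}{\sqrt{100}} = 1 \\
n = 10{,}000:\quad \mathrm{SE}(\bar{x}) &= \frac{10}{\sqrt{10{,}000}} = 0.1 \\
\text{approximate 95\% interval:}\quad & \bar{x} \pm 1.96\,\mathrm{SE}(\bar{x})
\end{align*}
```

A hundredfold increase in n cuts the standard error only tenfold, and for any finite sample smaller than the population it stays above zero.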
Question 5 Short Answer
Why can a large biased sample lead to worse conclusions than a small random sample? Use a concrete example to illustrate.
Model answer: A biased sample systematically misrepresents the population in a specific direction. More observations from a biased sample just reinforce the same distortion and add false precision. A small random sample, by contrast, gives every population member an equal chance of selection, so it reflects the population's true diversity. Example: a 100,000-person survey of only urban residents would systematically underestimate support for a candidate who is more popular with rural voters, while a 500-person random national sample would include rural and urban voters in roughly their true proportions.
The key insight is that bias and sampling error are different problems. Sampling error is random: it averages out over many samples and shrinks as the sample grows. Bias is systematic: it does not average out and is not reduced by adding more biased observations. When you enlarge a biased sample, you increase your confidence in a wrong answer. This is why statisticians care so much about how samples are drawn, not just how large they are.
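One way to make this precise is the standard decomposition of mean squared error into squared bias plus variance. The numbers below are assumptions for illustration only: a bias of b = 0.05 (five percentage points) for the biased poll, and σ² = 0.25, the largest possible variance of a yes/no response.

```latex
\begin{align*}
\mathrm{MSE}(\hat{\mu}) &= \mathrm{Bias}(\hat{\mu})^2 + \mathrm{Var}(\hat{\mu})
  = b^2 + \frac{\sigma^2}{n} \\
\text{biased poll } (b = 0.05,\ n = 100{,}000):\quad
  \mathrm{MSE} &= 0.05^2 + \frac{0.25}{100{,}000} = 0.0025025,
  \quad \sqrt{\mathrm{MSE}} \approx 0.050 \\
\text{random poll } (b = 0,\ n = 1{,}000):\quad
  \mathrm{MSE} &= 0 + \frac{0.25}{1{,}000} = 0.00025,
  \quad \sqrt{\mathrm{MSE}} \approx 0.016
\end{align*}
```

Only the σ²/n term shrinks as n grows. The b² term is untouched by sample size, so in this sketch the poll that is 100 times larger still has roughly three times the typical error.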