Questions: Populations, Sampling Methods, and Representativeness
5 questions to test your understanding
Question 1 Multiple Choice
The 1936 Literary Digest poll mailed ballots to about 10 million people and received roughly 2.4 million responses, yet it predicted the wrong presidential winner by a wide margin. What was the primary cause of this failure?
A. The sample was too small to detect the true preferences of the electorate
B. The sampling frame (telephone and car registration lists) systematically oversampled wealthier, Republican-leaning voters
C. The poll was conducted too far in advance of the election
D. The questions used were ambiguously worded, confusing respondents
Answer: B
The Literary Digest failure is a textbook case of sampling bias, not of insufficient sample size. With about 10 million ballots mailed and roughly 2.4 million returned, size was not the problem. The sampling frame, drawn largely from telephone directories and automobile registration lists in 1936, systematically excluded lower-income voters who favored Roosevelt. The bias in *who* was selected overwhelmed the precision gained from large numbers. This illustrates the core principle: representativeness is about selection method, not volume.
Question 2 Multiple Choice
A researcher wants to estimate the average anxiety level of adults in a large city. Which approach provides the strongest statistical basis for inference to the full population?
A. Survey 5,000 volunteers who respond to a public Facebook post
B. Survey every patient at the city's three largest mental health clinics
C. Draw a simple random sample of 800 adults from city registration records
D. Survey 10,000 university students at local campuses
Answer: C
Simple random sampling from a complete sampling frame gives every adult a known, equal probability of inclusion, which is the condition standard inferential statistics require to be valid. Options A and D are convenience samples with systematic biases: A reaches only self-selected social media users, and D reaches only students. Option B oversamples people already seeking mental health treatment, severely biasing the estimate upward. For accurate population inference, a smaller random sample (800) outperforms a larger biased sample (5,000 or 10,000).
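To make the precision side of this comparison concrete, here is a rough sketch of the margin of error for a sample mean under simple random sampling. The function name `margin_of_error`, the anxiety-score standard deviation of 10, and the 95% multiplier of 1.96 are illustrative assumptions, not values taken from the question.

```python
import math

def margin_of_error(sd, n, z=1.96):
    """Approximate 95% margin of error for a mean from a simple random sample.

    Assumes observations are drawn at random; the formula says nothing
    about selection bias.
    """
    return z * sd / math.sqrt(n)

ASSUMED_SD = 10.0  # hypothetical spread of anxiety scores

moe_random_800 = margin_of_error(ASSUMED_SD, 800)
moe_if_random_5000 = margin_of_error(ASSUMED_SD, 5000)

print(f"n=800:  +/- {moe_random_800:.2f} points")
print(f"n=5000: +/- {moe_if_random_5000:.2f} points (valid only for a random draw)")
```

Under these assumptions, the random sample of 800 pins the mean down to within about 0.7 points. A larger n narrows the interval only if that sample is also randomly drawn; for a volunteer sample the formula does not apply, and the unknown selection bias can easily exceed either margin.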
Question 3 True / False
A convenience sample of 10,000 participants is necessarily more representative of the population than a random sample of 1,000 participants.
T. True
F. False
Answer: False
Representativeness depends on *how* participants are selected, not *how many* are selected. A large convenience sample can be systematically biased, overrepresenting certain demographic groups, in ways that a smaller random sample avoids. The Literary Digest poll exemplifies this: millions of responses drawn from a biased frame produced a worse estimate than a far smaller, well-designed random sample would have. A larger sample reduces sampling error, but it does nothing to remove selection bias, so the extra precision helps only if the sample is unbiased in the first place.
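The contrast can be checked with a small simulation. This is a minimal sketch under invented assumptions: the population size, the 30% high-income share, the support rates, and the 5x reach advantage for high-income people are all hypothetical numbers chosen for illustration.

```python
import random

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Hypothetical population (all figures are assumptions for illustration):
# 30% high-income; support for candidate X is 30% among high-income
# people and 65% among everyone else.
POP_SIZE = 200_000
population = []
for _ in range(POP_SIZE):
    high_income = random.random() < 0.30
    supports_x = random.random() < (0.30 if high_income else 0.65)
    population.append((high_income, supports_x))

def share_supporting(people):
    """Fraction of (high_income, supports_x) records that support X."""
    return sum(supports for _, supports in people) / len(people)

true_share = share_supporting(population)

# Simple random sample: every person has the same chance of selection.
random_sample = random.sample(population, 1_000)

# Convenience sample: high-income people are 5x as likely to be reached.
# random.choices draws with replacement, a negligible simplification
# at this population size.
weights = [5 if high_income else 1 for high_income, _ in population]
convenience_sample = random.choices(population, weights=weights, k=10_000)

random_error = abs(share_supporting(random_sample) - true_share)
convenience_error = abs(share_supporting(convenience_sample) - true_share)

print(f"true share:                  {true_share:.3f}")
print(f"random n=1,000 error:        {random_error:.3f}")
print(f"convenience n=10,000 error:  {convenience_error:.3f}")
```

With these assumed numbers, true support is about 54.5% in expectation, while the convenience sample centers near 41%. The larger sample therefore misses by a wide margin and also points to the wrong winner, which mirrors the Literary Digest outcome.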
Question 4 True / False
Non-probability sampling methods (e.g., convenience samples, purposive samples) can be appropriate and valid for some research purposes.
T. True
F. False
Answer: True
Non-probability samples are not inherently invalid — they are inappropriate for making precise population-level statistical inferences, but they serve many legitimate research purposes. Exploratory research, hypothesis generation, studies of rare or hard-to-reach populations, qualitative research, and initial pilot testing often rely on purposive or convenience sampling. The key is honesty: results should be explicitly bounded to the sample or similar populations, rather than overgeneralized to groups not represented in the sample.
Question 5 Short Answer
Why can a well-drawn random sample of 1,000 people produce more accurate population estimates than a convenience sample of 100,000 people?
Model answer: A random sample gives every member of the population a known, equal chance of selection, so the resulting sample's statistics are unbiased estimates of population parameters. A convenience sample systematically over- or under-represents certain groups, introducing bias that does not shrink as sample size grows — more observations from a biased process just give a more precise estimate of the wrong thing.
Statistical inference theory assumes random selection: confidence intervals, significance tests, and sampling distributions are derived under the assumption that each observation is drawn randomly from the population. A convenience sample violates this assumption, so the inferential machinery doesn't apply. Large biased samples are like measuring with a miscalibrated ruler many times — you get more precise measurements, but they're all wrong in the same systematic direction. Randomization ensures the errors are unsystematic and average out.
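The miscalibrated-ruler analogy can be sketched numerically. The true value, the ruler offset, and the noise level below are hypothetical, and `mean_error` is an illustrative helper rather than part of any standard method.

```python
import random
import statistics

random.seed(1)  # fixed seed so repeated runs give the same numbers

TRUE_VALUE = 100.0   # hypothetical true length
RULER_OFFSET = 2.0   # hypothetical systematic error of the bad ruler
NOISE_SD = 5.0       # hypothetical random measurement noise

def mean_error(n, offset):
    """Error of the average of n noisy readings taken with a given offset."""
    readings = (TRUE_VALUE + offset + random.gauss(0, NOISE_SD) for _ in range(n))
    return statistics.fmean(readings) - TRUE_VALUE

# Map each sample size to (error with a good ruler, error with the bad ruler).
errors = {}
for n in (100, 10_000, 100_000):
    errors[n] = (mean_error(n, 0.0), mean_error(n, RULER_OFFSET))
    print(f"n={n:>7}: unbiased error {errors[n][0]:+.3f}, biased error {errors[n][1]:+.3f}")
```

As n grows, the unbiased average settles near zero error while the biased average settles near the offset of 2.0. More measurements tighten the estimate around the wrong value rather than correcting it.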