Questions: External Validity and Generalization of Findings
5 questions to test your understanding
Question 1 Multiple Choice
A study finds that college students in a lab show significantly stronger conformity when given written feedback versus verbal feedback. The design is methodologically flawless — random assignment, no confounds, p < .001. A critic says the findings might not matter much. What is the critic's most likely concern?
A. The statistical analysis was probably done incorrectly
B. The sample of college students and the artificial lab setting may not reflect conformity as it operates in real-world settings with diverse populations
C. The finding lacks internal validity because the researchers cannot truly isolate the cause
D. A statistically significant result always generalizes, so the critic is wrong
The critic is raising an external validity concern. The study may be perfectly internally valid (we can trust the causal inference within the study), but population validity (a WEIRD undergraduate sample: Western, Educated, Industrialized, Rich, Democratic) and ecological validity (a lab setting with artificial tasks) both limit generalization. Option C is wrong because the question says the design is flawless, implying good internal validity. Option D reflects a common misconception: statistical significance indicates that the observed effect is unlikely to be a chance result under the conditions studied, not that it generalizes to other populations or settings.
Question 2 Multiple Choice
A researcher wants to study whether a new therapy reduces anxiety. She runs a tightly controlled randomized trial with strict inclusion criteria, standardized sessions, and weekly assessments. What trade-off has she made?
A. She has maximized external validity at the cost of internal validity
B. She has maximized internal validity but may have reduced ecological validity: real therapy is messier and delivered to more varied patients
C. Randomized trials have neither internal nor external validity advantages
D. Strict inclusion criteria improve both internal and external validity equally
Tight experimental controls (random assignment, standardized sessions, strict inclusion criteria) strengthen causal inference — that's internal validity. But those same features reduce ecological validity: real-world therapy patients are more diverse, sessions vary, and delivery is less controlled. A perfectly controlled efficacy trial may overestimate real-world effectiveness. This is why health researchers distinguish 'efficacy' (can it work under ideal conditions?) from 'effectiveness' (does it work in practice?). Strict inclusion criteria specifically worsen population validity by excluding atypical cases.
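The gap between efficacy and effectiveness can be sketched as simple weighted-average arithmetic. The subgroup labels, effect sizes, and population shares below are hypothetical numbers chosen for illustration, not results from any study; the sketch assumes the therapy helps "typical" patients more than patients with comorbid conditions, who are the kind of case strict inclusion criteria tend to exclude.

```python
# Hypothetical anxiety-score reductions (in points) for two patient subgroups.
# These values are illustrative assumptions, not empirical estimates.
effect_by_group = {"typical": 10.0, "comorbid": 4.0}


def average_effect(mix):
    """Weighted mean treatment effect for a population mix (shares sum to 1)."""
    return sum(share * effect_by_group[group] for group, share in mix.items())


# Strict inclusion criteria: the trial enrolls only "typical" patients.
trial_mix = {"typical": 1.0, "comorbid": 0.0}
# Hypothetical real-world clinic: most patients have comorbid conditions.
clinic_mix = {"typical": 0.4, "comorbid": 0.6}

efficacy = average_effect(trial_mix)        # effect under ideal trial conditions
effectiveness = average_effect(clinic_mix)  # effect in the broader population

print(f"Trial (efficacy) estimate:       {efficacy:.1f} points")
print(f"Clinic (effectiveness) estimate: {effectiveness:.1f} points")
```

Under these assumed numbers the trial estimate is accurate for the patients it enrolled, yet it overstates the average benefit in the clinic, because the clinic population contains a subgroup the trial never sampled.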
Question 3 True / False
A study with high internal validity automatically has high external validity.
T. True
F. False
Answer: False
Internal and external validity are distinct dimensions, and a study can be strong on one while weak on the other. Internal validity asks whether the design supports a causal inference within the study. External validity asks whether that finding generalizes beyond the study's specific participants, setting, and time. The moves that maximize internal validity (tight experimental control, laboratory setting, homogeneous samples, standardized stimuli) often reduce generalizability. A study can be a near-perfect causal demonstration in a narrow lab context but tell us very little about how the phenomenon operates in the real world.
Question 4 True / False
The WEIRD acronym (Western, Educated, Industrialized, Rich, Democratic) identifies a threat to population validity because psychology studies have historically over-relied on samples from these populations.
T. True
F. False
Answer: True
The WEIRD critique, popularized by Henrich et al. (2010), pointed out that psychology drew disproportionately on Western undergraduate samples and then generalized findings to 'humans.' Findings including visual illusions (Müller-Lyer), conformity, and basic cognitive phenomena have been shown to vary significantly across cultures. This makes WEIRD sampling a genuine threat to population validity — the claim that findings generalize to all people — since the sample systematically underrepresents most of humanity.
Question 5 Short Answer
Why does increasing experimental control often reduce external validity, and how do researchers navigate this tension when designing studies?
Model answer: Experimental control introduces artificiality. Using standardized stimuli, laboratory settings, and narrow participant criteria reduces the variation present in the real world, which is precisely what allows clean causal inference — but that clean, artificial context may not resemble the environments where the phenomenon naturally occurs. Researchers navigate this by matching design to purpose: tight experiments for establishing whether an effect exists at all; field research, diverse samples, and multiple replications for establishing that the effect generalizes. The key insight is that neither design type is superior — they answer different questions.
This tension is why replication across different labs, populations, and settings is the scientific community's main tool for establishing external validity. A single internally valid study is a starting point, not a conclusion. The reproducibility crisis reminded psychologists that p < .05 in one controlled study is only the beginning of the evidential story: generalization requires the accumulation of evidence across varied contexts.
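One way to picture this accumulation of evidence is to pool effect estimates from several replications and check how much they disagree. The sketch below uses a fixed-effect, inverse-variance weighted mean and Cochran's Q, which is one common meta-analytic approach and an assumption here rather than something the quiz prescribes. The site names, effect sizes, and variances are invented for illustration.

```python
# Hypothetical replication results: (site, standardized effect d, sampling variance).
# All values are made up for illustration only.
replications = [
    ("US undergraduates, lab", 0.60, 0.02),
    ("Japanese adults, lab", 0.20, 0.02),
    ("Kenyan adults, field", 0.10, 0.04),
    ("German employees, field", 0.30, 0.04),
]


def pool(results):
    """Return the inverse-variance weighted mean effect and Cochran's Q."""
    weights = [1.0 / var for _, _, var in results]
    effects = [d for _, d, _ in results]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    # Q sums the weighted squared deviations of each site from the pooled effect.
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    return pooled, q


pooled, q = pool(replications)
df = len(replications) - 1  # degrees of freedom for Q

print(f"Pooled effect: {pooled:.2f}")
print(f"Cochran's Q:   {q:.2f} on {df} degrees of freedom")
```

If every site were estimating the same underlying effect, Q would be expected to sit near its degrees of freedom. A Q well above that value suggests the effect varies by population or setting, which is exactly the information a single study cannot provide.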