A researcher estimates returns to job training by comparing wages of program completers to non-participants. Participants were more motivated and job-ready before the program. What is the problem with this estimate?
A. It underestimates the treatment effect because participants are harder to train
B. It overestimates the treatment effect because participants had higher baseline wages even without training
C. It is unbiased as long as the researcher controls for age and education
D. It is valid because the comparison uses the same time period
Answer: B
The naive estimator confounds the training effect with pre-existing differences. More motivated workers would earn more even without training, so E[Y(0)|D=1] > E[Y(0)|D=0], and the comparison group's wages understate what participants would have earned absent treatment. The bias is positive: the naive estimate overstates the causal effect. Controlling for observable characteristics (option C) helps only if motivation is fully captured by those observables; in practice, motivation is typically unobserved.
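The argument above can be checked with a small simulation. This is a minimal sketch under assumed numbers: the wage equation, the logistic enrollment rule, and the constant effect of 2.0 are all hypothetical choices made for illustration, not estimates from any real program.

```python
import numpy as np

def simulate_naive_estimate(n=200_000, true_effect=2.0, seed=0):
    """Simulate job training where motivated workers self-select into treatment."""
    rng = np.random.default_rng(seed)
    motivation = rng.normal(size=n)  # unobserved by the researcher (assumed)
    # Assumed enrollment rule: more motivated workers are more likely to enroll.
    enroll_prob = 1.0 / (1.0 + np.exp(-motivation))
    d = rng.random(n) < enroll_prob
    # Potential outcomes: motivation raises wages with or without training.
    y0 = 10.0 + 1.5 * motivation + rng.normal(size=n)
    y1 = y0 + true_effect
    y = np.where(d, y1, y0)  # only one potential outcome is observed per worker

    naive = y[d].mean() - y[~d].mean()
    att = (y1[d] - y0[d]).mean()
    # Computable only because the simulation sees both potential outcomes.
    selection_bias = y0[d].mean() - y0[~d].mean()
    return naive, att, selection_bias

if __name__ == "__main__":
    naive, att, bias = simulate_naive_estimate()
    print(f"naive={naive:.3f}  ATT={att:.3f}  selection bias={bias:.3f}")
```

In this setup the naive difference in means exceeds the true effect by the selection bias term, which is positive because E[Y(0)|D=1] > E[Y(0)|D=0] by construction.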
Question 2 Multiple Choice
A researcher matches treated and control units on age, education, and prior employment, achieving covariate balance. This guarantees the treatment effect estimate is free of selection bias.
A. True: matching on all relevant covariates removes all forms of selection bias
B. False: matching only balances observed covariates; unobserved differences may remain
C. True: as long as the matched groups are large enough, unobserved differences cancel out
D. False: matching never reduces bias; only random assignment can
Answer: B
Matching balances the distribution of observed covariates between treated and control groups, addressing selection on observables. But it does nothing for selection on unobservables: unmeasured characteristics (motivation, ambition, health) may still differ systematically between groups. A well-matched study can still have severe selection bias if an important confounder is unmeasured. This is why causal inference often requires instrumental variables (IV), difference-in-differences (DiD), or regression discontinuity (RD) when selection on unobservables is plausible.
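A rough illustration of this point uses exact matching on a single observed covariate. The data-generating process below is hypothetical: `educ` stands in for the observed covariates, `motivation` for an unmeasured confounder, and the `motivation_in_selection` parameter is an invented switch for whether motivation drives enrollment.

```python
import numpy as np

def exact_matching_att(motivation_in_selection=1.0, n=200_000,
                       true_effect=2.0, seed=1):
    """Exact-match on an observed covariate and return the ATT estimate."""
    rng = np.random.default_rng(seed)
    educ = rng.integers(0, 4, size=n)   # observed covariate with 4 levels
    motivation = rng.normal(size=n)     # unobserved confounder (assumed)
    index = 0.5 * educ - 0.75 + motivation_in_selection * motivation
    d = rng.random(n) < 1.0 / (1.0 + np.exp(-index))
    y = (10.0 + 1.0 * educ + 1.5 * motivation
         + true_effect * d + rng.normal(size=n))

    # Within each education cell, compare treated to controls, then
    # weight the cell differences by the number of treated units (ATT weights).
    n_treated = d.sum()
    estimate = 0.0
    for level in np.unique(educ):
        cell = educ == level
        diff = y[cell & d].mean() - y[cell & ~d].mean()
        estimate += diff * (cell & d).sum() / n_treated
    return estimate

if __name__ == "__main__":
    print("motivation drives enrollment:", round(exact_matching_att(1.0), 3))
    print("motivation unrelated to enrollment:", round(exact_matching_att(0.0), 3))
```

Education is perfectly balanced within cells in both runs. The estimate lands near the true effect only when motivation plays no role in enrollment, so balance on observed covariates alone says nothing about the unobserved ones.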
Question 3 True / False
Selection bias occurs primarily when researchers use data collected non-randomly; using large datasets eliminates the problem.
T. True
F. False
Answer: False
False. Selection bias is about the mechanism by which units enter treatment, not the size of the dataset. A massive observational dataset can have severe selection bias if those who choose treatment differ systematically from those who don't. The solution is not more data but an identification strategy that addresses how units self-selected into treatment — random assignment, an instrument, a discontinuity, or differencing.
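To see why sample size does not help, one can rerun a self-selection simulation at increasing n. This sketch assumes a hypothetical wage equation and enrollment rule; only the qualitative pattern matters.

```python
import numpy as np

def naive_bias(n, true_effect=2.0, seed=2):
    """Return (naive estimate - true effect) for a sample of size n."""
    rng = np.random.default_rng(seed)
    motivation = rng.normal(size=n)  # unobserved driver of enrollment and wages
    d = rng.random(n) < 1.0 / (1.0 + np.exp(-motivation))
    y = 10.0 + 1.5 * motivation + true_effect * d + rng.normal(size=n)
    naive = y[d].mean() - y[~d].mean()
    return naive - true_effect

if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        print(f"n={n:>9,}  bias={naive_bias(n):.3f}")
```

Larger samples shrink the sampling noise around the naive estimate, but the estimate converges to the wrong number: the bias settles at a positive value instead of going to zero.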
Question 4 True / False
Positive selection bias causes the naive treatment effect estimator to overstate the true causal effect.
T. True
F. False
Answer: True
True. Formally, naive estimator = ATT + selection bias, where selection bias = E[Y(0)|D=1] − E[Y(0)|D=0]. Positive selection means the treated group would have had better outcomes even without treatment, so the selection bias term is positive and the naive estimate exceeds the true ATT. The job training example illustrates this: more motivated workers earn more even without training, inflating the apparent program effect.
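The decomposition can be derived in two lines. The sketch below assumes the usual potential-outcomes setup, in which the observed outcome is Y = D·Y(1) + (1 − D)·Y(0); the second step adds and subtracts E[Y(0)|D=1].

```latex
\begin{align*}
\underbrace{E[Y \mid D=1] - E[Y \mid D=0]}_{\text{naive estimator}}
  &= E[Y(1) \mid D=1] - E[Y(0) \mid D=0] \\
  &= \underbrace{E[Y(1) \mid D=1] - E[Y(0) \mid D=1]}_{\text{ATT}}
   + \underbrace{E[Y(0) \mid D=1] - E[Y(0) \mid D=0]}_{\text{selection bias}}
\end{align*}
```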
Question 5 Short Answer
Explain the difference between selection on observables and selection on unobservables, and why the distinction determines which identification strategy is appropriate.
Model answer: Selection on observables means that conditional on measured covariates X, treatment assignment is independent of potential outcomes: all confounders are captured in X. Controlling for X via regression or matching recovers an unbiased estimate. Selection on unobservables means treated and control groups differ in unmeasured ways that also affect outcomes. No amount of conditioning on observed variables fixes this; you need a quasi-experimental strategy. Instrumental variables exploit external variation in treatment assignment; difference-in-differences removes time-invariant group-level confounders; regression discontinuity exploits threshold-based assignment.
The distinction determines what tools can credibly identify a causal effect. If all confounders are measured, the conditional independence assumption is plausible and standard methods work. If important confounders are unobserved, the treatment variable remains endogenous even after conditioning, and you need a strategy that creates plausibly exogenous variation in treatment. This is the core challenge of observational causal inference.
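As one illustration of "plausibly exogenous variation", the sketch below compares the naive difference in means with a simple instrumental-variables (Wald) estimate. Everything here is assumed for the example: `z` is a hypothetical randomized encouragement that shifts enrollment but affects wages only through training, and the treatment effect is constant across workers.

```python
import numpy as np

def naive_vs_wald(n=500_000, true_effect=2.0, seed=3):
    """Compare the naive estimate with a Wald (IV) estimate under confounding."""
    rng = np.random.default_rng(seed)
    motivation = rng.normal(size=n)   # unobserved confounder (assumed)
    z = rng.integers(0, 2, size=n)    # hypothetical randomized encouragement
    # Enrollment depends on both motivation and the encouragement.
    index = motivation + 1.5 * z - 0.75
    d = (rng.random(n) < 1.0 / (1.0 + np.exp(-index))).astype(float)
    # z does not appear in the wage equation (exclusion restriction, by construction).
    y = 10.0 + 1.5 * motivation + true_effect * d + rng.normal(size=n)

    naive = y[d == 1].mean() - y[d == 0].mean()
    reduced_form = y[z == 1].mean() - y[z == 0].mean()  # effect of z on wages
    first_stage = d[z == 1].mean() - d[z == 0].mean()   # effect of z on enrollment
    wald = reduced_form / first_stage
    return naive, wald

if __name__ == "__main__":
    naive, wald = naive_vs_wald()
    print(f"naive={naive:.3f}  wald={wald:.3f}")
```

Because `z` is independent of motivation, the ratio of its effect on wages to its effect on enrollment recovers the assumed effect, while the naive comparison remains inflated. With real data, the exclusion restriction and instrument relevance have to be argued rather than assumed.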