Questions: Propensity Score Methods and Estimation
5 questions to test your understanding
Question 1 Multiple Choice
A researcher estimates the effect of a job training program using propensity score matching. After matching, treated and control groups are nearly identical on all 12 measured covariates. She reports an unbiased causal estimate. What critical assumption is implicit in this claim?
A. The propensity score model must be correctly specified with no omitted interactions
B. Unconfoundedness: all variables that affect both treatment selection and outcomes have been measured and included
C. The sample must be large enough that the law of large numbers guarantees balance on unmeasured variables
D. The outcome model must be linear for propensity score estimates to be consistent
Answer: B
Good covariate balance on observed variables after matching is necessary but not sufficient for an unbiased causal estimate. The unconfoundedness assumption (also called 'selection on observables' or 'conditional independence') requires that every variable influencing both treatment assignment and potential outcomes has been observed and included. If an unobserved variable (e.g., motivation, family connections) affects both who gets training and what their earnings would be, propensity score matching cannot remove that bias. Balance on 12 measured covariates says nothing about a 13th covariate that was never measured. The assumption cannot be tested with the observed data alone.
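A small simulation can make this concrete. The sketch below is a hypothetical setup, not the researcher's data: `x` stands in for the measured covariates, `u` for an unmeasured confounder such as motivation, and the true effect, selection probabilities, and outcome coefficients are all assumed values. Because `x` is binary, stratifying on `x` is the same as stratifying on the propensity score given `x`, so the observed covariate is exactly balanced within strata.

```python
import random
from collections import defaultdict
from statistics import mean

TRUE_EFFECT = 2.0  # assumed for the simulation


def simulate(n=20000, seed=0):
    """Draw (x, u, d, y) rows; u affects both treatment and outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)          # observed covariate
        u = int(rng.random() < 0.5)          # unobserved confounder
        p_treat = 0.2 + 0.2 * x + 0.4 * u    # selection depends on x and u
        d = int(rng.random() < p_treat)
        y = TRUE_EFFECT * d + 1.0 * x + 3.0 * u + rng.gauss(0, 1)
        rows.append((x, u, d, y))
    return rows


def stratified_effect(rows, key):
    """Size-weighted average of treated-minus-control mean outcomes by stratum."""
    strata = defaultdict(lambda: ([], []))   # stratum -> (control ys, treated ys)
    for row in rows:
        d, y = row[2], row[3]
        strata[key(row)][d].append(y)
    n = len(rows)
    return sum(
        (len(c) + len(t)) / n * (mean(t) - mean(c))
        for c, t in strata.values()
    )


rows = simulate()
est_observed = stratified_effect(rows, key=lambda r: r[0])        # adjusts for x only
est_oracle = stratified_effect(rows, key=lambda r: (r[0], r[1]))  # adjusts for x and u

print(f"true effect:           {TRUE_EFFECT:.2f}")
print(f"adjusting for x only:  {est_observed:.2f}")
print(f"adjusting for x and u: {est_oracle:.2f}")
```

Under these assumed parameters, the estimate that adjusts for `x` alone should sit well above the true effect (by roughly 1.25 in large samples), while the oracle estimate that also conditions on `u` should be close to 2.0. The oracle version is only possible here because the simulation records `u`; in real data it would be unavailable.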
Question 2 Multiple Choice
Why does propensity score matching solve the 'curse of dimensionality' that plagues direct matching on many covariates?
A. It selects only the most important covariates and discards the rest, reducing the matching space
B. It replaces the high-dimensional covariate vector with a single scalar (the conditional treatment probability) while preserving covariate balance by the balancing property
C. It uses a nearest-neighbor algorithm that scales efficiently in high dimensions
D. It approximates direct matching but doesn't actually solve dimensionality; it just makes the bias more manageable
Answer: B
The Rosenbaum and Rubin (1983) balancing property is the key result: conditional on the propensity score p(X) = Pr(D = 1 | X), the distribution of the covariates X is the same in the treated and control groups. Matching on one number, the propensity score, therefore balances the full covariate distribution in expectation, which is what matching on every covariate simultaneously is meant to achieve. The curse of dimensionality arises because exact matches become increasingly rare as the number of covariates grows; collapsing X to a scalar avoids this without discarding any covariate from the propensity score model. In practice the score is estimated rather than known, so balance should still be checked after matching.
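The balancing property can be checked numerically. The following is a minimal sketch under assumed values: two hypothetical covariates `x1` and `x2`, each taking values 0 to 2, and a known (not estimated) propensity score that depends on them only through their sum. The nine covariate cells collapse to five score values, and within each score value the mean of `x1` should be about the same for treated and control units.

```python
import random
from collections import defaultdict
from statistics import mean


def propensity(x1, x2):
    """Assumed true treatment probability; depends on the covariates only via x1 + x2."""
    return round(0.1 + 0.15 * (x1 + x2), 2)


def simulate(n=30000, seed=1):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2 = rng.randrange(3), rng.randrange(3)   # 9 covariate cells
        d = int(rng.random() < propensity(x1, x2))
        rows.append((x1, x2, d))
    return rows


def gap_in_x1(rows, key):
    """Size-weighted treated-minus-control difference in mean x1 within strata."""
    strata = defaultdict(lambda: ([], []))   # stratum -> (control x1s, treated x1s)
    for x1, x2, d in rows:
        strata[key(x1, x2)][d].append(x1)
    n = len(rows)
    return sum(
        (len(c) + len(t)) / n * (mean(t) - mean(c))
        for c, t in strata.values()
    )


rows = simulate()
raw_gap = gap_in_x1(rows, key=lambda x1, x2: 0)   # one stratum: no adjustment
score_gap = gap_in_x1(rows, key=propensity)       # strata defined by the scalar score
n_score_values = len({propensity(x1, x2) for x1, x2, _ in rows})

print(f"distinct score values:        {n_score_values}")
print(f"raw gap in mean x1:           {raw_gap:.3f}")
print(f"gap in mean x1 given score:   {score_gap:.3f}")
```

The raw gap should be clearly positive because units with larger `x1` are more likely to be treated, while the gap conditional on the score should be near zero. The sketch uses the true score; with an estimated score the same comparison serves as a balance diagnostic.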
Question 3 True / False
After propensity score matching produces excellent covariate balance on most observed variables, the estimated treatment effect is expected to be unbiased.
T. True
F. False
Answer: False
False. Propensity score methods can balance only the observed covariates. Unbiasedness requires the unconfoundedness assumption, which asserts that no unobserved variable affects both treatment selection and outcomes. This assumption is untestable from the data: good observed balance is consistent with both confounded and unconfounded treatment assignment. Sensitivity analysis tools (such as Rosenbaum bounds) can quantify how strong an unobserved confounder would need to be to reverse the conclusion, but they cannot prove that the assumption holds.
Question 4 True / False
Propensity scores are estimated by regressing the outcome variable on observed covariates using logistic regression.
T. True
F. False
Answer: False
False. The propensity score is the estimated probability of receiving treatment, so the dependent variable in the logistic regression is treatment status (D = 1 if treated, D = 0 if control), not the outcome. The covariates X are the predictors. Regressing the outcome on the covariates produces a predictive model for outcomes, which is a different object entirely. The confusion is common because the outcome regression and the propensity score model typically use the same covariate set X but serve completely different roles.
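To show which variable sits on the left-hand side, here is a minimal sketch that fits a one-covariate logistic regression by plain gradient ascent using only the standard library. The data-generating values (intercept -0.5, slope 1.0), the helper name `fit_propensity`, and the learning rate are assumptions for illustration; a real analysis would normally use a statistics package. The outcome `ys` is generated but deliberately never passed to the fit.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity(xs, ds, lr=1.0, steps=300):
    """Logistic regression of treatment d on covariate x via gradient ascent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, d in zip(xs, ds):
            resid = d - sigmoid(b0 + b1 * x)   # per-unit gradient of the log-likelihood
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


rng = random.Random(2)
xs = [rng.gauss(0, 1) for _ in range(2000)]                          # covariate X
ds = [int(rng.random() < sigmoid(-0.5 + 1.0 * x)) for x in xs]       # treatment status D
ys = [1.0 * d + 0.5 * x + rng.gauss(0, 1) for x, d in zip(xs, ds)]   # outcome: not used in the fit

b0, b1 = fit_propensity(xs, ds)                # dependent variable is D, not the outcome
scores = [sigmoid(b0 + b1 * x) for x in xs]    # estimated propensity scores

print(f"estimated intercept: {b0:.2f}, estimated slope: {b1:.2f}")
print(f"score range: {min(scores):.2f} to {max(scores):.2f}")
```

The fitted coefficients should be close to the assumed values, and every score lies strictly between 0 and 1 because it is a predicted probability of treatment.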
Question 5 Short Answer
Why does covariate balance on observed variables — even perfect balance — not guarantee that propensity score estimates are free from omitted-variable bias?
Model answer: Propensity score methods condition on observed covariates to make treatment assignment 'as good as random' within matched groups. But this eliminates only the selection bias that operates through the observed variables you included. If an unobserved variable (say, innate ability, family wealth, or a physician's judgment) affects both who receives treatment and what outcomes they would experience, that selection bias remains in the estimate regardless of how well the observed covariates are balanced. The unconfoundedness assumption requires that the potential outcomes be independent of treatment assignment conditional on the observed covariates. If some confounders are unobserved, this condition fails, and the failure cannot be detected from the data.
This is the fundamental limitation of all observational study methods that rely on selection on observables. Randomized experiments solve the problem by design: random assignment guarantees that even unobserved confounders are distributed equally across groups in expectation. Propensity scores are a substitute for randomization when it is unavailable, but they mimic randomization only with respect to the observed covariates. The researcher must rely on domain knowledge to argue that the measured covariates are sufficient to account for all selection into treatment, an argument the data alone cannot settle.
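The contrast can be written compactly. The notation is an assumption introduced here for illustration: Y(1) and Y(0) are the potential outcomes, D is treatment status, X is the vector of observed covariates, and U is a hypothetical unobserved confounder.

```latex
\begin{align*}
\text{Randomized experiment:} \quad
  & \bigl(Y(0),\, Y(1),\, X,\, U\bigr) \perp D \\
\text{Selection on observables (assumed):} \quad
  & \bigl(Y(0),\, Y(1)\bigr) \perp D \mid X \\
\text{Unobserved confounding:} \quad
  & \bigl(Y(0),\, Y(1)\bigr) \perp D \mid (X, U)
    \quad \text{but} \quad
    \bigl(Y(0),\, Y(1)\bigr) \not\perp D \mid X
\end{align*}
```

Only the first line is guaranteed by design. The second is an assumption, and the data cannot distinguish it from the third because U is never recorded.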