Questions: Matching Estimators: Nearest Neighbor and Kernel Methods
5 questions to test your understanding
Question 1 Multiple Choice
A researcher uses nearest-neighbor matching on age, education, and prior earnings to estimate the effect of a job training program. What assumption is required for this estimate to be causally valid?
A. The matched pairs must be exactly identical on all covariates; any distance in covariate space invalidates the comparison
B. Conditional on age, education, and prior earnings, assignment to the training program is as good as random: no unobserved variables jointly determine selection and outcomes
C. The training program must have been randomly assigned before the matching procedure was applied
D. The outcome variable must be uncorrelated with all measured covariates
Answer: B
Matching estimators are nonparametric in the sense that they avoid functional-form assumptions, but they still require the conditional independence assumption (CIA): conditional on the observed covariates, treatment assignment is as good as random. If unobserved confounders (e.g., motivation) predict both who seeks training and later earnings, matching on observables cannot remove that bias. Matching is not a substitute for randomization; it approximates a randomized comparison only when selection depends solely on observed variables.
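To make the mechanics concrete, here is a minimal sketch of nearest-neighbor matching for the average treatment effect on the treated (ATT). The function name `nn_match_att`, the choice to standardize covariates, and matching with replacement are all assumptions made for illustration, not a prescribed implementation. The toy data use a single covariate and a built-in effect of 2.0, with each control sitting 0.1 away from its treated counterpart.

```python
import numpy as np

def nn_match_att(X, y, treated, k=1):
    """Hypothetical helper: ATT via k-nearest-neighbor matching with replacement."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    # Standardize each covariate so none dominates the Euclidean distance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    Zt, Zc = Z[treated], Z[~treated]
    # Squared distances, shape (n_treated, n_control).
    d2 = ((Zt[:, None, :] - Zc[None, :, :]) ** 2).sum(axis=2)
    # For each treated unit, the indices of its k closest controls.
    nearest = np.argsort(d2, axis=1)[:, :k]
    # Imputed untreated outcome: mean outcome of the matched controls.
    y0_hat = y[~treated][nearest].mean(axis=1)
    return float((y[treated] - y0_hat).mean())

# Toy data: untreated outcome equals x, treatment adds 2.0.
X = np.array([[1.0], [2.0], [3.0], [1.1], [2.1], [3.1]])
treated = np.array([1, 1, 1, 0, 0, 0], dtype=bool)
y = np.array([3.0, 4.0, 5.0, 1.1, 2.1, 3.1])

att = nn_match_att(X, y, treated)
print(round(att, 3))  # 1.9
```

The estimate is 1.9 rather than 2.0 because each match is 0.1 away in the covariate. Inexact matches leave a small matching discrepancy, but they do not invalidate the comparison, which is why option A is too strong.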
Question 2 Multiple Choice
Compared to narrow-bandwidth kernel matching, wide-bandwidth kernel matching will tend to:
A. Decrease both bias and variance, since more data is always better
B. Decrease variance (by averaging over more control units) but increase bias (by including control units that are genuinely dissimilar to the treated unit)
C. Increase variance because distant units introduce noise, with no effect on bias
D. Produce identical estimates, because kernel matching is invariant to bandwidth choice
Answer: B
This is the classic bias-variance tradeoff in nonparametric estimation. A narrow bandwidth uses only very close control units as comparisons: these are genuinely similar (low bias), but there may be few of them (high variance). A wide bandwidth averages over many control units, which reduces variance, but it also includes units with meaningfully different covariate values that may have different potential outcomes, which introduces bias. Bandwidth selection balances these two forces, typically via cross-validation or plug-in rules based on asymptotic bias and variance formulas.
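The bias side of this tradeoff can be sketched with a small deterministic example. The function `kernel_match_att`, the Gaussian kernel, and the two bandwidth values are assumptions chosen for illustration. The untreated outcome is x squared, the built-in treatment effect is 1.0, and the treated units sit in the upper part of the covariate range.

```python
import numpy as np

def kernel_match_att(x, y, treated, bandwidth):
    """Hypothetical helper: ATT via Gaussian-kernel matching on one covariate."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    xt, xc = x[treated], x[~treated]
    # Kernel weights, shape (n_treated, n_control); closer controls weigh more.
    u = (xt[:, None] - xc[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)
    # Normalize so each treated unit's weights sum to one.
    # (With an extremely small bandwidth every weight can underflow to zero.)
    w /= w.sum(axis=1, keepdims=True)
    # Imputed untreated outcome: kernel-weighted average of control outcomes.
    y0_hat = w @ y[~treated]
    return float((y[treated] - y0_hat).mean())

xc = np.linspace(0.0, 1.0, 101)          # controls spread over [0, 1]
xt = np.array([0.7, 0.8, 0.9])           # treated units at high x
x = np.concatenate([xc, xt])
treated = np.concatenate([np.zeros(101, dtype=bool), np.ones(3, dtype=bool)])
y = x ** 2 + 1.0 * treated               # nonlinear baseline, effect = 1.0

att_narrow = kernel_match_att(x, y, treated, bandwidth=0.01)
att_wide = kernel_match_att(x, y, treated, bandwidth=10.0)
print(att_narrow, att_wide)
```

With the narrow bandwidth the estimate stays close to 1.0. With the wide bandwidth every control gets nearly equal weight, so low-x controls with low outcomes pull the counterfactual down and the estimate overshoots. The data here are noise-free, so the example shows only the bias; the variance reduction from wider bandwidths would only appear across repeated noisy samples.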
Question 3 True / False
Matching estimators can produce biased treatment effect estimates even when matching is done correctly, if there are unobserved variables that jointly determine both treatment assignment and outcomes.
T. True
F. False
Answer: True
This is the core limitation of all matching methods (and of all selection-on-observables strategies). Matching eliminates bias due to observed confounders by constructing comparable treatment and control groups on measured characteristics. But if an unobserved variable — such as motivation, ability, or social connections — influences both who selects into treatment and what outcomes they achieve, the treated and control units are still systematically different in that unobserved dimension. No amount of careful matching on observables can remove bias from unobservables.
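A small simulation can illustrate the point. Everything in it is an assumption made for the sketch: the data-generating process, the true effect of 1.0, the sample size, and the helper `match_att`. Selection into treatment depends on an observed covariate `x` and an unobserved one `u`. The "oracle" estimate matches on both, which is impossible in practice because `u` is not measured.

```python
import numpy as np

def match_att(Z, y, d):
    """Hypothetical helper: 1-nearest-neighbor ATT, with replacement, on columns of Z."""
    Zt, Zc = Z[d], Z[~d]
    d2 = ((Zt[:, None, :] - Zc[None, :, :]) ** 2).sum(axis=2)
    # Outcome of the closest control for each treated unit.
    y0_hat = y[~d][d2.argmin(axis=1)]
    return float((y[d] - y0_hat).mean())

rng = np.random.default_rng(0)
n, tau = 1500, 1.0
x = rng.normal(size=n)                        # observed covariate
u = rng.normal(size=n)                        # unobserved confounder, e.g. motivation
d = 0.5 * (x + u) + rng.normal(size=n) > 0    # selection depends on both x and u
y = x + u + tau * d + 0.1 * rng.normal(size=n)

att_observed_only = match_att(x[:, None], y, d)           # feasible: match on x
att_oracle = match_att(np.column_stack([x, u]), y, d)     # infeasible: match on x and u
print(att_observed_only, att_oracle)
```

Matching on `x` alone leaves treated units with systematically higher `u` than their matched controls, so the estimate lands well above the true effect of 1.0. The oracle estimate comes out close to 1.0.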
Question 4 True / False
Because matching estimators are nonparametric, they require no identifying assumptions about the treatment assignment process — matching automatically produces causal estimates regardless of how units came to be treated.
T. True
F. False
Answer: False
"Nonparametric" means that no functional form is assumed for the outcome equation; it does not mean freedom from identifying assumptions. Matching estimators still require the conditional independence assumption (CIA): given the observed covariates, treatment is as good as random. They also require the overlap (common support) condition: for every treated unit, comparable control units must exist. These are the same identifying assumptions that underlie propensity score matching and other selection-on-observables methods; what matching adds is robustness to misspecification of the outcome model, not freedom from the need for a credible identification strategy.
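The two assumptions can be written compactly. The notation is assumed for illustration: D is the treatment indicator, X the vector of matched covariates, and Y(1), Y(0) the potential outcomes with and without treatment. Under both conditions, the average treatment effect on the treated (ATT) is identified from observed data, as in the last line.

```latex
% Notation assumed for illustration; not taken from the quiz itself.
\begin{align*}
\text{CIA:} \quad & \bigl(Y(0),\, Y(1)\bigr) \perp\!\!\!\perp D \mid X \\
\text{Overlap:} \quad & 0 < \Pr(D = 1 \mid X = x) < 1 \quad \text{for all } x \\
\text{ATT:} \quad & E[\,Y(1) - Y(0) \mid D = 1\,]
  = E\bigl[\, E[Y \mid D = 1, X] - E[Y \mid D = 0, X] \;\big|\; D = 1 \,\bigr]
\end{align*}
```

Neither line restricts how the outcome depends on X, which is the sense in which matching is nonparametric while still resting on identifying assumptions.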
Question 5 Short Answer
What is the common support (overlap) requirement in matching, and what problem arises when it fails?
Model answer: Common support requires that for every treated unit there exist control units with similar covariate values; that is, the covariate distributions of the treated and control groups must overlap. When common support fails, some treated units have no credible comparisons in the control group. The estimator must then either extrapolate (comparing them to distant, dissimilar control units, which introduces bias), exclude those units (which changes the estimand: you are now estimating the treatment effect only for treated units with good matches, not for all treated units), or lean on the few control units that happen to be nearby, which produces very high variance. Failures of common support are often worst for treated units at the extremes of the covariate distribution, where treatment effects may be largest.
The overlap condition is the matching analogue of the positivity assumption in causal inference: at every covariate value under study, the probability of being treated must be strictly between 0 and 1, so that both treated and untreated units can be observed there. Without it, counterfactual comparisons are not empirically grounded: you are asking 'what would this type of unit look like untreated?' when no untreated units of that type exist in the data. This is why visually comparing the covariate distributions of the treated and control groups before estimating is a crucial diagnostic step.
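Alongside a visual comparison, a crude numerical check can flag treated units that fall outside the range of the control group on any covariate. The function name `off_support` and the min/max rule are assumptions for this sketch. A range check is only a necessary condition: a treated unit can sit inside the control range and still have no nearby controls.

```python
import numpy as np

def off_support(X, treated):
    """Hypothetical helper: flag treated units outside the control covariate range."""
    X = np.asarray(X, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    # Per-covariate minimum and maximum among control units.
    lo = X[~treated].min(axis=0)
    hi = X[~treated].max(axis=0)
    Xt = X[treated]
    # A treated unit is off support if any covariate lies outside [lo, hi].
    return ((Xt < lo) | (Xt > hi)).any(axis=1)

# Toy data: one covariate (say, age); controls span 20 to 40.
X = np.array([[20.0], [30.0], [40.0], [25.0], [35.0], [60.0]])
treated = np.array([0, 0, 0, 1, 1, 1], dtype=bool)

flags = off_support(X, treated)
print(flags.tolist())  # [False, False, True]
```

Dropping the flagged unit would restrict the estimate to treated units aged 25 and 35, which is exactly the change of estimand described in the model answer.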