An instrumental variable (IV) is a variable z that is correlated with the endogenous regressor x (relevance: Cov(z,x)≠0) but affects y only through x and not directly (exclusion restriction: Cov(z,u)=0). When both conditions hold, IV consistently estimates the causal effect of x on y even when OLS is biased. The IV estimator in the bivariate case is β̂ᵢᵥ = Cov(z,y)/Cov(z,x). Classic instruments include distance to college (for education), quarter of birth (for schooling), and rainfall (for agricultural income). The exclusion restriction is the unverifiable — and hence controversial — assumption; its plausibility must be argued on economic grounds.
Study the Angrist-Krueger (1991) quarter-of-birth instrument for education. Discuss why it is (arguably) excluded from the wage equation and what economic story justifies it.
You have already seen that OLS is biased when the regressor x is correlated with the error term u — the endogeneity problem. Instrumental variables offer a way out: find a third variable z that pushes x around but has no independent relationship with y. If you can isolate only the variation in x that z drives, that variation is clean of the omitted variable or reverse causation that corrupted OLS.
The two conditions a valid instrument must satisfy are relevance and the exclusion restriction. Relevance is straightforward: Cov(z,x) ≠ 0, meaning z is actually correlated with x. You can test this directly — regress x on z and check the F-statistic (a rule of thumb is F > 10 for a strong instrument). The exclusion restriction is harder: Cov(z,u) = 0, meaning z is uncorrelated with anything else that drives y. This assumption cannot be tested; it is an economic argument. For the quarter-of-birth instrument, you must argue that the quarter a person happened to be born in has no effect on their adult wages except by changing how long they stayed in school — and that is genuinely controversial.
The bivariate IV estimator is β̂ᵢᵥ = Cov(z,y)/Cov(z,x). The numerator captures how much y changes when z moves; the denominator scales that by how much x changes when z moves. The ratio recovers the causal effect of x on y. Intuitively, you are asking: "Of all the ways z moved x, how much did y move per unit of that x-movement?" The OLS analog, Cov(x,y)/Var(x), uses all variation in x — including the endogenous part. IV uses only the z-driven variation, which is exogenous by assumption.
A critical practical warning: weak instruments are dangerous. If Cov(z,x) is small, the denominator of the IV estimator is close to zero, which amplifies any small violation of the exclusion restriction into a huge bias. Weak instruments can produce estimates that are worse than OLS — biased in the same direction but with false precision. Always report the first-stage F-statistic when presenting IV results.
Finally, IV identifies a Local Average Treatment Effect (LATE) — the causal effect for the subpopulation whose behavior was actually changed by the instrument (the "compliers"). This is not the same as the average treatment effect for the full population. Quarter of birth only shifts education for people who would otherwise have dropped out before compulsory attendance laws required them to stay — not for everyone. Understanding what population your IV result applies to is as important as getting the mechanics right.