A researcher wants to use distance to the nearest college as an instrument for years of education in a wage regression. Which condition is most difficult to satisfy and cannot be verified statistically?
A. Relevance: distance to college must be correlated with years of education
B. Exclusion restriction: distance to college must not directly affect wages
C. The instrument must be binary (0/1)
D. The instrument must be uncorrelated with education
The exclusion restriction requires that the instrument affects the outcome only through the endogenous regressor: here, that distance to college affects wages only by changing education levels, not through any other channel (e.g., local labor markets). With a single instrument this cannot be tested statistically and must be justified on economic grounds. Relevance, by contrast, can be tested with an F-test on the instrument in the first-stage regression.
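As a rough illustration of why relevance is the checkable condition, the sketch below computes a first-stage F-statistic on simulated data. The variable names (`distance`, `ability`, `education`), the coefficients, and the helper `first_stage_f` are all hypothetical, and the F-statistic assumes homoskedastic errors. No comparable check exists for the exclusion restriction, because it concerns the unobserved error term.

```python
import numpy as np

def first_stage_f(z, x):
    """F-statistic for H0: slope = 0 in the regression of x on a constant and z.

    With a single instrument this equals the squared t-statistic of the slope
    (homoskedastic standard errors assumed).
    """
    n = len(z)
    Z = np.column_stack([np.ones(n), z])
    coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
    resid = x - Z @ coef
    sigma2 = resid @ resid / (n - 2)              # residual variance
    var_slope = sigma2 * np.linalg.inv(Z.T @ Z)[1, 1]
    return coef[1] ** 2 / var_slope

# Simulated data; every number here is an illustrative assumption.
rng = np.random.default_rng(0)
n = 2000
distance = rng.exponential(scale=1.0, size=n)     # instrument
ability = rng.normal(size=n)                      # unobserved confounder
education = 13.0 - 0.5 * distance + 0.8 * ability + rng.normal(size=n)

f_stat = first_stage_f(distance, education)
print(f"first-stage F = {f_stat:.1f}")
```

An F-statistic well above the conventional threshold of 10 would suggest the instrument is relevant in this simulated sample; it says nothing about whether distance affects wages through other channels.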
Question 2 True / False
If the exclusion restriction holds but your instrument is very weakly correlated with the endogenous regressor, your IV estimate will be more reliable than OLS.
T. True
F. False
Answer: False
A weak instrument (low first-stage F-statistic, conventionally below 10) produces IV estimates that are biased toward OLS in finite samples and have very wide confidence intervals, even when the exclusion restriction holds exactly. In addition, any small violation of the exclusion restriction is amplified, because the resulting bias is divided by a near-zero Cov(z,x). A weak instrument can therefore make IV less reliable than OLS, not more; a strong first stage is essential.
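The point can be seen in a small Monte Carlo sketch. The data-generating process, the first-stage coefficients (0.05 for "weak", 1.0 for "strong"), and the helper names below are assumptions chosen for illustration. The exclusion restriction holds by construction, since z enters y only through x. Medians and interquartile ranges are reported because the just-identified IV estimator can take extreme values when the first stage is weak.

```python
import numpy as np

def iv_ratio(z, x, y):
    """Bivariate IV estimate Cov(z,y) / Cov(z,x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

def ols_slope(x, y):
    """OLS slope Cov(x,y) / Var(x)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def simulate(pi, n=200, reps=2000, beta=1.0, seed=0):
    """Repeatedly draw samples and return arrays of IV and OLS estimates.

    pi is the (hypothetical) first-stage coefficient of x on z.
    """
    rng = np.random.default_rng(seed)
    iv, ols = np.empty(reps), np.empty(reps)
    for r in range(reps):
        z = rng.normal(size=n)
        u = rng.normal(size=n)                    # unobserved confounder
        x = pi * z + u + rng.normal(size=n)
        y = beta * x + u + rng.normal(size=n)     # z affects y only through x
        iv[r] = iv_ratio(z, x, y)
        ols[r] = ols_slope(x, y)
    return iv, ols

def iqr(a):
    """Interquartile range, a spread measure robust to extreme draws."""
    q25, q75 = np.percentile(a, [25, 75])
    return q75 - q25

iv_weak, ols_weak = simulate(pi=0.05)
iv_strong, ols_strong = simulate(pi=1.0)

for name, est in [("IV weak", iv_weak), ("OLS weak", ols_weak),
                  ("IV strong", iv_strong), ("OLS strong", ols_strong)]:
    print(f"{name:10s} median = {np.median(est):6.3f}   IQR = {iqr(est):6.3f}")
```

With the true coefficient set to 1.0, the strong-instrument IV estimates should cluster near 1.0, while the weak-instrument IV estimates should be far more dispersed than either OLS or strong-instrument IV.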
Question 3 Short Answer
In the bivariate IV formula β̂ᵢᵥ = Cov(z,y)/Cov(z,x), what is the intuition for why dividing by Cov(z,x) is necessary?
Model answer: Cov(z,y) captures the total effect of z on y, but since z only affects y through x, we need to scale by how much z moves x (Cov(z,x)) to isolate the effect of x on y. Dividing by Cov(z,x) essentially asks: for each unit that z shifts x, how much does y change?
The IV estimator uses z as an external shifter of x. Cov(z,y) picks up the z-induced variation in y, and Cov(z,x) measures how strongly z shifts x. Their ratio recovers the causal effect of x on y by scaling the reduced-form effect by the first-stage relationship — analogous to dividing out the channel strength.
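The scaling argument can be checked numerically. The sketch below uses simulated data with an assumed true effect of 2.0 and hypothetical coefficients throughout. It computes the reduced-form slope (y on z) and the first-stage slope (x on z), and confirms that their ratio is the same number as Cov(z,y)/Cov(z,x).

```python
import numpy as np

# Simulated data; all coefficients are illustrative assumptions.
rng = np.random.default_rng(1)
n = 50_000
beta = 2.0                                        # true effect of x on y
z = rng.normal(size=n)                            # instrument
u = rng.normal(size=n)                            # unobserved confounder
x = 0.7 * z + u + rng.normal(size=n)              # first stage: z shifts x
y = beta * x + 1.5 * u + rng.normal(size=n)       # z affects y only via x

cov_zy = np.cov(z, y)[0, 1]
cov_zx = np.cov(z, x)[0, 1]
var_z = np.var(z, ddof=1)

reduced_form = cov_zy / var_z                     # slope of y on z
first_stage = cov_zx / var_z                      # slope of x on z

beta_iv = cov_zy / cov_zx                         # bivariate IV formula
beta_wald = reduced_form / first_stage            # reduced form / first stage
beta_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1) # contaminated by u

print(f"reduced form = {reduced_form:.3f}, first stage = {first_stage:.3f}")
print(f"IV = {beta_iv:.3f}, ratio of slopes = {beta_wald:.3f}, OLS = {beta_ols:.3f}")
```

Var(z) cancels in the ratio of the two slopes, which is why the IV formula can be written directly in terms of covariances. In this setup OLS should overstate the effect because x and y share the confounder u, while the IV ratio should land close to 2.0.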