An economist collects data on prices and quantities from competitive markets and runs OLS to estimate a demand curve. Why will this procedure produce a biased estimate?
A. Price and quantity are simultaneously determined by both supply and demand, so market price is correlated with the demand equation's error term
B. The sample of market observations is too small for OLS to be reliable in competitive markets
C. Quantity demanded is always measured with classical error in market data, attenuating the coefficient
D. The true demand relationship is nonlinear, making OLS the wrong estimator regardless of endogeneity
This is the simultaneity problem: in any market, price and quantity are jointly determined by the intersection of supply and demand. The price you observe is not set exogenously; it reflects supply and demand conditions at the same time. When demand shifts (a shock in the demand equation's error term), the equilibrium price changes too, so Cov(price, demand error) ≠ 0. OLS therefore recovers neither the demand curve nor the supply curve; it fits a line through a cloud of equilibrium points driven by shocks to both equations. The standard solution is an instrument that shifts one curve while leaving the other unchanged: a variable that moves supply but not demand traces out the demand curve.
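A small simulation can make the simultaneity argument concrete. The sketch below assumes a linear demand curve with slope -1, a linear supply curve with slope +1, and an observed supply shifter z; these equations, the parameter values, and the name `simulate_market` are illustrative assumptions, not part of the question. Under these assumptions the OLS slope of quantity on price should settle near -1/3 rather than -1, while the simple IV ratio using z should land close to the true demand slope.

```python
import random


def cov(x, y):
    """Sample covariance of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)


def simulate_market(n=20000, seed=0):
    """Return (ols_slope, iv_slope) for a simulated competitive market.

    Assumed structural model (illustrative values only):
      demand: q = 10 - 1.0 * p + u_d
      supply: q =  2 + 1.0 * p + z + u_s   (z is an observed supply shifter)
    """
    rng = random.Random(seed)
    prices, quantities, shifters = [], [], []
    for _ in range(n):
        u_d, u_s, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        # The equilibrium price equates demand and supply, so it depends on u_d.
        p = (10 - 2 + u_d - u_s - z) / 2.0
        q = 10 - 1.0 * p + u_d
        prices.append(p)
        quantities.append(q)
        shifters.append(z)
    # OLS slope of q on p: Cov(p, q) / Var(p).
    ols_slope = cov(prices, quantities) / cov(prices, prices)
    # Simple IV (Wald) estimator using the supply shifter: Cov(z, q) / Cov(z, p).
    iv_slope = cov(shifters, quantities) / cov(shifters, prices)
    return ols_slope, iv_slope


ols_slope, iv_slope = simulate_market()
print(f"true demand slope: -1.00, OLS: {ols_slope:.2f}, IV: {iv_slope:.2f}")
```

The instrument works here only because z enters the supply equation and is excluded from the demand equation by construction; in real data that exclusion has to be argued, not assumed.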
Question 2 Multiple Choice
A researcher estimates the effect of true worker productivity on wages, but productivity is measured by noisy supervisor ratings (X = X* + v, where v is random noise). Compared to the true effect, the OLS estimate will be:
A. Biased toward zero — the noise in X attenuates the estimated coefficient below the true value
B. Biased upward — random noise in X inflates the apparent relationship with Y
C. Unbiased — random noise in X averages to zero in large samples, leaving the estimate consistent
D. Biased toward zero, but only if the noise v is correlated with true productivity X*
Attenuation bias is the systematic result of classical measurement error in a regressor. Substituting X* = X − v into the true model moves −βv into the error term, so the mismeasured X is correlated with the composite error (the covariance is −β·Var(v), opposite in sign to β), violating the OLS zero-conditional-mean assumption. The estimated coefficient shrinks toward zero: in large samples it equals the true coefficient multiplied by the reliability ratio Var(X*) / (Var(X*) + Var(v)), the fraction of X's variance that is true signal. This does not average away in larger samples; it is a consistency failure. Option C is wrong precisely because this endogeneity makes OLS inconsistent. Option D is wrong because attenuation occurs even when v is independent of X*.
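The reliability-ratio result can be checked with a short simulation. This is a minimal sketch under assumed values: true productivity X* has unit variance, the true coefficient is 2, and the rating noise has standard deviation 1, so the reliability ratio is 0.5. The function name `simulate_attenuation` and all parameter values are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(n=20000, noise_sd=1.0, beta=2.0, seed=0):
    """Return (slope using X*, slope using noisy X, predicted attenuated slope)."""
    rng = random.Random(seed)
    x_star = [rng.gauss(0, 1) for _ in range(n)]           # true productivity
    x_obs = [xs + rng.gauss(0, noise_sd) for xs in x_star]  # X = X* + v
    y = [beta * xs + rng.gauss(0, 1) for xs in x_star]      # wages depend on X*
    # Reliability ratio Var(X*) / (Var(X*) + Var(v)), with Var(X*) = 1 here.
    reliability = 1.0 / (1.0 + noise_sd ** 2)
    return ols_slope(x_star, y), ols_slope(x_obs, y), beta * reliability


b_true, b_noisy, b_predicted = simulate_attenuation()
print(f"slope with X*: {b_true:.2f}, with noisy X: {b_noisy:.2f}, "
      f"predicted: {b_predicted:.2f}")
```

Raising `noise_sd` lowers the reliability ratio and pulls the noisy-X slope further toward zero, while v remains independent of X* throughout.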
Question 3 True / False
Endogeneity is a consistency problem in OLS — the bias does not shrink as the sample size grows to infinity.
T. True
F. False
Answer: True
This is what distinguishes endogeneity from mere imprecision. If OLS is inconsistent (as it is when Cov(X, u) ≠ 0), the estimator converges to the wrong value as n → ∞. More data does not help — it just makes you more precisely wrong. This is why endogeneity is treated as a fundamental identification problem requiring a different estimator (IV, fixed effects, RD, DiD), not as a sample-size problem that can be solved by collecting more observations.
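To see that more data does not help, one can rerun the same endogenous regression at growing sample sizes. The sketch below assumes a model in which the regressor shares a component with the error (y = 1.0·x + u with x = w + u), so OLS should converge to 1.5 instead of the true 1.0; the model and the name `endogenous_slope` are illustrative assumptions.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def endogenous_slope(n, seed=0):
    """OLS slope when Cov(x, u) != 0.

    Assumed model: y = 1.0 * x + u with x = w + u, so Cov(x, u) = Var(u) = 1
    and Var(x) = 2. The probability limit is 1.0 + 1/2 = 1.5.
    """
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        w, u = rng.gauss(0, 1), rng.gauss(0, 1)
        xi = w + u
        x.append(xi)
        y.append(1.0 * xi + u)
    return ols_slope(x, y)


# The estimate tightens as n grows, but around 1.5 rather than the true 1.0.
for n in (100, 1000, 10000, 100000):
    print(n, round(endogenous_slope(n), 3))
```

The sampling noise shrinks with n, yet the gap between the estimate and the true coefficient stays roughly 0.5, which is the sense in which the estimator is precisely wrong.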
Question 4 True / False
Measurement error in the dependent variable Y causes attenuation bias in OLS estimates, just as measurement error in a regressor X does.
T. True
F. False
Answer: False
This is a critical distinction stated in the common misconceptions. Classical measurement error in Y (the dependent variable) simply adds noise to the outcome: it inflates the error term's variance and reduces precision, but it does not create correlation between the regressors and the error. OLS remains unbiased and consistent. Only measurement error in a regressor X violates the OLS assumption E(u|X) = 0, because the noise from X ends up in the error term and is necessarily correlated with the mismeasured X. The asymmetry — error in Y is harmless, error in X is not — surprises many students.
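The asymmetry can be illustrated by adding noise to Y only and looking at the slope across many repeated samples. This is a sketch with assumed values (true slope 2, samples of 200 observations, 300 replications); the helper `slope_distribution` is hypothetical. If the argument above holds, the average slope should stay near 2 with or without noise in Y, while the spread of the estimates widens.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slope_distribution(noise_sd, reps=300, n=200, beta=2.0, seed=0):
    """Mean and standard deviation of OLS slopes when Y carries extra noise."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x, y_obs = [], []
        for _ in range(n):
            xi = rng.gauss(0, 1)
            y_true = beta * xi + rng.gauss(0, 1)
            # Classical measurement error in Y: independent of x and of u.
            y_obs.append(y_true + rng.gauss(0, noise_sd))
            x.append(xi)
        slopes.append(ols_slope(x, y_obs))
    return statistics.mean(slopes), statistics.stdev(slopes)


mean_clean, sd_clean = slope_distribution(noise_sd=0.0)
mean_noisy, sd_noisy = slope_distribution(noise_sd=2.0)
print(f"clean Y: mean {mean_clean:.2f}, sd {sd_clean:.3f}")
print(f"noisy Y: mean {mean_noisy:.2f}, sd {sd_noisy:.3f}")
```

Because the noise is added to the outcome, it behaves like extra error variance: the estimates are less precise but still centered on the true slope.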
Question 5 Short Answer
Explain, using the concept of correlation with the error term, why omitting a relevant variable causes endogeneity — and how you can predict the direction of the resulting bias.
Model answer: When a relevant variable Z is omitted from a regression, it becomes part of the error term u. If Z is also correlated with an included regressor X, then X and u are correlated, which violates E(u|X) = 0 and causes endogeneity. OLS cannot distinguish the effect of X from the effect of Z, so it attributes to X some of the variation in Y that actually comes from Z. The direction of bias follows a simple rule: sign(bias) = sign(Z's effect on Y) × sign(correlation of Z with X). The size of the bias is Z's effect on Y multiplied by the slope from regressing Z on X, which has the same sign as their correlation. If Z raises Y and is positively correlated with X, the bias is upward (X's coefficient is overstated). If Z raises Y but is negatively correlated with X, the bias is downward. This rule lets researchers predict the direction in which OLS will err before collecting data.
The omitted variable bias formula — sign(bias) = sign(β_Z) × sign(Corr(Z, X)) — is one of the most useful tools in applied econometrics. It transforms endogeneity from an abstract concern into a concrete, directional prediction. In the education-wages example: ability raises wages (positive β) and is positively correlated with education, so the education coefficient is biased upward. This prediction can be tested against IV estimates and can guide the choice of instruments.
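The sign rule can be checked numerically by omitting Z from a regression in which its effect and its correlation with X are known. The sketch below assumes β_X = 1, β_Z = 2, and X = ±0.5·Z + noise; these values and the name `short_regression_slope` are illustrative. With these numbers the slope from regressing Z on X is ±0.5 / 1.25 = ±0.4, so the short regression should return about 1.8 when the correlation is positive and about 0.2 when it is negative.

```python
import random


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def short_regression_slope(corr_sign, n=20000, beta_x=1.0, beta_z=2.0, seed=0):
    """Slope on X when the relevant variable Z is left out of the regression.

    corr_sign is +1 or -1 and sets the sign of Corr(Z, X).
    """
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)                         # omitted variable (e.g. ability)
        xi = corr_sign * 0.5 * z + rng.gauss(0, 1)  # included regressor
        y.append(beta_x * xi + beta_z * z + rng.gauss(0, 1))
        x.append(xi)
    return ols_slope(x, y)  # Z is deliberately excluded


print("Corr(Z, X) > 0:", round(short_regression_slope(+1), 2))  # biased upward
print("Corr(Z, X) < 0:", round(short_regression_slope(-1), 2))  # biased downward
```

Flipping the sign of `beta_z` instead of `corr_sign` reverses the direction of the bias in the same way, which is the product rule in the formula.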