A researcher estimates the effect of education on wages with a bivariate regression and gets β̂₁ = 0.12. She then adds years of experience as a control and gets β̂₁ = 0.09. Which interpretation is correct?
A. The bivariate estimate is wrong; 0.09 is the true effect of education.
B. The multiple regression estimate represents the effect of education holding experience constant, while the bivariate estimate does not.
C. Adding experience variables always reduces coefficients, so this is expected and uninformative.
D. The two estimates cannot be compared because they are from different models.
Answer: B
The coefficient on education in the multiple regression is the partial effect: how wages change with one more year of education when experience is held fixed. The bivariate estimate also absorbs the part of experience's effect on wages that moves together with education, so it mixes the two. Neither is unconditionally 'wrong'; they answer different questions.
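The gap between the two estimates can be illustrated with simulated data. The sketch below is not the researcher's data: the data-generating process, the variable names (educ, exper, wage) and every coefficient are assumptions, chosen only so that the two slopes land near the 0.12 and 0.09 in the question. It also checks the in-sample identity that the bivariate slope equals the multiple-regression slope plus the experience coefficient times the slope from regressing experience on education.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
n = 5_000

# Hypothetical data-generating process; every number here is an assumption.
educ = rng.normal(13.0, 2.5, n)
exper = 2.0 + 0.5 * educ + rng.normal(0.0, 4.0, n)  # experience correlated with education
wage = 0.5 + 0.09 * educ + 0.06 * exper + rng.normal(0.0, 0.3, n)


def ols(X, y):
    """Return least-squares coefficients for y regressed on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


ones = np.ones(n)
b_short = ols(np.column_stack([ones, educ]), wage)         # intercept, educ
b_long = ols(np.column_stack([ones, educ, exper]), wage)   # intercept, educ, exper
delta = ols(np.column_stack([ones, educ]), exper)          # auxiliary: exper on educ

# Part of the bivariate slope that is really experience's effect, picked up
# through the correlation between experience and education.
picked_up = b_long[2] * delta[1]

print(f"bivariate slope on educ:     {b_short[1]:.4f}")
print(f"multiple-regression slope:   {b_long[1]:.4f}")
print(f"long slope + picked-up part: {b_long[1] + picked_up:.4f}")
```

The last two printed lines show why the estimates differ without either being a computational error: the bivariate slope is the partial effect plus whatever experience contributes through its association with education.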
Question 2 True / False
Adding more control variables to a multiple regression model typically improves the accuracy of coefficient estimates.
True
False
Answer: False
This is one of the most important misconceptions about multiple regression. Including irrelevant variables reduces efficiency (increases standard errors) without reducing bias. Worse, including endogenous controls (variables that are themselves caused by the regressors or the outcome) can introduce new bias and make estimates less reliable. Adding more controls is not a free lunch.
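Both costs can be seen in a small simulation. The following is a sketch under assumed conditions: the true effect of x on y is set to 1.0, the names irrelevant and bad_control are hypothetical, and the standard errors are the classical homoskedastic ones. The irrelevant variable is correlated with x but has no effect on y; the bad control is caused by both x and y.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the run is reproducible
n = 20_000

# Hypothetical setup: the true effect of x on y is 1.0.
x = rng.normal(size=n)
y = 1.0 * x + rng.normal(size=n)
irrelevant = 0.9 * x + rng.normal(scale=0.3, size=n)  # correlated with x, no effect on y
bad_control = x + y + rng.normal(size=n)              # caused by both x and y


def ols_with_se(X, y):
    """OLS coefficients and classical (homoskedastic) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    return coef, np.sqrt(sigma2 * np.diag(XtX_inv))


ones = np.ones(n)
b_base, se_base = ols_with_se(np.column_stack([ones, x]), y)
b_irr, se_irr = ols_with_se(np.column_stack([ones, x, irrelevant]), y)
b_bad, se_bad = ols_with_se(np.column_stack([ones, x, bad_control]), y)

print(f"no controls:        slope {b_base[1]:.3f}, se {se_base[1]:.4f}")
print(f"irrelevant control: slope {b_irr[1]:.3f}, se {se_irr[1]:.4f}")
print(f"bad control:        slope {b_bad[1]:.3f}, se {se_bad[1]:.4f}")
```

In this setup the irrelevant control leaves the slope on x centred on 1.0 but widens its standard error, while the bad control pulls the slope far away from 1.0 even though the sample is large.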
Question 3 Short Answer
What condition is required for the matrix estimator β̂ = (X'X)⁻¹X'y to exist, and what economic situation would violate it?
Model answer: The matrix (X'X) must be invertible, which requires no perfect multicollinearity — no regressor can be an exact linear combination of others. This is violated if, for example, you include both 'income in dollars' and 'income in thousands of dollars' as separate regressors, since one is exactly 1000 times the other.
Perfect multicollinearity makes the normal equations singular: the system has infinitely many solutions because the collinear variables cannot be separately identified. Near-perfect multicollinearity (high but not exact correlation) is the more common practical problem — it does not prevent estimation but inflates standard errors severely.
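The distinction between exact and near collinearity can be checked numerically. The sketch below uses simulated, hypothetical data: the income example from the model answer for the exact case, and two almost identical regressors (x1, x2, names assumed) for the near case. It relies on the rank of X rather than on a failed matrix inversion, because floating-point rounding means that inverting a singular X'X does not reliably raise an error.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed so the run is reproducible
n = 500

# Exact collinearity: the same income measured in two units.
income_dollars = rng.normal(50_000.0, 15_000.0, n)
income_thousands = income_dollars / 1000.0
X_exact = np.column_stack([np.ones(n), income_dollars, income_thousands])
rank_exact = np.linalg.matrix_rank(X_exact)  # fewer than 3 columns' worth: X'X is singular

# Near collinearity: two hypothetical regressors that are almost, but not exactly, equal.
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
X_near = np.column_stack([np.ones(n), x1, x2])
rank_near = np.linalg.matrix_rank(X_near)    # full rank, so the estimator exists

# Variance inflation factor for x1: 1 / (1 - R^2), with R^2 taken from
# regressing x1 on the remaining columns.
others = np.column_stack([np.ones(n), x2])
fitted = others @ np.linalg.lstsq(others, x1, rcond=None)[0]
r2 = 1.0 - np.sum((x1 - fitted) ** 2) / np.sum((x1 - x1.mean()) ** 2)
vif_x1 = 1.0 / (1.0 - r2)

print("rank with exact collinearity:", rank_exact, "of", X_exact.shape[1], "columns")
print("rank with near collinearity: ", rank_near, "of", X_near.shape[1], "columns")
print(f"variance inflation factor for x1: {vif_x1:.0f}")
```

The standard error on x1 is inflated by the square root of the variance inflation factor relative to the case where x1 is uncorrelated with the other regressors, which is the sense in which near collinearity permits estimation but makes it imprecise.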