The Gauss-Markov theorem states that OLS is the Best Linear Unbiased Estimator (BLUE) when six classical assumptions hold: linearity in parameters, random sampling, no perfect multicollinearity, zero conditional mean of errors (E[u|x] = 0), homoskedasticity, and no serial correlation. The most critical assumption is E[u|x] = 0, which implies that every determinant of y left out of the model is uncorrelated with x. When this assumption fails because of omitted variables, measurement error in x, or simultaneity, OLS estimates are biased and generally inconsistent. The first four assumptions deliver unbiasedness; the last two, homoskedasticity and no serial correlation, govern efficiency rather than unbiasedness.
Work through examples of each assumption violation. For instance, simulate data with heteroskedastic errors, then check that OLS still estimates the coefficients correctly on average (it remains unbiased) while the usual standard errors are wrong. This separates problems of bias from problems of efficiency and inference.
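One way to run the suggested exercise is a small Monte Carlo experiment. The sketch below is illustrative only: the true parameters, the exponential distribution for x, and the choice of an error standard deviation equal to x are all assumptions made for the demonstration, not part of the theorem.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is reproducible
n, reps = 200, 2000              # sample size and number of simulated samples
beta0, beta1 = 1.0, 2.0          # assumed "true" parameters (illustrative)

slopes = np.empty(reps)
usual_se = np.empty(reps)

for r in range(reps):
    x = rng.exponential(1.0, size=n)
    u = x * rng.normal(size=n)           # E[u|x] = 0, but sd(u|x) = x
    y = beta0 + beta1 * x + u

    # Bivariate OLS by the textbook formulas
    xd = x - x.mean()
    sst_x = np.sum(xd**2)
    b1 = np.sum(xd * y) / sst_x
    b0 = y.mean() - b1 * x.mean()

    # Usual (homoskedasticity-based) standard error of the slope
    resid = y - b0 - b1 * x
    sigma2_hat = np.sum(resid**2) / (n - 2)

    slopes[r] = b1
    usual_se[r] = np.sqrt(sigma2_hat / sst_x)

mean_slope = slopes.mean()          # should sit near beta1 (unbiasedness)
sampling_sd = slopes.std(ddof=1)    # actual spread of the slope estimates
avg_usual_se = usual_se.mean()      # what the usual formula reports

print(f"mean slope estimate:       {mean_slope:.3f} (true value {beta1})")
print(f"sampling sd of the slope:  {sampling_sd:.3f}")
print(f"average usual std. error:  {avg_usual_se:.3f}")
```

If the average slope is close to the true value while the usual standard error differs clearly from the sampling standard deviation, the simulation has separated the two issues: the coefficient is fine, the reported precision is not.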
When you learned bivariate regression, you found a formula that fits a line through data. The Gauss-Markov theorem tells you when that line can be trusted as more than a description of the sample — specifically, when OLS is the Best Linear Unbiased Estimator (BLUE) for the population parameters. Understanding the theorem means understanding which assumptions are doing what.
The six classical assumptions can be grouped by what they protect. The first three (linearity in parameters, random sampling, and no perfect multicollinearity) are structural requirements. If the model is nonlinear in parameters, the OLS formula does not apply; if two regressors are perfectly collinear, the normal equations have no unique solution; and without random sampling, the sample need not represent the population the parameters describe. Linearity and the absence of perfect collinearity can usually be secured by how the model is specified, whereas random sampling depends on how the data were collected.
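The collinearity point can be checked numerically. In this sketch the second regressor is constructed as an exact multiple of the first (the factor 3 and all other numbers are arbitrary choices for illustration), so the design matrix loses a rank and two different coefficient vectors produce identical fitted values.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = 3.0 * x1                          # exact linear function of x1
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

# Three columns, but only two are linearly independent
rank = np.linalg.matrix_rank(X)

# lstsq still returns an answer: the minimum-norm least-squares solution
b_min_norm, *_ = np.linalg.lstsq(X, y, rcond=None)

# Moving along the direction (0, 3, -1) leaves X @ b unchanged,
# because 3 * x1 - x2 = 0 for every observation
b_other = b_min_norm + np.array([0.0, 3.0, -1.0])
same_fit = np.allclose(X @ b_min_norm, X @ b_other)

print("columns:", X.shape[1], "rank:", rank)
print("two different coefficient vectors, same fitted values:", same_fit)
```

Because infinitely many coefficient vectors minimize the sum of squared residuals here, the data cannot say how the effect should be split between x1 and x2.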
The fourth assumption, E[u|x] = 0, is the most critical and the most likely to fail. It says that the expected value of the error term, conditional on x, is zero: knowing x tells you nothing about the average size of the unobserved factors in u. This is the exogeneity condition. It fails whenever an omitted variable is correlated with x (omitted variable bias), when x is measured with error (which, under classical measurement error, produces attenuation bias toward zero), or when x and y jointly determine each other (simultaneity). In each of these cases x is correlated with u, so the coefficient estimates are biased and inconsistent: no amount of additional data will fix the problem.
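The claim that more data does not help can be illustrated with an omitted-variable simulation. This is a minimal sketch under assumed values: the variable z, the coefficients, and the 0.8 loading of x on z are hypothetical choices. The regression of y on x alone leaves z in the error term, and the slope settles on the wrong number as n grows.

```python
import numpy as np

rng = np.random.default_rng(2)
beta1, beta2 = 2.0, 1.5      # assumed true effects of x and of the omitted z
loading = 0.8                # how strongly x depends on z (illustrative)

def short_regression_slope(n):
    """Slope from regressing y on x alone, leaving z in the error term."""
    z = rng.normal(size=n)
    x = loading * z + rng.normal(size=n)   # x is correlated with z
    y = 1.0 + beta1 * x + beta2 * z + rng.normal(size=n)
    xd = x - x.mean()
    return np.sum(xd * y) / np.sum(xd**2)

# Value the short-regression slope converges to:
# beta1 + beta2 * Cov(x, z) / Var(x)
plim = beta1 + beta2 * loading / (loading**2 + 1.0)

estimates = {n: short_regression_slope(n) for n in (100, 10_000, 1_000_000)}

for n, b in estimates.items():
    print(f"n = {n:>9,d}   slope = {b:.4f}")
print(f"true beta1 = {beta1}, limit of the short regression = {plim:.4f}")
```

Larger samples make the estimate more precise, but precise around the limit printed on the last line rather than around the true beta1.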
The fifth and sixth assumptions, homoskedasticity (constant error variance) and no serial correlation, govern efficiency, not unbiasedness. When these fail, OLS remains unbiased and consistent, but it is no longer the minimum-variance estimator among linear unbiased estimators, and the usual standard-error formula is no longer valid. In practice, heteroskedasticity is extremely common (error variance often grows with income, firm size, or other scale variables), and the standard remedy for inference is straightforward: use heteroskedasticity-robust standard errors. The coefficients themselves are kept; only the standard errors are corrected, so inference becomes valid even though OLS is still not the efficient estimator.
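To make "only the standard errors are corrected" concrete, the sketch below computes the usual and the robust standard errors from the same coefficient vector. It uses the White sandwich estimator with the HC1 degrees-of-freedom scaling, which is one common variant and an assumption here, since the text does not name a specific one. The simulated data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x = rng.exponential(1.0, size=n)
y = 1.0 + 2.0 * x + x * rng.normal(size=n)   # error sd grows with x

X = np.column_stack([np.ones(n), x])
k = X.shape[1]

# OLS coefficients: identical under either standard-error formula
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# Usual formula: s^2 (X'X)^{-1}, valid only under homoskedasticity
s2 = resid @ resid / (n - k)
se_usual = np.sqrt(np.diag(s2 * XtX_inv))

# Sandwich formula: (X'X)^{-1} [X' diag(e_i^2) X] (X'X)^{-1},
# scaled by n / (n - k) (the HC1 variant)
meat = (X * resid[:, None] ** 2).T @ X
V_hc1 = n / (n - k) * XtX_inv @ meat @ XtX_inv
se_robust = np.sqrt(np.diag(V_hc1))

print("coefficients:      ", beta_hat)
print("usual std. errors: ", se_usual)
print("robust std. errors:", se_robust)
```

Both sets of standard errors attach to the same beta_hat. Only the estimate of its sampling variability changes.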
A common confusion arises from the word "linearity" in the first assumption. The linearity requirement applies to the parameters β — the model must be linear in β — not to the functional form of the regressors. A model with x, x², and log(x) on the right-hand side is perfectly linear in parameters and satisfies the assumption. This flexibility means OLS can handle a wide range of nonlinear relationships between y and x, as long as the model remains linear in the unknowns you are estimating.
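A short sketch of this point, with assumed illustrative coefficients: the columns of the design matrix are nonlinear transformations of x, yet the fit is an ordinary linear least-squares problem because y is a linear combination of those columns. Here x is drawn from a positive range so that log(x) is defined.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
x = rng.uniform(1.0, 10.0, size=n)             # x > 0 so log(x) exists
true_beta = np.array([1.0, 0.5, -0.03, 2.0])   # illustrative values only

# Regressors are nonlinear in x, but y is linear in the coefficients
X = np.column_stack([np.ones(n), x, x**2, np.log(x)])
y = X @ true_beta + rng.normal(0.0, 0.5, size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Distance between the fitted curve and the true curve over the sample
fit_rmse = np.sqrt(np.mean((X @ beta_hat - X @ true_beta) ** 2))

print("estimated coefficients:", beta_hat)
print("RMS gap between fitted and true curve:", fit_rmse)
```

By contrast, a model such as y = β0 + x^β1 + u could not be written as a matrix of known columns times an unknown coefficient vector, which is what the linearity assumption rules out.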