Questions: Serial Correlation (Autocorrelation) in Regression
5 questions to test your understanding
Question 1 Multiple Choice
A researcher runs a time-series regression and finds that residuals display long runs of positive values followed by long runs of negative values. What is the primary statistical consequence?
A. The OLS coefficient estimates are biased: they systematically over- or underestimate the true relationship
B. The coefficient estimates remain unbiased and consistent, but OLS standard errors understate true uncertainty, inflating t-statistics and making results appear more significant than they are
C. The regression cannot be estimated at all because the Gauss-Markov theorem is violated
D. Only the intercept estimate is affected; slope coefficients are unaffected by serial correlation in errors
Answer: B
Serial correlation violates the Gauss-Markov assumption of uncorrelated errors, but as long as the regressors are exogenous it does NOT bias the OLS coefficient estimates: they remain unbiased and consistent, although OLS is no longer the most efficient linear unbiased estimator. The damage is to inference. When consecutive errors are positively correlated, observations carry redundant information, so the effective sample size for estimating uncertainty is smaller than the nominal sample size. The conventional OLS variance formula treats the errors as uncorrelated and therefore underestimates the true variance of the estimator. The resulting standard errors are too small, t-statistics too large, and confidence intervals too narrow: systematic overconfidence in results.
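The claim can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: the AR(1) design, the parameter values (ρ = 0.8 for both regressor and errors, n = 100, 500 replications), and the helper names `ar1`, `ols_slope_and_se`, and `simulate` are assumptions made for this example, not part of the question.

```python
import random
import statistics


def ar1(n, rho, rng):
    """Simulate a stationary AR(1) series with unit-variance innovations."""
    series = [rng.gauss(0, 1) / (1 - rho ** 2) ** 0.5]  # stationary start
    for _ in range(n - 1):
        series.append(rho * series[-1] + rng.gauss(0, 1))
    return series


def ols_slope_and_se(x, y):
    """Slope of y on x (with intercept) and its conventional standard error."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5  # formula assumes uncorrelated errors
    return slope, se


def simulate(reps=500, n=100, rho=0.8, true_slope=1.0, seed=0):
    """Return (mean slope, empirical SD of slopes, mean reported OLS SE)."""
    rng = random.Random(seed)
    slopes, reported = [], []
    for _ in range(reps):
        x = ar1(n, rho, rng)  # persistent regressor
        u = ar1(n, rho, rng)  # serially correlated errors, independent of x
        y = [true_slope * xi + ui for xi, ui in zip(x, u)]
        slope, se = ols_slope_and_se(x, y)
        slopes.append(slope)
        reported.append(se)
    return (statistics.mean(slopes),
            statistics.stdev(slopes),
            statistics.mean(reported))


mean_slope, empirical_sd, mean_reported_se = simulate()
print(f"mean slope estimate:       {mean_slope:.3f}")
print(f"empirical SD of estimates: {empirical_sd:.3f}")
print(f"mean reported OLS SE:      {mean_reported_se:.3f}")
```

In this design the average slope estimate should sit close to the true value of 1.0, while the spread of the estimates across replications should be noticeably larger than the standard error OLS reports on average. That gap is the overconfidence described above.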
Question 2 Multiple Choice
A researcher reports a Durbin-Watson statistic of 0.4 for their time-series regression. What does this indicate, and what is the appropriate remedy?
A. DW ≈ 0.4 indicates strong negative autocorrelation; the remedy is to add more lags to the model
B. DW ≈ 0.4 indicates strong positive autocorrelation (since DW ≈ 2(1 − ρ), ρ ≈ 0.8); the remedy is HAC (Newey-West) standard errors or GLS with an AR(1) error structure
C. DW ≈ 0.4 is in the inconclusive region; no action is needed until it falls below 0
D. DW ≈ 0.4 is close enough to zero to indicate heteroskedasticity rather than autocorrelation
Answer: B
The Durbin-Watson statistic is approximately DW ≈ 2(1 − ρ̂), where ρ̂ is the first-order autocorrelation of the residuals. DW near 2 means ρ̂ ≈ 0 (no first-order autocorrelation); DW near 0 means ρ̂ ≈ 1 (strong positive autocorrelation); DW near 4 means ρ̂ ≈ −1 (strong negative autocorrelation). DW = 0.4 implies ρ̂ ≈ 1 − 0.4/2 = 0.8, which is strong positive serial correlation. A standard remedy is Newey-West HAC standard errors, which are asymptotically valid under autocorrelation of unknown form and so support inference without requiring a specific error model. GLS with an AR(1) error structure is the alternative when that error model is credible.
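The arithmetic linking DW and ρ̂ can be sketched in a few lines. The function names and the simulated "residuals" (an AR(1) series with ρ = 0.8 standing in for regression residuals) are assumptions for illustration only.

```python
import random


def durbin_watson(resid):
    """Sum of squared successive differences over the sum of squares."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    return num / sum(e * e for e in resid)


def lag1_autocorr(resid):
    """First-order autocorrelation of residuals (assumed to have mean zero)."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    return num / sum(e * e for e in resid)


def implied_rho(dw):
    """Invert the approximation DW = 2 * (1 - rho)."""
    return 1 - dw / 2


# Hypothetical residuals: an AR(1) series with rho = 0.8.
rng = random.Random(1)
resid = [rng.gauss(0, 1)]
for _ in range(1999):
    resid.append(0.8 * resid[-1] + rng.gauss(0, 1))

dw = durbin_watson(resid)
rho_hat = lag1_autocorr(resid)
print(f"DW = {dw:.3f}, rho_hat = {rho_hat:.3f}, 2(1 - rho_hat) = {2 * (1 - rho_hat):.3f}")
print(f"rho implied by DW = 0.4: {implied_rho(0.4):.2f}")
```

The approximation is not exact: DW and 2(1 − ρ̂) differ by a small end-point term involving the first and last residuals, which becomes negligible as the sample grows.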
Question 3 True / False
Serial correlation in regression errors typically causes OLS standard errors to understate true uncertainty, leading to inflated t-statistics.
T. True
F. False
Answer: True
Positive serial correlation means consecutive errors carry similar signs — the observations are not as informationally independent as OLS assumes. OLS standard errors are derived under the assumption that each observation adds independent information, so they systematically underestimate the true variance of the coefficient estimator when observations are actually correlated. The result is t-statistics that are too large, p-values too small, and confidence intervals too narrow. This is why time-series regressions reported with default OLS standard errors and no robustness correction should be viewed skeptically.
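One way to put a number on "not as informationally independent" is the effective sample size. The sketch below uses the large-sample approximation for the sample mean of a stationary AR(1) series, n_eff ≈ n(1 − ρ)/(1 + ρ). This is an assumption chosen for illustration: for a regression slope the adjustment also depends on how persistent the regressor is, so treat the numbers as a rough guide rather than a general rule.

```python
def effective_sample_size(n, rho):
    """Approximate effective sample size for the mean of a stationary AR(1) series."""
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    return n * (1 - rho) / (1 + rho)


def se_inflation(rho):
    """Approximate factor by which the i.i.d. standard error of the mean is too small."""
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    return ((1 + rho) / (1 - rho)) ** 0.5


n, rho = 100, 0.8
print(f"nominal n = {n}, effective n = {effective_sample_size(n, rho):.1f}")
print(f"standard error understated by a factor of about {se_inflation(rho):.1f}")
```

With ρ = 0.8, 100 nominal observations carry roughly the information of 11 independent ones, and the naive standard error of the mean is too small by a factor of about 3.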
Question 4 True / False
When serial correlation is detected in regression residuals, the OLS coefficient estimates are biased and should be recalculated using GLS.
T. True
F. False
Answer: False
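This is the central misconception about serial correlation. Unlike omitted variable bias or endogeneity, serial correlation in the errors does NOT bias OLS coefficient estimates when the regressors are exogenous: they remain unbiased and consistent. (The main exception is a model that includes a lagged dependent variable as a regressor, where serially correlated errors do make OLS inconsistent.) The problem lies in the standard errors, and therefore in inference. GLS will generally return somewhat different point estimates because it is a more efficient estimator, but its purpose is efficiency and valid standard errors, not the removal of bias. The recommendation to use GLS or Newey-West is about getting valid p-values and confidence intervals, not about fixing biased slopes. Practitioners who re-estimate slopes because they believe serial correlation has biased them are solving the wrong problem.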
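To make the role of GLS concrete, here is a minimal sketch of a single Cochrane-Orcutt step, one common feasible-GLS approach for AR(1) errors: estimate ρ from the OLS residuals, quasi-difference the data, and refit. The simulated data, the true slope of 2.0, and the helper names are all assumptions for this example; a real analysis would normally iterate and use an econometrics library.

```python
import random


def ar1(n, rho, rng):
    """Simulate a stationary AR(1) series with unit-variance innovations."""
    series = [rng.gauss(0, 1) / (1 - rho ** 2) ** 0.5]
    for _ in range(n - 1):
        series.append(rho * series[-1] + rng.gauss(0, 1))
    return series


def ols(x, y):
    """Return (intercept, slope, residuals) for a regression of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid


def cochrane_orcutt_step(x, y):
    """One feasible-GLS step: estimate rho, quasi-difference, refit by OLS."""
    _, ols_slope, e = ols(x, y)
    n = len(e)
    rho_hat = (sum(e[t] * e[t - 1] for t in range(1, n))
               / sum(e[t - 1] ** 2 for t in range(1, n)))
    # Quasi-differencing removes the AR(1) structure; the first observation is dropped.
    x_star = [x[t] - rho_hat * x[t - 1] for t in range(1, n)]
    y_star = [y[t] - rho_hat * y[t - 1] for t in range(1, n)]
    _, gls_slope, _ = ols(x_star, y_star)
    return ols_slope, gls_slope, rho_hat


# Hypothetical data: y = 1 + 2x + u, with AR(1) regressor and AR(1) errors.
rng = random.Random(2)
x = ar1(500, 0.8, rng)
u = ar1(500, 0.8, rng)
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

ols_slope, gls_slope, rho_hat = cochrane_orcutt_step(x, y)
print(f"OLS slope: {ols_slope:.3f}")
print(f"FGLS slope: {gls_slope:.3f}")
print(f"estimated rho: {rho_hat:.3f}")
```

Both slopes estimate the same true value of 2.0. They differ numerically because the two estimators weight the data differently, not because one of them is correcting a bias in the other.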
Question 5 Short Answer
Why does serial correlation in regression errors cause standard errors to be understated, even though the coefficient estimates themselves are unbiased?
Model answer: OLS standard errors are calculated under the assumption that each observation provides independent information about the regression relationship. Serial correlation means consecutive observations are not independent — they carry overlapping information because each error is partially predictable from the previous one. The effective sample size for estimating the uncertainty of the coefficients is smaller than the nominal sample size N. OLS doesn't know this and uses N as if all observations were independent, producing standard errors that are too small. The coefficient estimates are still correct in expectation — each observation still correctly identifies the average relationship — but the uncertainty around those estimates is underreported.
An analogy: asking 100 people for the time when 90 of them synchronized their watches gives you much less independent information than asking 100 fully independent people. OLS treats the 100 correlated observations as fully independent and reports a small standard error, when in fact you have the effective information of far fewer independent observations. Newey-West corrects for this by explicitly estimating the long-run variance, accounting for the autocorrelation structure.
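The long-run variance idea can be sketched for a simple regression slope. The code below is a bare-bones illustration, assuming a Bartlett kernel, no small-sample correction, and an arbitrary truncation lag of 8; the simulated data and the function name `slope_standard_errors` are hypothetical. In practice an econometrics library would be used instead.

```python
import random


def ar1(n, rho, rng):
    """Simulate a stationary AR(1) series with unit-variance innovations."""
    series = [rng.gauss(0, 1) / (1 - rho ** 2) ** 0.5]
    for _ in range(n - 1):
        series.append(rho * series[-1] + rng.gauss(0, 1))
    return series


def slope_standard_errors(x, y, max_lag):
    """Return (slope, conventional SE, Newey-West SE) for y on x with intercept."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    z = [xi - xbar for xi in x]  # demeaned regressor
    sxx = sum(zi * zi for zi in z)
    slope = sum(zi * (yi - ybar) for zi, yi in zip(z, y)) / sxx
    resid = [(yi - ybar) - slope * zi for zi, yi in zip(z, y)]

    # Conventional SE: assumes homoskedastic, uncorrelated errors.
    conventional = (sum(e * e for e in resid) / (n - 2) / sxx) ** 0.5

    # Newey-West: add weighted autocovariances of z_t * e_t up to max_lag.
    g = [zi * ei for zi, ei in zip(z, resid)]
    long_run = sum(gi * gi for gi in g)
    for lag in range(1, max_lag + 1):
        weight = 1 - lag / (max_lag + 1)  # Bartlett kernel weight
        long_run += 2 * weight * sum(g[t] * g[t - lag] for t in range(lag, n))
    newey_west = long_run ** 0.5 / sxx
    return slope, conventional, newey_west


# Hypothetical data: persistent regressor and AR(1) errors, true slope 1.0.
rng = random.Random(3)
x = ar1(500, 0.8, rng)
u = ar1(500, 0.8, rng)
y = [1.0 * xi + ui for xi, ui in zip(x, u)]

slope, conventional_se, nw_se = slope_standard_errors(x, y, max_lag=8)
print(f"slope: {slope:.3f}")
print(f"conventional SE: {conventional_se:.3f}")
print(f"Newey-West SE (8 lags): {nw_se:.3f}")
```

With positively autocorrelated errors and a persistent regressor, the Newey-West standard error should come out larger than the conventional one, while the slope estimate itself is unchanged. Setting `max_lag=0` drops the autocovariance terms and leaves a heteroskedasticity-robust standard error only.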