Questions: Standard Error Calculation and Correction Methods
5 questions to test your understanding
Question 1 Multiple Choice
You study whether minimum wage laws affect employment using annual data on all workers across 50 US states over 10 years. Minimum wage policy varies at the state level. Which standard error method is most appropriate?
A. Conventional OLS SEs: the large sample size makes them reliable
B. Robust (Huber-White) SEs: heteroskedasticity is likely across states of different sizes
C. Clustered SEs by state: workers within a state share the same policy treatment and correlated error shocks
D. No standard errors: full population data makes statistical inference unnecessary
Answer: C
The key variation in treatment (minimum wage policy) is at the state level, and all workers in the same state share identical policy exposure and common state-level shocks (economic conditions, industry mix). Treating each worker as an independent observation dramatically overstates the effective sample size, because workers in the same state carry redundant information about the policy. Clustered SEs by state account for this within-cluster correlation. Robust SEs address heteroskedasticity but not within-group correlation, so they are insufficient here.
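The gap between the three SE types can be sketched on simulated data. This is a minimal illustration using only the Python standard library, not a recommended workflow: the data-generating process (50 states, 200 workers each, a shared state shock, a true policy effect of zero) is an assumption chosen for the example, and the estimators are the plain textbook formulas for a one-regressor model without small-sample corrections.

```python
import math
import random

random.seed(0)

N_STATES = 50            # number of clusters (assumed for illustration)
WORKERS_PER_STATE = 200  # observations per cluster (assumed)

x, y, state = [], [], []
for s in range(N_STATES):
    policy = random.gauss(0.0, 1.0)       # treatment varies only at the state level
    state_shock = random.gauss(0.0, 1.0)  # error component shared within the state
    for _ in range(WORKERS_PER_STATE):
        x.append(policy)
        y.append(state_shock + random.gauss(0.0, 1.0))  # true policy effect is zero
        state.append(s)

# OLS slope of y on x with an intercept
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
xd = [xi - x_bar for xi in x]
sxx = sum(d * d for d in xd)
slope = sum(d * (yi - y_bar) for d, yi in zip(xd, y)) / sxx
resid = [(yi - y_bar) - slope * d for d, yi in zip(xd, y)]

# Conventional SE: one common error variance for every observation
se_conv = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

# Robust (HC0) SE: each observation contributes its own squared residual
se_robust = math.sqrt(sum((d * e) ** 2 for d, e in zip(xd, resid))) / sxx

# Cluster-robust SE: sum x-deviation * residual within each state, then square
state_sums = [0.0] * N_STATES
for d, e, s in zip(xd, resid, state):
    state_sums[s] += d * e
se_cluster = math.sqrt(sum(g * g for g in state_sums)) / sxx

print(f"conventional: {se_conv:.4f}")
print(f"robust (HC0): {se_robust:.4f}")
print(f"clustered:    {se_cluster:.4f}")
```

Under these assumptions the conventional and robust SEs should land close to each other, while the clustered SE should be several times larger, since the 10,000 worker rows carry roughly 50 states' worth of independent information about the policy.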
Question 2 Multiple Choice
A researcher computes both conventional OLS SEs and robust (Huber-White) SEs for the same regression. The robust SEs are noticeably larger. What does this signal?
A. The model is misspecified and needs to be re-estimated with different controls
B. The data exhibits heteroskedasticity (error variance varies across observations), making conventional SEs underestimate true uncertainty
C. Robust SEs are always larger than conventional SEs by construction, so this result is uninformative
D. The sample size is too small for OLS assumptions to hold
Answer: B
Robust SEs that noticeably exceed conventional SEs signal that error variance is not constant across observations (heteroskedasticity). Conventional SEs assume a single σ² and understate uncertainty when the observations with the most influence on the estimate also have the largest error variance. The robust sandwich estimator instead lets each observation contribute its own squared residual. Option C is wrong: under homoskedasticity, robust and conventional SEs are close to each other in large samples, and robust SEs can even be smaller, so larger robust SEs are informative evidence that the constant-variance assumption is violated.
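A small simulation can make the comparison concrete. The sketch below assumes a one-regressor model and two invented error processes: one with constant variance and one whose standard deviation grows with |x|. The helper `slope_ses` is a hypothetical name introduced here, and HC0 is used as the simplest Huber-White variant.

```python
import math
import random


def slope_ses(x, y):
    """Return (conventional SE, HC0 robust SE) for the OLS slope of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    xd = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in xd)
    slope = sum(d * (yi - y_bar) for d, yi in zip(xd, y)) / sxx
    resid = [(yi - y_bar) - slope * d for d, yi in zip(xd, y)]
    # Conventional: a single estimated error variance for all observations
    se_conv = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # HC0 sandwich: each squared residual is weighted by its own x-deviation
    se_hc0 = math.sqrt(sum((d * e) ** 2 for d, e in zip(xd, resid))) / sxx
    return se_conv, se_hc0


random.seed(1)
n = 5000
x = [random.gauss(0.0, 1.0) for _ in range(n)]

# Homoskedastic errors: constant standard deviation of 1
y_homo = [1.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in x]
# Heteroskedastic errors: standard deviation proportional to |x| (assumed form)
y_hetero = [1.0 + 2.0 * xi + abs(xi) * random.gauss(0.0, 1.0) for xi in x]

conv_homo, hc0_homo = slope_ses(x, y_homo)
conv_het, hc0_het = slope_ses(x, y_hetero)

ratio_homo = hc0_homo / conv_homo
ratio_hetero = hc0_het / conv_het
print(f"robust / conventional, homoskedastic errors:   {ratio_homo:.2f}")
print(f"robust / conventional, heteroskedastic errors: {ratio_hetero:.2f}")
```

With constant error variance the ratio should sit near 1. When the error spread rises with |x|, the ratio should be clearly above 1, which is the pattern the question describes.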
Question 3 True / False
Using conventional OLS standard errors when error terms within groups are correlated can produce false positives — making a coefficient appear statistically significant when the true effect is zero.
T. True
F. False
Answer: True
Within-cluster correlation means many observations contain the same information — they are not truly independent. Conventional SEs treat all observations as independent, overstating effective sample size and producing artificially small SEs and inflated t-statistics. A t-statistic of 2.5 with conventional SEs might drop to 0.9 with clustered SEs, flipping the conclusion entirely. This is why SE choice is a validity issue, not a cosmetic one.
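The false-positive claim can be checked with a rough Monte Carlo sketch. Everything about the setup is an assumption for illustration: 40 clusters of 15 observations, a regressor that varies only across clusters, a shared cluster shock, a true effect of exactly zero, and a 1.96 critical value without any small-sample adjustment. The function name `simulate_t_stats` is hypothetical.

```python
import math
import random


def simulate_t_stats(n_clusters, per_cluster, rng):
    """Simulate one dataset with zero true effect; return (t_conventional, t_clustered)."""
    x, y, cluster = [], [], []
    for g in range(n_clusters):
        treatment = rng.gauss(0.0, 1.0)  # varies only across clusters
        shock = rng.gauss(0.0, 1.0)      # shared by all observations in the cluster
        for _ in range(per_cluster):
            x.append(treatment)
            y.append(shock + rng.gauss(0.0, 1.0))  # y does not depend on x
            cluster.append(g)

    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    xd = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in xd)
    slope = sum(d * (yi - y_bar) for d, yi in zip(xd, y)) / sxx
    resid = [(yi - y_bar) - slope * d for d, yi in zip(xd, y)]

    # Conventional SE treats all n observations as independent
    se_conv = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

    # Cluster-robust SE sums x-deviation * residual within each cluster first
    sums = [0.0] * n_clusters
    for d, e, g in zip(xd, resid, cluster):
        sums[g] += d * e
    se_cluster = math.sqrt(sum(s * s for s in sums)) / sxx

    return slope / se_conv, slope / se_cluster


rng = random.Random(2)
reps = 300
reject_conv = 0
reject_cluster = 0
for _ in range(reps):
    t_conv, t_cluster = simulate_t_stats(40, 15, rng)
    reject_conv += abs(t_conv) > 1.96
    reject_cluster += abs(t_cluster) > 1.96

rate_conv = reject_conv / reps
rate_cluster = reject_cluster / reps
print(f"rejection rate at nominal 5%, conventional SEs: {rate_conv:.2f}")
print(f"rejection rate at nominal 5%, clustered SEs:    {rate_cluster:.2f}")
```

Because the true effect is zero, every rejection is a false positive. Under these assumptions the conventional SEs should reject far more often than the nominal 5%, while the clustered SEs should stay much closer to it.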
Question 4 True / False
Clustered standard errors are typically larger than robust (Huber-White) standard errors for the same regression.
T. True
F. False
Answer: False
The statement does not hold as a general rule. Clustered SEs exceed robust SEs when errors are positively and substantially correlated within clusters, because the effective amount of independent information is then closer to the number of clusters than to the number of observations. But if errors are nearly independent within clusters, meaning observations in the same cluster are not actually similar, clustered SEs can be similar to or even smaller than robust SEs. The relationship depends on the actual correlation structure in the data, not on a universal rule.
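The dependence on the correlation structure can be illustrated by computing the ratio of clustered to robust SEs under two assumed error processes: one with no shared cluster shock and one with a strong shared shock. The function `cluster_to_robust_ratio` and its parameters are hypothetical, and the cluster counts are arbitrary choices for the sketch.

```python
import math
import random


def cluster_to_robust_ratio(shock_sd, rng, n_clusters=50, per_cluster=100):
    """Return clustered SE / HC0 robust SE for one simulated dataset."""
    x, y, cluster = [], [], []
    for g in range(n_clusters):
        x_g = rng.gauss(0.0, 1.0)                # regressor varies by cluster
        shock = shock_sd * rng.gauss(0.0, 1.0)   # shared error; zero if shock_sd == 0
        for _ in range(per_cluster):
            x.append(x_g)
            y.append(shock + rng.gauss(0.0, 1.0))
            cluster.append(g)

    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    xd = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in xd)
    slope = sum(d * (yi - y_bar) for d, yi in zip(xd, y)) / sxx
    resid = [(yi - y_bar) - slope * d for d, yi in zip(xd, y)]

    # HC0 robust SE: observation-level squared terms
    se_hc0 = math.sqrt(sum((d * e) ** 2 for d, e in zip(xd, resid))) / sxx

    # Cluster-robust SE: cluster-level sums, then squared
    sums = [0.0] * n_clusters
    for d, e, g in zip(xd, resid, cluster):
        sums[g] += d * e
    se_cluster = math.sqrt(sum(s * s for s in sums)) / sxx

    return se_cluster / se_hc0


rng = random.Random(3)
ratio_independent = cluster_to_robust_ratio(0.0, rng)  # no within-cluster correlation
ratio_correlated = cluster_to_robust_ratio(1.0, rng)   # strong shared shock
print(f"clustered / robust, independent errors: {ratio_independent:.2f}")
print(f"clustered / robust, correlated errors:  {ratio_correlated:.2f}")
```

With independent errors the ratio should hover around 1 and can fall on either side of it from sample to sample. With a shared shock it should be well above 1.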
Question 5 Short Answer
Explain why choosing the wrong standard error method (e.g., conventional SEs when clustering is needed) is a validity problem rather than just a technical imprecision.
Model answer: Standard errors feed directly into t-statistics and hypothesis tests. Using the wrong SE can change a t-statistic by a factor of two or more, turning a 'statistically significant' result into a null finding or vice versa. This means the choice of SE method determines what claims can be made about whether a variable has a true effect — it is a question of truth, not precision. Published findings built on incorrect SEs may be entirely spurious.
The text makes this explicit: picking the wrong SE type 'can change t-statistics by factors of two or more, turning apparent significance into noise.' If a coefficient has t = 2.4 with conventional SEs but t = 0.8 with clustered SEs, the conclusion changes from 'reject the null at 5%' to 'cannot reject.' This is not a small correction to confidence interval width — it is the difference between a positive finding and a null result. Treating SE choice as merely technical obscures a substantive decision about uncertainty representation.
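The arithmetic behind the t = 2.4 versus t = 0.8 example can be written out directly. The coefficient and the two SE values below are hypothetical numbers chosen only to reproduce those t-statistics, and the p-values use a standard normal approximation rather than a t distribution.

```python
import math


def two_sided_p(t):
    """Two-sided p-value for a t-statistic under a standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))


coef = 0.12             # hypothetical coefficient estimate
se_conventional = 0.05  # hypothetical conventional SE
se_clustered = 0.15     # hypothetical clustered SE, three times larger

t_conventional = coef / se_conventional  # about 2.4
t_clustered = coef / se_clustered        # about 0.8

p_conventional = two_sided_p(t_conventional)
p_clustered = two_sided_p(t_clustered)

print(f"conventional: t = {t_conventional:.2f}, p = {p_conventional:.3f}")
print(f"clustered:    t = {t_clustered:.2f}, p = {p_clustered:.3f}")
print("reject at 5% with conventional SEs:", p_conventional < 0.05)
print("reject at 5% with clustered SEs:   ", p_clustered < 0.05)
```

The coefficient is identical in both rows. Only the SE changes, and that alone moves the test from rejecting the null at 5% to failing to reject it.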