Questions: Confidence Intervals and Hypothesis Tests in Regression
5 questions to test your understanding
Question 1 Multiple Choice
After running a regression, a researcher reports: 'There is a 95% probability that the true coefficient β lies between 0.3 and 0.8.' What is wrong with this statement?
A. Nothing; this is the standard frequentist interpretation of a 95% confidence interval
B. The interval should use z-critical values, not t-critical values, for a 95% interval
C. The true coefficient β is a fixed unknown constant, not a random variable. The interval is what varies across samples. The correct statement is that if this procedure were repeated many times, 95% of such intervals would contain the true β
D. The researcher should report a prediction interval, not a confidence interval, for inference about β
This is the most common misinterpretation of confidence intervals. In frequentist statistics, the true parameter β is a fixed (unknown) constant — it either is or isn't in any particular interval, with probability 1 or 0, not 0.95. What has a 95% probability is the *procedure*: if you repeatedly drew samples from the population and computed intervals this way, 95% of those intervals would contain the true β. Any given computed interval either covers β or it doesn't. The probabilistic statement applies to the random interval before data are collected, not to the fixed interval after you have computed it. Option A is the misconception; Option C is the correction.
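The repeated-sampling reading can be checked with a small simulation. The sketch below is illustrative only: the data-generating process (true slope 0.5, standard normal regressor and errors), the sample size, and the helper names `slope_and_se` and `coverage` are all assumptions, and it uses the large-sample normal critical value rather than a t critical value.

```python
import random
from statistics import NormalDist

def slope_and_se(x, y):
    """OLS slope and its classical standard error for y = a + b*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return b, se

def coverage(true_beta=0.5, n=100, reps=2000, seed=0):
    """Share of simulated 95% intervals that contain the fixed true slope."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(0.975)  # large-sample critical value, about 1.96
    hits = 0
    for _ in range(reps):
        # A fresh sample each time: the interval moves, true_beta does not.
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [1.0 + true_beta * xi + rng.gauss(0, 1) for xi in x]
        b, se = slope_and_se(x, y)
        if b - crit * se <= true_beta <= b + crit * se:
            hits += 1
    return hits / reps

share = coverage()
print(f"Share of intervals covering the true beta: {share:.3f}")
```

The printed share should be close to 0.95. Each individual interval in the loop either contains 0.5 or does not; the 95% figure describes the long-run behavior of the procedure.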
Question 2 Multiple Choice
A 95% confidence interval for a slope coefficient β_j is computed as [0.15, 0.85]. A colleague runs a two-sided t-test of H₀: β_j = 0 at the 5% significance level. What should they find?
A. The t-test cannot be determined from the confidence interval alone; they test different things
B. The t-test fails to reject H₀, because the interval is wide and therefore uncertain
C. The t-test rejects H₀ at 5%, because 0 falls outside the 95% confidence interval; the interval and the test are perfectly dual
D. The t-test result depends on the sample size, which the confidence interval does not convey
The confidence interval and t-test are mathematically dual: the 95% CI contains exactly the values of c for which a two-sided t-test of H₀: β_j = c would fail to reject at the 5% level. Since 0 is outside [0.15, 0.85], the t-test of H₀: β_j = 0 rejects at the 5% level. Equivalently, if 0 were inside the interval, the test would fail to reject. This duality means you can read hypothesis test results from confidence intervals: the interval communicates both the significance decision and the magnitude of the effect, making it more informative than the t-statistic alone.
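The duality can be verified numerically from the interval alone. This is a minimal sketch under stated assumptions: the interval is symmetric around β̂_j, the large-sample normal critical value stands in for the t critical value, and the helper names `implied_estimate_and_se` and `rejects` are hypothetical.

```python
from statistics import NormalDist

def implied_estimate_and_se(lower, upper, crit):
    """Back out the point estimate and standard error from a symmetric interval."""
    estimate = (lower + upper) / 2
    se = (upper - lower) / (2 * crit)
    return estimate, se

def rejects(estimate, se, null_value, crit):
    """Two-sided test: reject when |t| exceeds the critical value."""
    return abs((estimate - null_value) / se) > crit

crit = NormalDist().inv_cdf(0.975)  # assumption: large-sample critical value
lower, upper = 0.15, 0.85
b_hat, se = implied_estimate_and_se(lower, upper, crit)

# For each candidate null value c, compare "c inside the CI" with the test decision.
for c in (0.0, 0.10, 0.20, 0.50, 0.80, 0.90):
    inside = lower <= c <= upper
    print(f"H0: beta = {c:.2f}  inside CI: {inside!s:5}  reject: {rejects(b_hat, se, c, crit)}")
```

Every null value inside the interval is not rejected and every value outside is rejected. The decision does not depend on which critical value built the interval, as long as the test uses the same one: the implied t-statistic for H₀: β_j = 0 is the critical value times 0.5 / 0.35, which always exceeds the critical value.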
Question 3 True / False
In large samples, a 95% confidence interval constructed using classical (homoskedasticity-only) OLS standard errors covers the true β in 95% of repeated samples, regardless of whether errors are heteroskedastic.
T. True
F. False
Answer: False
When errors are heteroskedastic, the classical OLS standard errors are inconsistent: they do not converge to the true sampling uncertainty, and in many applications they understate it. When they do, the computed confidence intervals are too narrow and cover the true β less than 95% of the time, and tests over-reject: a test with a nominal 5% size rejects a true null more than 5% of the time. (The bias can also run the other way, producing intervals that are too wide.) Correct large-sample coverage requires heteroskedasticity-robust (HC) standard errors. The OLS coefficient estimator β̂ is still unbiased and consistent under heteroskedasticity; it is the standard errors, and therefore the intervals, that break down.
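A simulation makes the coverage failure concrete. The sketch below assumes a particular design in which the error standard deviation equals |x|, chosen because it makes the classical standard error too small; other designs can push the bias the other way. The helper names `slope_with_both_se` and `coverage_rates` are hypothetical, HC0 is used as the robust estimator, and the normal critical value is a large-sample approximation.

```python
import random
from statistics import NormalDist

def slope_with_both_se(x, y):
    """OLS slope with classical and HC0 (White) standard errors, simple regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    b = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Classical: one common error variance for all observations.
    se_classical = (sum(e * e for e in resid) / (n - 2) / sxx) ** 0.5
    # HC0: each squared residual is weighted by its own squared x deviation.
    se_hc0 = sum((d * e) ** 2 for d, e in zip(dx, resid)) ** 0.5 / sxx
    return b, se_classical, se_hc0

def coverage_rates(true_beta=0.5, n=200, reps=1000, seed=1):
    """Coverage of nominal 95% intervals built from each standard error."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(0.975)  # large-sample critical value
    hits_classical = 0
    hits_robust = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        # Assumed design: the error standard deviation equals |x|.
        y = [1.0 + true_beta * xi + rng.gauss(0, abs(xi)) for xi in x]
        b, se_c, se_r = slope_with_both_se(x, y)
        hits_classical += abs(b - true_beta) <= crit * se_c
        hits_robust += abs(b - true_beta) <= crit * se_r
    return hits_classical / reps, hits_robust / reps

cov_classical, cov_robust = coverage_rates()
print(f"Classical SE coverage: {cov_classical:.3f}")
print(f"HC0 robust SE coverage: {cov_robust:.3f}")
```

Under this design the classical intervals should cover the true slope well below 95% of the time, while the HC0 intervals should come much closer to the nominal rate. The point estimate `b` is identical in both cases; only the interval width differs.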
Question 4 True / False
Because the confidence interval and the hypothesis test are mathematically equivalent, reporting a p-value conveys strictly less information than reporting the confidence interval.
T. True
F. False
Answer: True
A p-value measures the evidence against one specific null hypothesis (usually H₀: β = 0); comparing it with the significance level gives a binary reject/fail-to-reject decision. The confidence interval communicates the same decision (does 0 fall outside it?) but also shows the range of plausible values for β and the precision of the estimate. Knowing β̂ = 0.5 with CI [0.48, 0.52] versus CI [0.1, 0.9] conveys very different information about the reliability and practical importance of the effect, even though both reject H₀: β = 0 at the 5% level. This is why journals increasingly require confidence intervals alongside p-values: the p-value alone discards information about effect size and uncertainty.
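The asymmetry can be illustrated by recovering a p-value from each of the two intervals above. This is a rough sketch: it assumes the intervals are symmetric 95% intervals and uses a normal approximation in place of the t distribution, and the function name `p_value_from_ci` is hypothetical.

```python
from statistics import NormalDist

def p_value_from_ci(lower, upper, null_value=0.0, level=0.95):
    """Two-sided p-value implied by a symmetric interval (normal approximation)."""
    z = NormalDist()
    crit = z.inv_cdf(0.5 + level / 2)
    estimate = (lower + upper) / 2          # midpoint is the point estimate
    se = (upper - lower) / (2 * crit)       # half-width divided by critical value
    t_stat = (estimate - null_value) / se
    return 2 * (1 - z.cdf(abs(t_stat)))

for lower, upper in [(0.48, 0.52), (0.10, 0.90)]:
    p = p_value_from_ci(lower, upper)
    print(f"CI [{lower:.2f}, {upper:.2f}]  width {upper - lower:.2f}  "
          f"p-value for H0: beta = 0: {p:.4g}")
```

Both implied p-values fall below 0.05, so the test decision is the same, yet the interval widths differ by a factor of twenty. Going from the interval to the p-value is possible under these assumptions; going from a single p-value back to the estimate and its precision is not.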
Question 5 Short Answer
Why does the presence of heteroskedasticity invalidate standard OLS confidence intervals, and what is the appropriate remedy?
Think about your answer before reading the model answer.
Model answer: OLS confidence intervals use the formula β̂ ± t* · se(β̂), where se(β̂) is the estimated standard error. The classical OLS standard error formula assumes homoskedastic errors (constant variance). When errors are heteroskedastic (variance differs across observations), the classical formula produces inconsistent estimates of the true sampling variance — typically underestimates — so the standard errors are too small, the intervals too narrow, and they fail to cover the true β at the claimed rate. The remedy is to use heteroskedasticity-robust (White or HC) standard errors, which estimate the true sampling variance consistently under heteroskedasticity. The OLS coefficient estimates β̂ are unchanged; only the standard errors used to construct intervals and test statistics change.
The key insight is that β̂ itself is not affected by heteroskedasticity (it remains unbiased and consistent), but inference (confidence intervals, t-statistics, p-values) depends entirely on having correct standard errors. Getting standard errors right is not a refinement; it determines whether your statistical conclusions are reliable. Cluster-robust standard errors extend this logic to errors that are correlated within groups, including serial correlation within a unit observed over time.
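For reference, the two variance estimators can be written side by side. The formulas below are a sketch for the simple regression case y_i = α + β x_i + e_i, which is an assumption made here for readability; HC0 is one of several robust variants, and ê_i denotes the OLS residual.

```latex
% Simple regression y_i = \alpha + \beta x_i + e_i, with OLS residuals \hat{e}_i
\[
\widehat{\mathrm{Var}}_{\text{classical}}(\hat{\beta})
  = \frac{\hat{\sigma}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{\sigma}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat{e}_i^{\,2}
\]
\[
\widehat{\mathrm{Var}}_{\text{HC0}}(\hat{\beta})
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \hat{e}_i^{\,2}}
         {\left( \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)^{2}}
\]
```

The classical formula pools all squared residuals into a single variance estimate, while HC0 weights each squared residual by that observation's squared deviation in x. The two agree in large samples only when the error variance is unrelated to x. In both cases the interval is β̂ ± t* · se(β̂) with se(β̂) the square root of the chosen variance estimate.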