A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Hypothesis Tests and Inference in Regression

College · Depth 107 in the knowledge graph

553 prerequisites beneath it

Core Idea

Test H₀: β₁ = 0 using T = (β̂₁ − 0)/SE(β̂₁) with n − 2 df. Confidence interval for β₁: β̂₁ ± t_{n−2, α/2} · SE(β̂₁). The F-test assesses the overall model. Prediction intervals widen with distance from X̄ and with increased residual variation.

Explainer

From simple linear regression you know how to compute β̂₁, the OLS estimate of the slope, from a sample of n observations. But β̂₁ is a statistic, not a parameter: every new sample would give a slightly different slope. The central insight of regression inference is that β̂₁ has its own sampling distribution. Under the standard assumptions (linearity, constant variance, independent and normally distributed errors), β̂₁ is normally distributed with mean equal to the true population slope β₁ and standard deviation σ / √(Σ(xᵢ − x̄)²), where σ is the standard deviation of the errors. Because σ is unknown, we estimate it with the residual standard error s = √(SSE / (n − 2)), which gives the standard error SE(β̂₁) = s / √(Σ(xᵢ − x̄)²). When the errors are not normal, the normal approximation typically still holds in large samples. We cannot observe β₁ directly, but we can reason probabilistically about where it lies.
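The idea of a sampling distribution can be checked by simulation. The sketch below is illustrative only: the population values (`BETA0`, `BETA1`, `SIGMA`), the fixed design `x`, and the number of repetitions are assumptions chosen for the example, not values from this topic. It draws many samples from the same population, refits the slope each time, and compares the spread of the fitted slopes with σ / √(Σ(xᵢ − x̄)²).

```python
import math
import random
import statistics

def ols_slope(x, y):
    """OLS slope: sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical population: y = 2 + 0.5 x + error, error ~ Normal(0, SIGMA^2).
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0
x = [float(i) for i in range(1, 11)]   # fixed design, n = 10
rng = random.Random(0)                 # seeded so the run is reproducible

# Draw 5000 samples from the same population and refit the slope each time.
slopes = []
for _ in range(5000):
    y = [BETA0 + BETA1 * xi + rng.gauss(0.0, SIGMA) for xi in x]
    slopes.append(ols_slope(x, y))

x_bar = sum(x) / len(x)
sxx = sum((xi - x_bar) ** 2 for xi in x)
theoretical_sd = SIGMA / math.sqrt(sxx)   # sd of the slope when sigma is known

print(f"mean of simulated slopes: {statistics.mean(slopes):.4f} (true slope {BETA1})")
print(f"sd of simulated slopes:   {statistics.stdev(slopes):.4f} (theory {theoretical_sd:.4f})")
```

In a real analysis only one sample is available, so σ is replaced by the residual standard error s computed from that sample.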

The most common question is whether the predictor matters at all — does X have a linear relationship with Y in the population? This is formalized as H₀: β₁ = 0. The t-statistic T = β̂₁ / SE(β̂₁) measures how many standard errors the estimate is from zero. Under H₀, T follows a t-distribution with n − 2 degrees of freedom (we lose two for estimating β₀ and β₁). A large |T| means the slope is far from zero relative to its sampling uncertainty, giving evidence against H₀. The p-value is the probability of observing a t-statistic at least as extreme, assuming H₀ is true. If p < α, we reject H₀ and conclude that X is a statistically significant linear predictor of Y.
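The test of H₀: β₁ = 0 can be sketched in plain Python. The data are made up for illustration, and the helper names (`t_pdf`, `two_sided_p`, `slope_t_test`) are hypothetical. The p-value is obtained here by numerically integrating the t density with Simpson's rule, which is an assumption made to keep the example dependency-free; in practice a statistics library (for example SciPy's `scipy.stats.t`) would normally be used instead.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))

def two_sided_p(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|) under H0, via Simpson's rule (steps must be even)."""
    b = abs(t_stat)
    if b == 0.0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_b = total * h / 3          # integral of the density over [0, b]
    return max(0.0, 1.0 - 2.0 * area_0_to_b)

def slope_t_test(x, y):
    """Return (b1, se_b1, t_stat, df, p_value) for H0: beta1 = 0."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))         # residual standard error
    se_b1 = s / math.sqrt(sxx)
    t_stat = b1 / se_b1
    df = n - 2
    return b1, se_b1, t_stat, df, two_sided_p(t_stat, df)

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, se_b1, t_stat, df, p_value = slope_t_test(x, y)
print(f"b1 = {b1:.3f}, SE = {se_b1:.4f}, T = {t_stat:.2f}, df = {df}, p = {p_value:.2g}")
```

With α = 0.05, the decision rule is simply `p_value < 0.05`.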

A 100(1 − α)% confidence interval for β₁, β̂₁ ± t_{n−2, α/2} · SE(β̂₁), inverts the same logic. Rather than asking whether one hypothesized value is plausible, the interval reports every value of β₁ that a two-sided test would not reject at level α. An interval that excludes zero is therefore equivalent to rejecting H₀: β₁ = 0 at that level. The F-test for the overall model generalizes to multiple predictors: it tests whether all slopes are simultaneously zero. In simple regression the F-statistic equals T², so the F-test and the two-sided t-test are equivalent and give identical p-values.
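Both claims, the interval formula and F = T², can be verified numerically. The sketch below uses made-up data with n = 5, and the critical value `T_CRIT = 3.182` is taken from a standard t table for 3 degrees of freedom at the 95% level; it is hard-coded as an assumption of the example rather than computed.

```python
import math

# Made-up illustrative data (n = 5, so df = n - 2 = 3).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
T_CRIT = 3.182   # t_{3, 0.025} from a standard t table (95% two-sided)

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual sum of squares
ssr = sum((fi - y_bar) ** 2 for fi in fitted)            # regression sum of squares
mse = sse / (n - 2)

# 95% confidence interval for the slope.
se_b1 = math.sqrt(mse / sxx)
ci_low, ci_high = b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1

# t-statistic for H0: beta1 = 0 and the overall F-statistic.
t_stat = b1 / se_b1
f_stat = (ssr / 1) / mse   # 1 numerator df in simple regression

print(f"95% CI for slope: ({ci_low:.3f}, {ci_high:.3f})")
print(f"T^2 = {t_stat ** 2:.3f}, F = {f_stat:.3f}")
```

The equality follows from SSR = β̂₁² · Σ(xᵢ − x̄)², so F = β̂₁² · Σ(xᵢ − x̄)² / s² = (β̂₁ / SE(β̂₁))².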

Prediction intervals address a different question: where will a *new individual observation* fall, given a particular X value? A confidence interval for the mean response captures only the uncertainty about the population mean at X = x*. A prediction interval must also account for residual variation, the irreducible scatter of individual points around the true line. As a result, at the same x* and confidence level, a prediction interval is always wider than the confidence interval for the mean. Both intervals are narrowest at X = X̄ and widen as X moves away from the mean: the fitted line always passes through the data centroid (X̄, Ȳ), so an error in the estimated slope shifts the line more the farther x* is from X̄, which is why extrapolation is especially uncertain. The more residual variation in the data (larger s), the wider both intervals become.
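The two intervals differ only in their standard errors. For the mean response at x*, the standard error is s · √(1/n + (x* − x̄)² / Σ(xᵢ − x̄)²); for a new observation it is s · √(1 + 1/n + (x* − x̄)² / Σ(xᵢ − x̄)²). The sketch below compares the resulting half-widths at a few x* values. The data are made up, the function name `interval_half_widths` is hypothetical, and the critical value 3.182 (t table, 3 df, 95%) is hard-coded as an assumption.

```python
import math

# Made-up illustrative data (n = 5, so df = 3).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
T_CRIT = 3.182   # t_{3, 0.025} from a standard t table (95% two-sided)

def interval_half_widths(x, y, x_star, t_crit):
    """Return (mean-response CI half-width, prediction interval half-width)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))                        # residual standard error
    distance_term = 1 / n + (x_star - x_bar) ** 2 / sxx
    mean_hw = t_crit * s * math.sqrt(distance_term)     # uncertainty in the line only
    pred_hw = t_crit * s * math.sqrt(1 + distance_term) # plus scatter of a new point
    return mean_hw, pred_hw

# x* = 3 is the sample mean of x; x* = 8 lies outside the observed range.
for x_star in (3, 5, 8):
    mean_hw, pred_hw = interval_half_widths(x, y, x_star, T_CRIT)
    print(f"x* = {x_star}: mean CI ±{mean_hw:.3f}, prediction ±{pred_hw:.3f}")
```

The extra 1 under the square root is the residual variance of the new observation, which is why the prediction interval does not shrink to zero width even with a very large sample.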

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Independence of Events → Sampling Distributions → Standard Error of Estimators → Hypothesis Testing: Framework and Logic → P-values and Statistical Significance → Effect Size and Practical Significance → Hypothesis Testing: Framework and Logic → Z-Tests and T-Tests for Means → One-Sample Z-Test for Means → One-Sample and Two-Sample T-Tests → Inference in Linear Regression → Prediction Intervals in Regression → Linear Regression Basics → Simple Linear Regression: Theory and Estimation → Hypothesis Tests and Inference in Regression

Longest path: 108 steps · 553 total prerequisite topics

Prerequisites (1)

Simple Linear Regression: Theory and Estimation (hard)

Leads To (0)

No topics depend on this one yet.