
A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Prediction Intervals in Regression

College · Depth 104 in the knowledge graph

196 topics build on this

547 prerequisites beneath it



Core Idea

A prediction interval estimates where a new individual observation will fall; a confidence interval estimates the mean response. Prediction intervals are wider because they include both uncertainty in estimating the mean and natural variation around the mean.

Explainer

From your work with inference in linear regression, you know that the fitted line ŷ = β̂₀ + β̂₁x is itself uncertain — it's estimated from data, so it wobbles depending on which sample you draw. A confidence interval for the mean response captures exactly this uncertainty: at a given x value, where might the true population mean μ_Y|x lie? That interval shrinks as sample size grows, because with more data the estimated line stabilizes around the truth.

A prediction interval asks a different and harder question: where will the *next single observation* at that x value land? Even if you knew the regression line perfectly — even with infinite data — individual observations would still scatter around it. That scatter is the irreducible noise term ε, with variance σ². A prediction interval must account for *both* sources of uncertainty: the estimation uncertainty in the mean (which goes to zero as n → ∞) and the irreducible observation-to-observation variance (which does not).
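The two sources of uncertainty can be tabulated directly. The following is a minimal sketch, assuming simple linear regression, evenly spaced x values on [0, 10], and a known noise standard deviation `sigma` (in practice it is estimated by s). It uses the standard simple-regression expression h = 1/n + (x* − x̄)²/Sxx for the estimation term. All numbers and names are illustrative.

```python
import numpy as np

sigma = 2.0    # assumed true noise standard deviation (illustrative)
x_star = 7.0   # assumed x value at which we predict (illustrative)

def standard_errors(n, sigma=sigma, x_star=x_star):
    """Return (se_mean, se_pred) at x_star for n evenly spaced x values on [0, 10]."""
    x = np.linspace(0.0, 10.0, n)
    sxx = np.sum((x - x.mean()) ** 2)
    h = 1.0 / n + (x_star - x.mean()) ** 2 / sxx   # estimation term, shrinks with n
    se_mean = sigma * np.sqrt(h)                   # uncertainty in the mean only
    se_pred = sigma * np.sqrt(1.0 + h)             # mean uncertainty plus noise
    return se_mean, se_pred

sample_sizes = [10, 100, 1_000, 10_000]
results = [standard_errors(n) for n in sample_sizes]
for n, (se_mean, se_pred) in zip(sample_sizes, results):
    print(f"n={n:>6}  se_mean={se_mean:.4f}  se_pred={se_pred:.4f}")
```

In the printed table, `se_mean` keeps falling toward zero as n grows, while `se_pred` levels off near `sigma`.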

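Mathematically, the prediction interval at a given x* is ŷ* ± t* · SE_pred, where SE_pred² = s²(1 + h). Here s² is the residual variance estimate, t* is the critical value from the t distribution with the residual degrees of freedom (n − 2 in simple linear regression), h captures the leverage of x* (in simple linear regression, h = 1/n + (x* − x̄)²/Sxx, with Sxx = Σ(xᵢ − x̄)²), and the "1" term is the irreducible variance contribution. The "1 +" is the essential difference: the confidence interval for the mean uses SE_mean² = s² · h alone, without the leading 1. Because SE_pred > SE_mean always, prediction intervals are always wider, often substantially so. The gap is most striking in large samples, where h is close to zero and the confidence interval has nearly collapsed while the prediction interval has not.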
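As a worked illustration of these formulas, here is a minimal sketch that computes both 95% intervals by hand for simple linear regression. The dataset and the choice of x* are made up for the example, and the leverage term uses the simple-regression form h = 1/n + (x* − x̄)²/Sxx.

```python
import numpy as np
from scipy import stats

# Small made-up dataset, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
y = np.array([2.1, 2.9, 3.8, 5.2, 5.9, 7.1, 7.8, 9.2, 9.9, 11.1])
x_star = 6.5                                  # point at which we want intervals

n = len(x)
b1, b0 = np.polyfit(x, y, 1)                  # least-squares slope, then intercept
resid = y - (b0 + b1 * x)
s2 = np.sum(resid ** 2) / (n - 2)             # residual variance estimate s^2

sxx = np.sum((x - x.mean()) ** 2)
h = 1.0 / n + (x_star - x.mean()) ** 2 / sxx  # leverage of x_star

se_mean = np.sqrt(s2 * h)                     # confidence interval: s^2 * h
se_pred = np.sqrt(s2 * (1.0 + h))             # prediction interval: s^2 * (1 + h)

t_crit = stats.t.ppf(0.975, df=n - 2)         # two-sided 95% critical value
y_hat = b0 + b1 * x_star

ci = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
pi = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)

print(f"fitted value at x* = {x_star}: {y_hat:.2f}")
print(f"95% confidence interval for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% prediction interval for a new y:  ({pi[0]:.2f}, {pi[1]:.2f})")
```

Both intervals share the same center and the same critical value. Only the standard error differs, and the difference of the squared standard errors is exactly s².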

The practical lesson is to match the interval to the question. If you want to know the average height of all 40-year-old men in a population, use a confidence interval for the mean. If you want to know where one specific 40-year-old man's height will fall, use a prediction interval. Confusing them leads to either false precision (reporting a confidence interval when the question is about an individual, which understates the uncertainty) or needless pessimism (reporting a prediction interval when the question is about the mean, which overstates it). The confidence interval tells you about the center of a distribution; the prediction interval tells you about the distribution itself.
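The cost of using the wrong interval can be checked by simulation. The sketch below assumes a "true" linear model with normal noise (the coefficients, sample size, and seed are arbitrary choices for illustration), repeatedly fits a line, and counts how often each 95% interval contains one fresh observation at x*.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)            # fixed seed so the run is repeatable
beta0, beta1, sigma = 1.0, 0.5, 2.0       # assumed "true" model, illustration only
n, x_star, n_sims = 30, 5.0, 2000

x = np.linspace(0.0, 10.0, n)             # fixed design reused in every simulation
sxx = np.sum((x - x.mean()) ** 2)
h = 1.0 / n + (x_star - x.mean()) ** 2 / sxx
t_crit = stats.t.ppf(0.975, df=n - 2)

ci_hits = 0
pi_hits = 0
for _ in range(n_sims):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=n)
    b1, b0 = np.polyfit(x, y, 1)          # slope first, then intercept
    resid = y - (b0 + b1 * x)
    s = np.sqrt(np.sum(resid ** 2) / (n - 2))
    y_hat = b0 + b1 * x_star
    # One new individual observation at x_star, drawn from the true model.
    y_new = beta0 + beta1 * x_star + rng.normal(0.0, sigma)
    miss = abs(y_new - y_hat)
    ci_hits += int(miss <= t_crit * s * np.sqrt(h))
    pi_hits += int(miss <= t_crit * s * np.sqrt(1.0 + h))

ci_coverage = ci_hits / n_sims
pi_coverage = pi_hits / n_sims
print(f"confidence interval contains the new observation: {ci_coverage:.1%}")
print(f"prediction interval contains the new observation: {pi_coverage:.1%}")
```

Under these assumptions the prediction interval should contain the new observation close to 95% of the time, while the confidence interval, which was never designed for individuals, should contain it far less often.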

As x* moves away from x̄ (the center of your data), both interval types widen, because leverage h grows with the squared distance from x̄. But the prediction interval widens more slowly in relative terms because the "1" dominates when h is small. The ratio of the two widths is √((1 + h)/h). At the center of the data in simple linear regression, h = 1/n, so the prediction interval is √(n + 1) times as wide as the confidence interval (about 5 times for n = 25). Far into extrapolation territory, h becomes large, the ratio approaches 1, and both intervals balloon together. This is why extrapolation with a prediction interval makes caution concrete: you can literally see how much uncertainty you're projecting onto a single future observation.
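Because both intervals use the same t* and the same s, the ratio of their widths depends only on h. Here is a minimal sketch of how that ratio behaves as x* moves outward, assuming simple linear regression with 25 evenly spaced x values on [0, 10] (an illustrative design, not taken from any dataset) and the simple-regression leverage h = 1/n + (x* − x̄)²/Sxx.

```python
import numpy as np

x = np.linspace(0.0, 10.0, 25)   # assumed design: 25 evenly spaced x values
n = len(x)
x_bar = x.mean()
sxx = np.sum((x - x_bar) ** 2)

def width_ratio(x_star):
    """Prediction-interval width divided by confidence-interval width at x_star."""
    h = 1.0 / n + (x_star - x_bar) ** 2 / sxx
    return np.sqrt((1.0 + h) / h)

for x_star in [5.0, 10.0, 20.0, 50.0, 200.0]:
    print(f"x* = {x_star:>6.1f}   PI width / CI width = {width_ratio(x_star):.2f}")
```

At the center of the data the ratio equals √(n + 1). It falls toward 1 as x* moves far outside the observed range, where estimation uncertainty swamps the irreducible noise.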


Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Independence of Events → Sampling Distributions → Standard Error of Estimators → Hypothesis Testing: Framework and Logic → P-values and Statistical Significance → Effect Size and Practical Significance → Hypothesis Testing: Framework and Logic → Z-Tests and T-Tests for Means → One-Sample Z-Test for Means → One-Sample and Two-Sample T-Tests → Inference in Linear Regression → Prediction Intervals in Regression

Longest path: 105 steps · 547 total prerequisite topics

Prerequisites (2)

Inference in Linear Regression (hard) · Bayesian Statistics: Prior, Posterior, Credible Intervals (soft)

Leads To (1)

Linear Regression Basics (soft)