A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Linear Regression Basics

College · Depth 105 in the knowledge graph

195 topics build on this

551 prerequisites beneath it

Core Idea

Linear regression fits a line ŷ = a + bx to paired data (xᵢ, yᵢ) by minimizing the sum of squared residuals. The slope b is the change in predicted y per unit change in x; the intercept a is the predicted y when x = 0. The regression line passes through (x̄, ȳ) and its slope is b = r × (s_y / s_x). Linear regression enables prediction and quantifies linear relationships, though predictions outside the data range (extrapolation) are unreliable.

How It's Best Learned

Fit regression lines to scatterplots. Interpret slope in context. Use regression to make predictions and discuss uncertainty. Compare fitted values to observed values (residuals).

Common Misconceptions

Thinking regression implies causation. Using regression for severely nonlinear data. Extrapolating far beyond the data range with confidence. Confusing the fitted value ŷᵢ with the observed data point yᵢ.

Explainer

From the correlation coefficient, you know how to measure the *strength* and *direction* of a linear association between two variables. Linear regression goes one step further: it finds the specific line that best describes that association and uses it to make predictions. The method is called least squares because it chooses the line that minimizes the total squared vertical distance between each data point and the line.
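To make "minimizes the total squared vertical distance" concrete, here is a minimal Python sketch. The five data points are invented for illustration, and the helper name `sum_squared_residuals` is hypothetical. The values a = 2.2 and b = 0.6 are what the slope and intercept formulas later in this explainer give for this data; the other two lines are arbitrary alternatives for comparison.

```python
# Hypothetical paired data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def sum_squared_residuals(a, b, xs, ys):
    """Total squared vertical distance between the points and the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Least-squares line for this data (a = 2.2, b = 0.6) versus two other candidates.
sse_best = sum_squared_residuals(2.2, 0.6, xs, ys)
sse_steeper = sum_squared_residuals(1.0, 1.0, xs, ys)
sse_flat = sum_squared_residuals(4.0, 0.0, xs, ys)

print(sse_best, sse_steeper, sse_flat)
```

The three sums come out to about 2.4, 4.0, and 6.0. For this data, no other choice of a and b gives a total below the first value.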

The line has the form ŷ = a + bx, where ŷ (read "y-hat") is the *predicted* value of y for a given x. The slope b and intercept a are chosen to minimize Σ(yᵢ − ŷᵢ)², the sum of squared residuals. Why squared? Squaring makes every term nonnegative (so negative and positive errors don't cancel), and it penalizes large errors more than small ones. The algebra leads to a clean formula for the slope: b = r × (s_y / s_x), where r is the correlation coefficient you already know, s_y is the standard deviation of y, and s_x is the standard deviation of x. This formula shows how tightly regression connects to correlation: if r = 1, the slope is exactly s_y / s_x; if r = 0, the slope is 0 and the best prediction for y is just ȳ regardless of x.
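For readers who want to see the algebra, one standard route is sketched below. It assumes the usual sample definitions, which the text above does not spell out: s_x² = Σ(xᵢ − x̄)² / (n − 1), likewise for s_y, and r = Σ(xᵢ − x̄)(yᵢ − ȳ) / ((n − 1) s_x s_y). The symbol S for the sum of squared residuals is introduced here only for the derivation.

```latex
\begin{aligned}
S(a, b) &= \sum_{i=1}^{n} (y_i - a - b x_i)^2 \\
\frac{\partial S}{\partial a} = 0
  &\;\Rightarrow\; \sum_{i=1}^{n} (y_i - a - b x_i) = 0
  \;\Rightarrow\; a = \bar{y} - b\,\bar{x} \\
\frac{\partial S}{\partial b} = 0
  &\;\Rightarrow\; \sum_{i=1}^{n} x_i \,(y_i - a - b x_i) = 0 \\
  &\;\Rightarrow\; b
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
  = \frac{(n-1)\, r\, s_x s_y}{(n-1)\, s_x^2}
  = r\,\frac{s_y}{s_x}
\end{aligned}
```

The third line becomes the fourth by substituting a = ȳ − b x̄ and using the fact that deviations from the mean sum to zero. The factor (n − 1) cancels, so the result is the same if the standard deviations and r are all computed with n instead.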

The intercept follows from a key property of the regression line: it always passes through the point (x̄, ȳ), the means of both variables. Once you have the slope b, the intercept is a = ȳ − b × x̄. This means the regression line is anchored at the center of the data and tilted according to the correlation and spread. Interpreting the slope: b says "for every one-unit increase in x, the predicted y changes by b units." Interpreting the intercept: a is the predicted y when x = 0, which may or may not be meaningful depending on whether x = 0 is in the range of your data.
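The two formulas can be tried directly. The sketch below uses only Python's standard library and invented data. The helper names `correlation` and `fit_line` are hypothetical, and it assumes sample standard deviations (dividing by n − 1).

```python
from statistics import mean, stdev

# Hypothetical paired data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def correlation(xs, ys):
    """Sample correlation coefficient r (n - 1 convention)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))


def fit_line(xs, ys):
    """Return (a, b) using b = r * s_y / s_x and a = y_bar - b * x_bar."""
    b = correlation(xs, ys) * stdev(ys) / stdev(xs)
    a = mean(ys) - b * mean(xs)
    return a, b


a, b = fit_line(xs, ys)
print("intercept:", a, "slope:", b)

# The fitted line evaluated at x_bar should reproduce y_bar.
print("line at x_bar:", a + b * mean(xs), "y_bar:", mean(ys))
```

For this data the slope is about 0.6 and the intercept about 2.2, so each one-unit increase in x raises the predicted y by roughly 0.6, and the line passes through (3, 4), the point of means.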

Two important limitations apply. First, regression describes association, not causation. Height and shoe size are correlated; fitting a regression doesn't mean height *causes* shoe size. Second, extrapolation (predicting y for an x value far outside your data range) is unreliable. The linear relationship observed in your data may not hold beyond it. A regression of height vs. weight in adults would give nonsense predictions for newborns. The regression line is a summary of the data you have, not a universal law, and the residual yᵢ − ŷᵢ for each point quantifies how far reality deviates from that summary.
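Residuals and the extrapolation warning can both be checked mechanically. This is a rough sketch on invented data; `fit_line` and `predict` are hypothetical helpers. The slope is computed in its covariance form, which is algebraically the same as r × (s_y / s_x), and the "outside the observed range" test is one simple convention for flagging extrapolation, not a statistical guarantee.

```python
from statistics import mean

# Hypothetical paired data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_line(xs, ys):
    """Least-squares intercept and slope (covariance form of the slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    return a, b


def predict(a, b, x, xs):
    """Return (predicted y, True if x lies outside the observed x range)."""
    is_extrapolation = x < min(xs) or x > max(xs)
    return a + b * x, is_extrapolation


a, b = fit_line(xs, ys)

# Residual = observed y minus fitted y, one per data point.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print("residuals:", residuals)
print("sum of residuals:", sum(residuals))

print(predict(a, b, 3.5, xs))   # inside the data range
print(predict(a, b, 50.0, xs))  # far outside: flagged as extrapolation
```

The residuals of a least-squares line with an intercept sum to zero (up to floating-point rounding), which is another way of saying the line is anchored at (x̄, ȳ). The flag on the second prediction is a reminder that the number returned there rests on an untested assumption that the linear pattern continues.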

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Independence of Events → Sampling Distributions → Standard Error of Estimators → Hypothesis Testing: Framework and Logic → P-values and Statistical Significance → Effect Size and Practical Significance → Hypothesis Testing: Framework and Logic → Z-Tests and T-Tests for Means → One-Sample Z-Test for Means → One-Sample and Two-Sample T-Tests → Inference in Linear Regression → Prediction Intervals in Regression → Linear Regression Basics

Longest path: 106 steps · 551 total prerequisite topics

Prerequisites (4)

Correlation Coefficient (hard) · Simple Linear Regression (soft) · Prediction Intervals in Regression (soft) · Least Squares Estimation (soft)

Leads To (2)

Residuals and Goodness of Fit (R²) (hard) · Simple Linear Regression: Theory and Estimation (soft)