Simple (Bivariate) OLS Regression

College · Depth 64 in the knowledge graph
Unlocks 158 downstream topics
OLS regression estimation

Core Idea

Simple OLS regression fits the line ŷ = β̂₀ + β̂₁x that minimizes the sum of squared residuals, the squared differences between observed and predicted values of y. The slope estimator β̂₁ equals the sample covariance of x and y divided by the sample variance of x, Cov(x,y)/Var(x), and captures how much y is predicted to change per unit increase in x. OLS is the default workhorse of empirical economics because it is computationally tractable and, under the Gauss-Markov assumptions, is unbiased and has the smallest variance among linear unbiased estimators. The intercept β̂₀ gives the predicted value of y when x equals zero, though this is often not economically meaningful.
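As a minimal sketch, the slope formula and the companion intercept formula β̂₀ = ȳ − β̂₁x̄ can be computed directly with NumPy. The function name simple_ols and the education and wage numbers are made up for illustration; they are not from any real dataset.

```python
import numpy as np

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line for 1-D data."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # Slope: sample covariance of x and y over sample variance of x.
    # The 1/(n-1) factors cancel, so sums of deviations are enough.
    slope = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # Intercept: forces the line through the point of means (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data: years of education and hourly wage.
educ = np.array([8, 10, 12, 12, 14, 16, 16, 18])
wage = np.array([9.0, 11.5, 13.0, 14.5, 16.0, 19.5, 21.0, 24.0])

b0, b1 = simple_ols(educ, wage)
residuals = wage - (b0 + b1 * educ)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"sum of residuals = {residuals.sum():.2e}")  # zero up to rounding
```

The residuals of the fitted line sum to zero (up to floating-point error), which is a consequence of including an intercept. The same coefficients should come out of np.polyfit(educ, wage, 1), which returns the slope first and the intercept second.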

How It's Best Learned

Derive the OLS formulas by hand from the minimization problem before using software. Then replicate published regressions using a dataset such as wage-education data to see how coefficient interpretation works in context.

Explainer

Simple OLS regression answers a basic but important question: given data on two variables x and y, what is the best-fitting straight line through those points, and what does the slope of that line tell us? You have already worked with the correlation coefficient, which measures the strength and direction of a linear relationship. OLS regression goes further — it produces an actual line with a quantified slope that can be used to predict y from x and to estimate by how much y is expected to change for each unit increase in x.

The "best-fitting" line is defined precisely as the one that minimizes the sum of squared residuals: the sum of the squared vertical distances between each observed data point and the corresponding point on the line. This criterion is not arbitrary. Summing raw (unsquared) residuals fails because positive and negative errors cancel, so a badly fitting line can have the same total as a well-fitting one. Squaring forces all residuals to contribute positively, and as long as x takes at least two distinct values, exactly one line minimizes this sum: the OLS line. Taking the partial derivatives of the sum-of-squares expression with respect to the intercept and the slope, setting both to zero, and solving yields closed-form formulas: β̂₁ = Cov(x, y) / Var(x) and β̂₀ = ȳ − β̂₁x̄. These are the OLS estimators.
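The derivation described above can be written out as follows. This is a sketch; the symbol S for the sum of squared residuals and the index i = 1, …, n over observations are notation introduced here for illustration.

```latex
\begin{align*}
S(\beta_0, \beta_1) &= \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2 \\[4pt]
\frac{\partial S}{\partial \beta_0} = 0 :\quad
  & -2 \sum_{i=1}^{n} \left( y_i - \hat\beta_0 - \hat\beta_1 x_i \right) = 0
  \;\Longrightarrow\; \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} \\[4pt]
\frac{\partial S}{\partial \beta_1} = 0 :\quad
  & -2 \sum_{i=1}^{n} x_i \left( y_i - \hat\beta_0 - \hat\beta_1 x_i \right) = 0 \\[4pt]
\text{substitute } \hat\beta_0 :\quad
  & \sum_{i=1}^{n} x_i \left( y_i - \bar{y} \right)
    - \hat\beta_1 \sum_{i=1}^{n} x_i \left( x_i - \bar{x} \right) = 0 \\[4pt]
\hat\beta_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
                    {\sum_{i=1}^{n} (x_i - \bar{x})^2}
             = \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)}
\end{align*}
```

The last step uses the identities Σ xᵢ(yᵢ − ȳ) = Σ (xᵢ − x̄)(yᵢ − ȳ) and Σ xᵢ(xᵢ − x̄) = Σ (xᵢ − x̄)², which hold because deviations from a mean sum to zero. The denominator is nonzero exactly when x is not constant in the sample.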

The slope β̂₁ has a clean interpretation: it is the predicted change in y for a one-unit increase in x. The usual qualifier "holding everything else constant" does not apply here, because a bivariate model has no other regressors to hold fixed; the slope simply captures the average linear relationship between the two variables. The intercept β̂₀ is the predicted value of y when x = 0, which is mathematically necessary but often economically meaningless (e.g., the predicted wage when education = 0 years), especially when x = 0 lies outside the range of the data. Because β̂₁ = Cov(x,y)/Var(x), it is the covariance of x and y normalized by the variance of x. For a given Var(x), a larger covariance in absolute value means a steeper slope; for a given covariance, a larger variance in x (more spread in the predictor) means a flatter slope.
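A short sketch can make the role of Var(x) concrete. The data below are simulated with arbitrary, made-up parameters, and the helper name slope is hypothetical. The block checks two facts: the slope equals the correlation coefficient scaled by the ratio of standard deviations, and re-expressing x in smaller units (years to months) shrinks the slope by the same factor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (illustrative parameters only): x in years.
x = rng.normal(12, 2, size=500)
y = 2.0 + 1.5 * x + rng.normal(0, 3, size=500)

def slope(x, y):
    """OLS slope as sample covariance over sample variance."""
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

b1 = slope(x, y)

# Link to the correlation coefficient: slope = r * (s_y / s_x).
r = np.corrcoef(x, y)[0, 1]
b1_from_r = r * np.std(y, ddof=1) / np.std(x, ddof=1)

# Re-express x in months. Cov(x, y) is multiplied by 12 and Var(x)
# by 144, so the slope is divided by 12.
x_months = 12 * x
b1_months = slope(x_months, y)

print(f"slope per year:  {b1:.4f}")
print(f"r * s_y / s_x:   {b1_from_r:.4f}")
print(f"slope per month: {b1_months:.4f}")
```

The fitted line describes the same relationship in both cases; only the unit attached to "a one-unit increase in x" has changed, which is why a slope should always be reported together with the units of x and y.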

The most important limitation of OLS is that it estimates a linear approximation to the conditional mean of y given x, not a causal effect. The regression line tells you that, on average in your data, a one-unit increase in x is associated with a β̂₁-unit change in y. It does not tell you that changing x causes y to change by that amount. If education and wages are positively correlated, OLS will give a positive slope, but that slope bundles together every reason why more-educated people earn more, including unobserved factors like family background or ability that are correlated with both education and wages. Establishing causality requires additional assumptions or research designs (instrumental variables, natural experiments, randomized controlled trials) that you will study in later topics. For now, treat OLS as a tool for describing associations precisely, which is already enormously useful.
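The gap between association and causation can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the variable names, the coefficients, and the data-generating process in which an unobserved "ability" term raises both education and wages. The causal effect of education on wages is set to 1.0 by construction.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical data-generating process (all numbers are made up).
ability = rng.normal(0, 1, size=n)                 # unobserved confounder
educ = 12 + 2.0 * ability + rng.normal(0, 1, size=n)
true_effect = 1.0                                  # causal effect of educ
wage = 5 + true_effect * educ + 3.0 * ability + rng.normal(0, 1, size=n)

# Bivariate OLS slope of wage on educ, ignoring ability.
ols_slope = np.cov(educ, wage, ddof=1)[0, 1] / np.var(educ, ddof=1)

# In this setup Var(educ) = 5 and Cov(educ, wage) = 5 + 3 * 2 = 11,
# so the population slope is 11 / 5 = 2.2 rather than 1.0.
print(f"true causal effect: {true_effect:.2f}")
print(f"bivariate OLS slope: {ols_slope:.2f}")
```

The estimated slope lands near 2.2, more than double the built-in effect, because it also picks up the part of the wage difference that is due to ability. The slope is still a correct description of the association in the data; it is just not the causal effect.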

Prerequisite Chain

Counting to 10 → Counting to 20 → Understanding Zero → The Number Zero → Counting to Five → One-to-One Correspondence → Combining Small Groups Within 5 → Addition Within 10 → Addition Within 20 → Two-Digit Addition Without Regrouping → Two-Digit Addition with Regrouping → Addition Within 100 → Repeated Addition as Multiplication → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Two-Digit by One-Digit Division → Division with Remainders → Remainders and Quotients in Division → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Reading and Writing Decimals → Comparing and Ordering Decimals → Adding and Subtracting Decimals → Multiplying Decimals → Dividing Decimals → Dividing Fractions → Mixed Number Arithmetic → Order of Operations → Integer Order of Operations → Variable Expressions → Combining Like Terms → One-Step Equations → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Literal Equations → Slope-Intercept Form → Point-Slope Form → Writing Linear Equations → Parallel and Perpendicular Line Slopes → Graphing Linear Equations → Piecewise Functions → Step Functions → Composition of Functions → Inverse Functions → Radical Functions and Graphs → Rational Exponents → Exponential Functions and Graphs → Geometric Sequences and Series → Sigma Notation → Expected Value → Variance and Standard Deviation of Random Variables → Simple (Bivariate) OLS Regression

Longest path: 65 steps · 311 total prerequisite topics
