A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

R-Squared and Model Fit

College · Depth 113 in the knowledge graph

103 topics build on this

582 prerequisites beneath it

Core Idea

R² measures the fraction of the sample variation in y explained by the regressors: R² = 1 − SSR/SST, where SSR is the sum of squared residuals and SST is the total sum of squares, Σ(yᵢ − ȳ)². For OLS with an intercept it lies between 0 and 1, and adding any regressor, even an irrelevant one, cannot decrease it. The adjusted R² penalizes additional regressors, making it more appropriate for comparing models of the same dependent variable: R̄² = 1 − [SSR/(n−k−1)]/[SST/(n−1)], where n is the number of observations and k is the number of regressors (excluding the intercept). A high R² does not imply unbiased coefficient estimates; a low R² does not imply the estimates are wrong or the model is useless for causal inference.

How It's Best Learned

Compare R² and adjusted R² across nested models (same data and dependent variable, different regressors). Note that adding noise variables never lowers R² and usually raises it slightly, while R̄² can fall.

Common Misconceptions

A low R² (e.g., 0.05) does not invalidate a regression — causal identification is about E[u|x]=0, not explained variance.
R² is not comparable across datasets or when the dependent variable is transformed (e.g., log y vs y).
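
The second point can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are simulated (the coefficients 0.5 and 0.8, the noise level, and the helper name ols_r2 are all hypothetical choices, not taken from any real dataset). It fits the same regressor to y and to log y and shows that the two regressions have different SST, so their R² values answer different questions.

```python
import numpy as np

def ols_r2(x, y):
    """Return (R^2, SST) for an OLS fit of y on an intercept and x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ssr = float(resid @ resid)                   # unexplained variation
    sst = float(np.sum((y - y.mean()) ** 2))     # total variation in THIS outcome
    return 1.0 - ssr / sst, sst

# Simulated data (assumption): log y is linear in x, so y itself is not.
rng = np.random.default_rng(42)
n = 5000
x = rng.uniform(0.0, 3.0, size=n)
log_y = 0.5 + 0.8 * x + rng.normal(0.0, 0.5, size=n)
y = np.exp(log_y)

r2_level, sst_level = ols_r2(x, y)       # dependent variable: y
r2_log, sst_log = ols_r2(x, log_y)       # dependent variable: log y

print(f"y as outcome:     R^2 = {r2_level:.3f}, SST = {sst_level:.1f}")
print(f"log y as outcome: R^2 = {r2_log:.3f}, SST = {sst_log:.1f}")
```

Because each R² is a share of a different total (variation in y versus variation in log y), a higher value in one specification is not evidence that it fits better than the other.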

Explainer

From bivariate regression, you learned how to fit a line through data by minimizing squared residuals — the vertical distances between data points and the fitted line. Those residuals capture what the model fails to explain. R² formalizes this intuition into a single summary statistic: the fraction of the total variation in y that your regression accounts for.

The formula makes the decomposition explicit. Total sum of squares (SST) = Σ(yᵢ − ȳ)² measures the total variation in the outcome around its unconditional mean. Residual sum of squares (SSR) = Σ(yᵢ − ŷᵢ)² is the unexplained variation that remains after fitting the model. R² = 1 − SSR/SST. When the model perfectly fits every data point, SSR = 0 and R² = 1. When the model simply predicts the mean for every observation (no regressors at all), SSR = SST and R² = 0. An R² of 0.60 means the regressors collectively account for 60% of the variation in y; the remaining 40% is unexplained.
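
As a minimal sketch of this decomposition, the following computes SST, SSR, and R² by hand with NumPy. The five data points are made-up illustrative numbers and the function name r_squared is an assumption of this example; the two boundary cases from the paragraph above (mean-only prediction and a perfect fit) are included as checks.

```python
import numpy as np

def r_squared(y, y_hat):
    """R^2 = 1 - SSR/SST for observed y and fitted values y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ssr = np.sum((y - y_hat) ** 2)       # residual sum of squares
    sst = np.sum((y - y.mean()) ** 2)    # total sum of squares around the mean
    return 1.0 - ssr / sst

# Made-up illustrative data: y is roughly 2x plus small deviations.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# OLS with an intercept, solved by least squares.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

r2_fit = r_squared(y, y_hat)                              # fitted line
r2_mean_only = r_squared(y, np.full_like(y, y.mean()))    # predict the mean: SSR = SST
r2_perfect = r_squared(y, y)                              # perfect fit: SSR = 0

print(f"fitted line: {r2_fit:.4f}")
print(f"mean only:   {r2_mean_only:.4f}")
print(f"perfect fit: {r2_perfect:.4f}")
```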

A crucial mechanical fact: adding any variable to a regression can never decrease R². OLS can always set a new coefficient to zero if the variable adds nothing, so SSR can only stay flat or fall, meaning R² can only stay flat or rise. This is why comparing R² across models with different numbers of predictors is misleading: you could push R² to 0.99 or higher simply by including enough noise variables. Adjusted R² corrects for this by penalizing the loss of degrees of freedom: R̄² = 1 − [SSR/(n−k−1)] / [SST/(n−1)], where n is the number of observations and k is the number of regressors (excluding the intercept). Because the denominator n−k−1 shrinks with each added regressor, a variable that reduces SSR only slightly lowers R̄², making it a better model comparison tool than raw R².
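
The nested-model comparison can be sketched in a few lines. Everything here is an assumption made for illustration: the data are simulated, one regressor truly matters, and the remaining columns are pure noise; the helper name r2_and_adjusted is hypothetical. R² is guaranteed not to fall as columns are added, while adjusted R² carries a penalty and will often decline.

```python
import numpy as np

def r2_and_adjusted(X, y):
    """Return (R^2, adjusted R^2) for OLS of y on an intercept plus the columns of X."""
    n, k = X.shape                                   # k regressors, intercept excluded
    Xc = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    ssr = float(resid @ resid)
    sst = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ssr / sst
    adj_r2 = 1.0 - (ssr / (n - k - 1)) / (sst / (n - 1))
    return r2, adj_r2

# Simulated data (assumption): only x affects y; the noise columns are irrelevant.
rng = np.random.default_rng(0)
n = 60
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
noise = rng.normal(size=(n, 20))

# Nested models: x alone, then x plus the first 5, 10, 15, 20 noise columns.
results = []
for m in range(0, 21, 5):
    X = np.column_stack([x, noise[:, :m]])
    r2, adj_r2 = r2_and_adjusted(X, y)
    results.append((1 + m, r2, adj_r2))

for k, r2, adj_r2 in results:
    print(f"k = {k:2d}   R^2 = {r2:.4f}   adjusted R^2 = {adj_r2:.4f}")
```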

The deepest point, and the most consequential misconception, is that R² says nothing about whether your regression is correctly specified for causal inference. The key OLS assumption for unbiased estimation is E[u|x] = 0: the error term has mean zero conditional on the regressors, which implies (and is stronger than) the regressors being uncorrelated with the error. R² measures explained variance regardless of whether this assumption holds. You can have R² = 0.95 with severe omitted variable bias, and R² = 0.04 with a clean randomized experiment delivering unbiased coefficients. As you move further into econometrics, you will regularly see researchers report very low R² without apology: they are pursuing credible identification of a causal effect, not maximizing explained variance. The two goals are genuinely separate.
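
A small simulation makes the separation concrete. This is a sketch under assumed data-generating processes, not a result from any real study: in case A a randomly assigned treatment has a true effect of 0.5 on a very noisy outcome; in case B the true coefficient on x is 1.0, but an omitted variable z drives both x and y. All parameter values and the helper name slope_and_r2 are hypothetical.

```python
import numpy as np

def slope_and_r2(x, y):
    """Bivariate OLS with an intercept: return (slope estimate, R^2)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - float(resid @ resid) / float(np.sum((y - y.mean()) ** 2))
    return float(beta[1]), r2

rng = np.random.default_rng(1)
n = 100_000

# Case A (assumed): randomized treatment d, true effect 0.5, very noisy outcome.
# Random assignment makes E[u|d] = 0 hold by construction.
d = rng.integers(0, 2, size=n).astype(float)
y_a = 0.5 * d + rng.normal(0.0, 5.0, size=n)
slope_a, r2_a = slope_and_r2(d, y_a)

# Case B (assumed): true coefficient on x is 1.0, but z affects both x and y
# and is left out of the regression, so E[u|x] = 0 fails.
z = rng.normal(size=n)
x = z + 0.3 * rng.normal(size=n)
y_b = 1.0 * x + 5.0 * z + 0.5 * rng.normal(size=n)
slope_b, r2_b = slope_and_r2(x, y_b)

print(f"A, randomized:       slope = {slope_a:.3f} (truth 0.5), R^2 = {r2_a:.4f}")
print(f"B, omitted variable: slope = {slope_b:.3f} (truth 1.0), R^2 = {r2_b:.4f}")
```

Case A recovers the true effect closely despite an R² near zero, while case B reports a high R² alongside a slope far from the true value.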

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Independence of Events → Sampling Distributions → Standard Error of Estimators → Hypothesis Testing: Framework and Logic → P-values and Statistical Significance → Effect Size and Practical Significance → Hypothesis Testing: Framework and Logic → Z-Tests and T-Tests for Means → One-Sample Z-Test for Means → One-Sample and Two-Sample T-Tests → Inference in Linear Regression → Prediction Intervals in Regression → Linear Regression Basics → Residuals and Goodness of Fit (R²) → Simple (Bivariate) OLS Regression → Classical OLS Assumptions (Gauss-Markov) → Multiple Regression → Interpreting Regression Coefficients → Hypothesis Testing in Regression → F-Test and Joint Significance → R-Squared and Model Fit

Longest path: 114 steps · 582 total prerequisite topics

Prerequisites (4)

Simple (Bivariate) OLS Regression (hard)
Residuals and Goodness of Fit (R²) (hard)
F-Test and Joint Significance (soft)
Correlation Coefficient (soft)

Leads To (4)

Cross-Validation and Out-of-Sample Model Evaluation (hard)
Information Criteria: AIC and BIC for Model Selection (hard)
Multicollinearity (soft)
Omitted Variable Bias (soft)