Two-Stage Least Squares (2SLS)

College · Depth 87 in the knowledge graph
Unlocks 7 downstream topics
2SLS IV-estimation first-stage weak-instruments

Core Idea

Two-Stage Least Squares (2SLS) is the standard method for IV estimation with one or more instruments. In the first stage, regress the endogenous variable x on all instruments z and the exogenous controls to obtain fitted values x̂. In the second stage, regress y on x̂ and the same controls; the coefficient on x̂ is the 2SLS estimate of the causal effect of x. The first-stage F-statistic on the excluded instruments (rule of thumb: F > 10) tests instrument relevance; a weak first stage biases 2SLS toward OLS and makes its standard errors large and unreliable. With more instruments than endogenous regressors, the overidentification J-test (Hansen-Sargan) provides a partial check on the exclusion restriction.

How It's Best Learned

Implement 2SLS by hand (running two OLS regressions) and then compare to software IV output — note that the second-stage standard errors must be corrected and cannot be taken from the manual second-stage OLS.

Explainer

You already know from instrumental variables that when the key regressor x is endogenous — correlated with the error term because of omitted variables, reverse causality, or measurement error — OLS produces biased and inconsistent estimates. An instrument z provides a solution: a variable that affects x (relevance) but affects the outcome y only through x (exclusion restriction). Two-stage least squares is the mechanical procedure for implementing this idea when you have one or more instruments in hand.

The first stage is an ordinary OLS regression: regress x on the instrument z and all exogenous controls. The fitted values, x̂, isolate the variation in x that is predicted by the instrument and the controls. Because z is exogenous (uncorrelated with the error term, as the exclusion restriction requires) and x̂ is a linear combination of z and the controls, x̂ is also uncorrelated with the error term, apart from sampling noise in the estimated first-stage coefficients. You have essentially purged x of its endogenous component, keeping only the "clean" variation attributable to z.
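A minimal sketch of the first stage on simulated data may make the "purging" concrete. Everything in the data-generating process below (the confounder u, the coefficients, the sample size) is a hypothetical choice made for illustration, and the structural error e is observable only because the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process (all coefficients are illustrative).
u = rng.normal(size=n)                            # unobserved confounder
z = rng.normal(size=n)                            # instrument, independent of u
x = 1.0 + 0.8 * z + u + 0.5 * rng.normal(size=n)  # endogenous regressor
e = -1.5 * u + rng.normal(size=n)                 # structural error, tied to x through u

# First stage: OLS of x on a constant and the instrument.
Z = np.column_stack([np.ones(n), z])
pi_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)
x_hat = Z @ pi_hat   # the part of x predicted by z
v_hat = x - x_hat    # the leftover part, which carries the endogeneity

# In real data e is never observed; the simulation lets us check directly.
corr_x_e = np.corrcoef(x, e)[0, 1]
corr_xhat_e = np.corrcoef(x_hat, e)[0, 1]
print(f"first-stage slope: {pi_hat[1]:.3f}")
print(f"corr(x, e)     = {corr_x_e:.3f}")
print(f"corr(x_hat, e) = {corr_xhat_e:.3f}")
```

In this setup x is clearly correlated with the error, while x̂ is correlated with it only through sampling noise.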

The second stage is another OLS regression: regress y on x̂ (plus the same controls). The coefficient on x̂ is your 2SLS estimate of the causal effect. The logic is clean: x̂ has been stripped of the problematic correlation with the error term, so regressing y on x̂ recovers a consistent estimate of how x causally affects y. (2SLS is not unbiased in finite samples; its bias shrinks as the sample grows and the first stage strengthens.) With multiple endogenous variables, you need at least as many excluded instruments as endogenous regressors. This is the order condition for identification, which is necessary but not sufficient.
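The two stages can be sketched end to end on simulated data, assuming a single instrument and a true effect of 2.0. The helper name `ols`, the coefficients, and the confounder u are all hypothetical choices for illustration.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients from regressing y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical data-generating process; the true causal effect of x is 2.0.
u = rng.normal(size=n)                            # unobserved confounder
z = rng.normal(size=n)                            # instrument
x = 1.0 + 0.8 * z + u + 0.5 * rng.normal(size=n)  # endogenous regressor
y = 0.5 + 2.0 * x - 1.5 * u + rng.normal(size=n)  # outcome

const = np.ones(n)

# Plain OLS of y on x: contaminated by the confounder.
beta_ols = ols(np.column_stack([const, x]), y)[1]

# Stage 1: fitted values of x from the instrument.
Z = np.column_stack([const, z])
x_hat = Z @ ols(Z, x)

# Stage 2: regress y on the fitted values.
beta_2sls = ols(np.column_stack([const, x_hat]), y)[1]

# With one instrument and one endogenous regressor, 2SLS reduces to
# the ratio cov(z, y) / cov(z, x).
beta_ratio = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

print(f"OLS:   {beta_ols:.3f}")
print(f"2SLS:  {beta_2sls:.3f}")
print(f"ratio: {beta_ratio:.3f}")
```

Under these assumptions OLS lands well below the true effect, while the two-stage estimate and the covariance ratio coincide and sit close to 2.0.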

Two diagnostics are essential. First, the first-stage F-statistic on the excluded instruments tests instrument relevance: a weak instrument, one that explains little of the variation in x, creates a "weak instruments" problem in which 2SLS estimates are biased toward OLS and standard errors become very large and unreliable. The rule of thumb F > 10 is a minimum threshold, not a guarantee of strength. Second, when you have more instruments than endogenous variables (overidentification), the Hansen-Sargan J-test provides a partial check on exclusion: if all the instruments are valid, they should produce the same coefficient estimate, up to sampling error, regardless of which instrument you use. A significant J-test statistic suggests at least one instrument violates the exclusion restriction, though it cannot tell you which one, and a passing J-test does not prove validity.
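Both diagnostics can be sketched by hand. The code below assumes homoskedastic errors and one endogenous regressor with two instruments, and uses the Sargan form of the J-statistic (n times the R-squared from regressing the 2SLS residuals on all exogenous variables). The function names and the simulated data, including the deliberately invalid instrument, are hypothetical.

```python
import math
import numpy as np

def ols(X, y):
    """Return (coefficients, residuals) from regressing y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def first_stage_f(x, controls, instruments):
    """F-statistic for jointly excluding the instruments from the first stage."""
    n = len(x)
    full = np.column_stack([controls, instruments])
    _, r_restricted = ols(controls, x)
    _, r_full = ols(full, x)
    q = instruments.shape[1]
    ssr_r, ssr_u = r_restricted @ r_restricted, r_full @ r_full
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - full.shape[1]))

def sargan_j(y, x, controls, instruments):
    """Sargan statistic n * R^2 (one endogenous regressor, homoskedastic errors)."""
    n = len(y)
    Z = np.column_stack([controls, instruments])
    pi, _ = ols(Z, x)
    beta, _ = ols(np.column_stack([controls, Z @ pi]), y)
    resid = y - np.column_stack([controls, x]) @ beta  # built from the actual x
    _, e = ols(Z, resid)
    r2 = 1.0 - (e @ e) / np.sum((resid - resid.mean()) ** 2)
    return max(n * r2, 0.0)

rng = np.random.default_rng(2)
n = 5_000
u = rng.normal(size=n)
z1, z2 = rng.normal(size=n), rng.normal(size=n)
x = 0.8 * z1 + 0.6 * z2 + u + 0.5 * rng.normal(size=n)
y_valid = 2.0 * x - 1.5 * u + rng.normal(size=n)
y_invalid = y_valid + 1.0 * z2   # z2 now affects y directly: exclusion fails

const = np.ones((n, 1))
instruments = np.column_stack([z1, z2])

f_stat = first_stage_f(x, const, instruments)
j_valid = sargan_j(y_valid, x, const, instruments)
j_invalid = sargan_j(y_invalid, x, const, instruments)

# One overidentifying restriction, so J is chi-squared with 1 degree of
# freedom under the null; for 1 df the p-value is erfc(sqrt(J / 2)).
p_valid = math.erfc(math.sqrt(j_valid / 2))
p_invalid = math.erfc(math.sqrt(j_invalid / 2))

print(f"first-stage F: {f_stat:.1f}")
print(f"J (valid instruments):   {j_valid:.2f}, p = {p_valid:.3f}")
print(f"J (one invalid):         {j_invalid:.2f}, p = {p_invalid:.3g}")
```

The heteroskedasticity-robust Hansen version of the test uses a different weighting matrix and is best left to a dedicated IV routine.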

The critical implementation note: if you run the two regressions manually in software, the point estimate in the second stage is correct, but the standard errors are wrong. The manual second-stage OLS computes its residuals from x̂ rather than from the actual x, so its estimate of the error variance, and every standard error built on it, is incorrect. Always use your software's dedicated IV or 2SLS estimation routine, which computes the correct asymptotic standard errors from the full 2SLS formula.
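The difference between the two sets of standard errors can be seen in a short sketch. It assumes homoskedastic errors and an n − k degrees-of-freedom correction (software packages differ on that correction), and the simulated data are hypothetical. The only change between the two calculations is whether the residuals are built from x̂ or from the actual x.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000

# Hypothetical data-generating process with a true effect of 2.0.
u = rng.normal(size=n)
z = rng.normal(size=n)
x = 1.0 + 0.8 * z + u + 0.5 * rng.normal(size=n)
y = 0.5 + 2.0 * x - 1.5 * u + rng.normal(size=n)

const = np.ones(n)
Z = np.column_stack([const, z])
X = np.column_stack([const, x])

# Manual two stages.
pi_hat, *_ = np.linalg.lstsq(Z, x, rcond=None)
X_hat = np.column_stack([const, Z @ pi_hat])
beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)

k = X.shape[1]
xtx_inv_diag = np.diag(np.linalg.inv(X_hat.T @ X_hat))

# Naive: what the manual second-stage OLS reports (residuals use x_hat).
resid_naive = y - X_hat @ beta
se_naive = np.sqrt(resid_naive @ resid_naive / (n - k) * xtx_inv_diag)

# 2SLS: same coefficients, but residuals use the actual x.
resid_2sls = y - X @ beta
se_2sls = np.sqrt(resid_2sls @ resid_2sls / (n - k) * xtx_inv_diag)

print(f"2SLS estimate:        {beta[1]:.3f}")
print(f"naive second-stage SE: {se_naive[1]:.4f}")
print(f"2SLS SE:               {se_2sls[1]:.4f}")
```

In this particular simulation the naive standard error comes out too small; in other settings it can be too large, which is why neither direction can be assumed.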

Prerequisite Chain

Longest path: 88 steps · 429 total prerequisite topics