Causal Inference from Observational Data

Graduate · Depth 74 in the knowledge graph
Unlocks 15 downstream topics
causal-inference potential-outcomes confounding identification

Core Idea

Synthesizes strategies for inferring causation from observational data when randomization is impossible or unethical. Covers the causal hierarchy (association, experimental, natural experiment), potential outcomes framework, confounding, backdoor and frontdoor criteria, and conditions for causal identification.

How It's Best Learned

Draw directed acyclic graphs (DAGs) for research questions, identify confounders, write causal models, discuss identification assumptions, and evaluate whether different designs meet those assumptions.

Explainer

You have learned to run regressions and interpret correlations. But correlation is not causation. More usefully, there is now a rigorous mathematical framework for specifying exactly when and why an observed correlation can or cannot be interpreted causally. That framework is the subject of this topic.

The foundation is the potential outcomes framework, introduced by Jerzy Neyman and developed by Donald Rubin; Judea Pearl's structural causal models give a complementary, graph-based formulation of the same ideas. For any unit i and a binary treatment T, we define two potential outcomes: Y_i(1), what would happen to unit i if assigned to treatment, and Y_i(0), what would happen if not. The individual causal effect is the difference Y_i(1) − Y_i(0). The problem is that we observe only one of these: whichever corresponds to the treatment state that actually occurred. The other is a counterfactual, what would have happened in a world that did not occur. This is the fundamental problem of causal inference: it is a logical impossibility, not a data gap. No sample size, no matter how large, allows you to observe both potential outcomes for the same unit at the same time.
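The definitions above can be written compactly. The symbols \tau_i and Y_i^{obs} are notation introduced here for illustration, and the second equation assumes that a unit's observed outcome equals its potential outcome under the treatment it actually received (often called consistency):

```latex
\tau_i = Y_i(1) - Y_i(0), \qquad
Y_i^{\mathrm{obs}} = T_i \, Y_i(1) + (1 - T_i) \, Y_i(0), \qquad
\mathrm{ATE} = \mathbb{E}\bigl[ Y_i(1) - Y_i(0) \bigr]
```

Because T_i is either 0 or 1, exactly one of the two terms in Y_i^{obs} survives for each unit, so \tau_i is never observed directly. Averages such as the ATE are the usual targets instead.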

Randomization solves this problem in expectation. If treatment assignment is truly random, then the treated and untreated groups are identical in expectation across all observed and unobserved characteristics. The observed difference in outcomes is then an unbiased estimate of the average treatment effect. But randomization is often impossible — you cannot randomly assign people to smoke, grow up poor, or experience a policy implemented everywhere simultaneously. Most data is observational, and observational data requires you to defend causal identification through explicit design arguments.
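A small simulation can make the contrast concrete. The sketch below is illustrative only: the data-generating process, the effect size of 2.0, and the function name `simulate` are assumptions made here, not part of the original text. It generates both potential outcomes for every unit (something only a simulation allows), then compares the naive difference in means under coin-flip assignment and under self-selection driven by a background trait.

```python
import numpy as np

def simulate(n=200_000, seed=0):
    """Return (randomized_estimate, self_selected_estimate) of a true effect of 2.0."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)              # background trait that also drives the outcome
    y0 = u + rng.normal(size=n)         # potential outcome without treatment
    y1 = y0 + 2.0                       # potential outcome with treatment (assumed constant effect)

    # Randomized assignment: a fair coin flip, independent of u
    t_rand = rng.integers(0, 2, size=n)

    # Self-selected assignment: units with higher u are more likely to be treated
    p_treat = 1.0 / (1.0 + np.exp(-2.0 * u))
    t_conf = (rng.random(n) < p_treat).astype(int)

    def diff_in_means(t):
        y_obs = np.where(t == 1, y1, y0)    # only one potential outcome is revealed per unit
        return y_obs[t == 1].mean() - y_obs[t == 0].mean()

    return diff_in_means(t_rand), diff_in_means(t_conf)

if __name__ == "__main__":
    rand_est, conf_est = simulate()
    print(f"randomized:    {rand_est:.3f}")
    print(f"self-selected: {conf_est:.3f}")
```

Under these assumptions the randomized estimate should land close to 2.0, while the self-selected estimate is inflated because treated units have higher values of `u` on average.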

Directed acyclic graphs (DAGs) are the tool for making those arguments transparent. In a DAG, variables are nodes and causal relationships are directed arrows. A confounder is a common cause of both treatment and outcome that creates a non-causal association between them; the backdoor path it creates must be blocked by conditioning on it. A mediator lies on the causal path from treatment to outcome; conditioning on it blocks part of the effect you are trying to measure. A collider is caused by both treatment and outcome; conditioning on it opens a spurious path that was previously blocked, making the estimate worse, not better. The backdoor criterion formalizes which sets of variables, when conditioned on, close all non-causal paths without opening new ones. Getting this right requires understanding the data-generating process, not just running variable-selection algorithms.
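The difference between adjusting for a confounder and adjusting for a collider can be sketched in a linear simulation. Everything here is an assumption made for illustration: the DAG (Z → T, Z → Y, T → Y, T → C ← Y), the coefficients, a continuous treatment, and the helper names `ols_slope_on_t` and `demo`. Conditioning is done by including a variable as a regressor in ordinary least squares.

```python
import numpy as np

def ols_slope_on_t(y, t, *controls):
    """Coefficient on t from an OLS regression of y on [intercept, t, controls...]."""
    X = np.column_stack([np.ones_like(t), t, *controls])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

def demo(n=200_000, seed=1):
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                        # confounder: Z -> T and Z -> Y
    t = 0.8 * z + rng.normal(size=n)              # treatment
    y = 1.0 * t + 1.5 * z + rng.normal(size=n)    # outcome; assumed true effect of T is 1.0
    c = t + y + rng.normal(size=n)                # collider: T -> C <- Y
    return {
        "unadjusted": ols_slope_on_t(y, t),
        "adjust_for_confounder": ols_slope_on_t(y, t, z),
        "adjust_for_confounder_and_collider": ols_slope_on_t(y, t, z, c),
    }

if __name__ == "__main__":
    for name, est in demo().items():
        print(f"{name:38s} {est:.3f}")
```

With these assumed coefficients, the unadjusted slope is biased upward (about 1.73), adjusting for `z` recovers roughly 1.0, and additionally adjusting for the collider `c` drives the estimate toward 0. Adding a control made the answer worse, which is the point of the backdoor criterion.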

Three common misconceptions are worth internalizing directly. First, "correlation never implies causation" is too strong a rule: under the right design assumptions, observational correlations can be interpreted causally. The question is always whether those assumptions are defensible, not whether causation is categorically off the table. Second, "add more controls" is not always better; colliders are the clearest counterexample, and mediators are another. Third, "unconfoundedness can be tested" is wrong by construction: unconfoundedness is an assumption about unmeasured variables, and unmeasured variables cannot be used to test assumptions about themselves. What can be done is sensitivity analysis: assessing how strong an unmeasured confounder would need to be to overturn your conclusion. Honest causal work states assumptions clearly, defends them on substantive grounds, and reports what would falsify them.
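One widely used form of sensitivity analysis is the E-value of VanderWeele and Ding, which the original text does not name; it is swapped in here as one concrete example. For an observed risk ratio RR ≥ 1, the E-value is RR + sqrt(RR × (RR − 1)): the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both treatment and outcome to fully explain away the observed association. The function name `e_value` is a choice made for this sketch.

```python
import math

def e_value(rr: float) -> float:
    """E-value for an observed risk ratio (point estimate only)."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr          # protective effects: use the reciprocal so rr >= 1
    return rr + math.sqrt(rr * (rr - 1.0))

if __name__ == "__main__":
    for rr in (1.0, 1.5, 2.0, 0.5):
        print(f"RR = {rr:.2f}  ->  E-value = {e_value(rr):.3f}")
```

A risk ratio of 1.0 gives an E-value of 1.0 (no confounding is needed to explain a null association), and larger E-values mean a conclusion is harder to overturn. The number does not say whether such a confounder exists; that remains a substantive argument.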

Prerequisite Chain

Counting to 10 → Counting to 20 → Understanding Zero → The Number Zero → Counting to Five → One-to-One Correspondence → Combining Small Groups Within 5 → Addition Within 10 → Addition Within 20 → Two-Digit Addition Without Regrouping → Two-Digit Addition with Regrouping → Addition Within 100 → Repeated Addition as Multiplication → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Two-Digit by One-Digit Division → Division with Remainders → Remainders and Quotients in Division → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Reading and Writing Decimals → Comparing and Ordering Decimals → Adding and Subtracting Decimals → Multiplying Decimals → Dividing Decimals → Dividing Fractions → Mixed Number Arithmetic → Order of Operations → Integer Order of Operations → Variable Expressions → Combining Like Terms → One-Step Equations → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Causal Inference from Observational Data

Longest path: 75 steps · 430 total prerequisite topics

Prerequisites (6)

Leads To (8)