A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Interpretation and Marginal Effects in Nonlinear Models

Graduate · Depth 112 in the knowledge graph

1 topic builds on this

648 prerequisites beneath it

Core Idea

In logit, probit, and other nonlinear models, raw coefficients do not represent marginal effects on the outcome. The effect of a unit change in X depends on both the coefficient and the probability/density evaluated at specific covariate values.
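
In symbols, that dependence can be sketched as follows. The notation is an assumption made here for illustration: x'β is the linear index, Λ is the logistic CDF, and φ is the standard normal density.

```latex
\[
\frac{\partial \Pr(y = 1 \mid x)}{\partial x_j}
  = \beta_j \, \Lambda(x'\beta)\,\bigl[1 - \Lambda(x'\beta)\bigr]
  \quad \text{(logit)},
\qquad
\frac{\partial \Pr(y = 1 \mid x)}{\partial x_j}
  = \beta_j \, \phi(x'\beta)
  \quad \text{(probit)}.
\]
```

In both cases the coefficient is scaled by a factor that varies with x, so the effect on the probability is not constant.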

How It's Best Learned

Calculate marginal effects at the mean (MEM) and average marginal effects (AME) for a few key variables. Use plots to show how predicted probabilities change across the range of X.

Explainer

In a linear regression model, the coefficient β on a variable X has a clean interpretation: a one-unit increase in X shifts the predicted outcome by exactly β, regardless of where X starts, who the observation is, or what other variables look like. This constant-effect property is what makes linear regression coefficients so easy to communicate. Nonlinear models like logit and probit, which you studied as prerequisites, trade away this simplicity in exchange for a more appropriate model of binary outcomes — and the price is that interpretation requires an extra step.

In a logit model, the coefficient β on X tells you how much the log-odds (the log of the probability of success divided by the probability of failure) change for a one-unit increase in X. The log-odds scale is linear in the parameters, and the resulting log-likelihood is globally concave, which is why maximum likelihood estimation works cleanly. But log-odds are not probabilities, and the translation from log-odds to probabilities is nonlinear: it runs through the logistic function, which produces the familiar S-shaped curve. This means the effect of X on the probability of success depends on where on the S-curve you are sitting. Near the tails (very high or very low predicted probabilities), the curve is nearly flat, so a coefficient of 0.5 on X translates into a very small probability change (about 0.01 for a one-unit increase from a baseline of 0.02). Near the middle of the curve (baseline probability around 0.5), the same coefficient translates into a much larger probability change (about 0.12).
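
A minimal sketch of this point, using only the Python standard library: it takes the coefficient of 0.5 from the paragraph above and compares the probability change from a one-unit increase in X at three baseline probabilities. The baselines (0.02, 0.50, 0.98) and the helper names are chosen here for illustration.

```python
import math

def logistic(z):
    """Logistic CDF: maps log-odds z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Inverse of the logistic function: probability -> log-odds."""
    return math.log(p / (1.0 - p))

beta = 0.5  # coefficient on X, as in the text

def prob_change(p0, beta=beta):
    """Change in probability when X rises by one unit from baseline p0."""
    return logistic(logit(p0) + beta) - p0

# Illustrative baselines: lower tail, middle, upper tail of the S-curve
changes = {p0: prob_change(p0) for p0 in (0.02, 0.50, 0.98)}
for p0, d in changes.items():
    print(f"baseline {p0:.2f}: probability change {d:+.4f}")
```

The same shift of 0.5 in log-odds moves the probability by roughly 0.12 at the middle of the curve and by about 0.01 or less in either tail.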

This is why raw logit or probit coefficients should never be directly interpreted as probability effects. Instead, economists compute marginal effects — the derivative of the predicted probability with respect to X, evaluated at specific covariate values. Two approaches are standard. Marginal effects at the mean (MEM) evaluate the derivative at the sample mean of each covariate: you plug in the average age, average income, average education level, and compute the probability change for a one-unit shift in X at that hypothetical "average" individual. Average marginal effects (AME) compute the derivative for every observation in the sample using their actual covariate values, then average those individual effects. AME is generally preferred because the "average individual" may not represent anyone in the data — averages of many characteristics may not correspond to any real person.
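
The two definitions can be compared in a short sketch. Everything numeric here is an assumption for illustration: the coefficients b0, b_educ, and b_age stand in for a fitted logit, and the sample is synthetic rather than real data. The logit derivative β·p(1 − p) is used for the marginal effect.

```python
import math
import random

def logistic(z):
    """Logistic CDF: maps the linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fitted logit: index = b0 + b_educ * educ + b_age * age
b0, b_educ, b_age = -4.0, 0.2, 0.03

# Synthetic sample of (education, age) pairs, for illustration only
random.seed(0)
sample = [(random.uniform(8, 20), random.uniform(18, 80)) for _ in range(1000)]

def marginal_effect_educ(educ, age):
    """Logit derivative of Pr(y = 1) with respect to education."""
    p = logistic(b0 + b_educ * educ + b_age * age)
    return b_educ * p * (1.0 - p)

# MEM: evaluate the derivative once, at the sample means of the covariates
mean_educ = sum(e for e, _ in sample) / len(sample)
mean_age = sum(a for _, a in sample) / len(sample)
mem = marginal_effect_educ(mean_educ, mean_age)

# AME: evaluate the derivative for every observation, then average
ame = sum(marginal_effect_educ(e, a) for e, a in sample) / len(sample)

print(f"MEM = {mem:.4f}, AME = {ame:.4f}")
```

Because the derivative is a nonlinear function of the covariates, the derivative at the average and the average of the derivatives generally differ. In practice a statistical package computes these from the fitted model, along with standard errors.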

Consider a concrete example: estimating the effect of years of education on the probability of voting. A logit coefficient of 0.2 on education means the log-odds of voting increase by 0.2 per year of education. In a logit, the marginal effect on the probability is β·p(1 − p), where p is the person's predicted probability. For someone currently at a 20% baseline voting probability, this is 0.2 × 0.2 × 0.8 = 0.032, or about 3 percentage points per year of education. For someone at a 92% baseline probability, far out on the flat upper tail, the same coefficient gives 0.2 × 0.92 × 0.08 ≈ 0.015, or only about 1.5 percentage points. The AME across the full sample might be 2.2 percentage points, and that is the number you would report and discuss.

Discrete change effects extend this logic to dummy variables: for a binary X (e.g., college degree vs. no degree), you compute the change in predicted probability when X switches from 0 to 1, rather than taking a derivative. The same nonlinearity applies, reinforcing that no single number captures the "effect" of a variable: context always determines magnitude.
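
A discrete change for a dummy variable can be sketched the same way. The coefficients below (b0, b_degree, b_age) and the age values are hypothetical, chosen only to show that the 0-to-1 change in predicted probability varies with the other covariates.

```python
import math

def logistic(z):
    """Logistic CDF: maps the linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logit: index = b0 + b_degree * degree + b_age * age
b0, b_degree, b_age = -2.0, 0.8, 0.03

def discrete_change(age):
    """Pr(vote | degree = 1) - Pr(vote | degree = 0), holding age fixed."""
    p1 = logistic(b0 + b_degree * 1 + b_age * age)
    p0 = logistic(b0 + b_degree * 0 + b_age * age)
    return p1 - p0

ages = [20, 40, 60, 80]  # illustrative covariate values
effects = [discrete_change(a) for a in ages]

# Averaging the individual changes gives the dummy-variable analogue of the AME
average_discrete_change = sum(effects) / len(effects)

for a, d in zip(ages, effects):
    print(f"age {a}: discrete change {d:+.4f}")
print(f"average discrete change: {average_discrete_change:.4f}")
```

The coefficient on the dummy is fixed at 0.8 throughout, yet the change in probability differs by age because each person starts at a different point on the S-curve.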

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Independence of Events → Sampling Distributions → Standard Error of Estimators → Hypothesis Testing: Framework and Logic → P-values and Statistical Significance → Effect Size and Practical Significance → Hypothesis Testing: Framework and Logic → Z-Tests and T-Tests for Means → One-Sample Z-Test for Means → One-Sample and Two-Sample T-Tests → Inference in Linear Regression → Prediction Intervals in Regression → Linear Regression Basics → Residuals and Goodness of Fit (R²) → Simple (Bivariate) OLS Regression → Classical OLS Assumptions (Gauss-Markov) → Multiple Regression → Interpreting Regression Coefficients → Polynomial Regression and Nonlinear Functional Forms → Interpretation and Marginal Effects in Nonlinear Models

Longest path: 113 steps · 648 total prerequisite topics

Prerequisites (3)

Logit and Probit Models for Binary Outcomes (hard)
Maximum Likelihood Estimation (hard)
Polynomial Regression and Nonlinear Functional Forms (soft)

Leads To (1)

Marginal Effects and Partial Effects Measurement (hard)