Questions: Interpretation and Marginal Effects in Nonlinear Models
5 questions to test your understanding
Question 1 Multiple Choice
A logit regression on voter turnout yields a coefficient of 0.25 on annual income (in thousands of dollars). What does this coefficient directly represent?
A. A $1,000 increase in income raises the probability of voting by 25 percentage points
B. A $1,000 increase in income raises the log-odds of voting by 0.25
C. A $1,000 increase in income multiplies the odds of voting by 0.25
D. A $1,000 increase in income raises the predicted probability of voting by 0.25 times the baseline probability
Answer: B
Logit coefficients measure the change in log-odds (the natural log of the probability of success divided by the probability of failure) per unit change in X. The logit model is linear in X only on the log-odds scale, which is why the coefficient has a constant interpretation there and nowhere else. Log-odds are not probabilities: translating this coefficient into a probability effect requires computing the marginal effect, which depends on the baseline probability through the logistic function's derivative. Option A (25 percentage points) is the classic mistake of reading logit coefficients as if they were OLS regression coefficients. Option C misstates the odds interpretation: a one-unit increase multiplies the odds by exp(0.25) ≈ 1.28, not by 0.25. Option D is wrong because the probability change is not a fixed multiple of the baseline probability; the marginal effect scales with p(1−p).
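The three scales in this explanation (log-odds, odds, probability) can be compared numerically. The sketch below is illustrative only: it takes the coefficient 0.25 from the question, while the three baseline probabilities are assumed values chosen to show how the probability change varies.

```python
import math

def logistic(z):
    """Map a log-odds value z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Map a probability p to its log-odds."""
    return math.log(p / (1.0 - p))

beta = 0.25  # coefficient from the question: log-odds change per $1,000

# On the odds scale, the effect is multiplicative: exp(beta)
odds_ratio = math.exp(beta)

def prob_change(p0, beta):
    """Change in probability for a one-unit increase in X, starting at p0."""
    return logistic(logit(p0) + beta) - p0

# Assumed baselines, for illustration only
changes = {p0: prob_change(p0, beta) for p0 in (0.05, 0.50, 0.95)}

print(f"odds ratio: {odds_ratio:.3f}")
for p0, d in changes.items():
    print(f"baseline {p0:.2f}: +{100 * d:.2f} percentage points")
```

The log-odds shift is 0.25 in every case, but the change in probability is largest for the baseline at 0.50 and much smaller near 0.05 or 0.95.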
Question 2 Multiple Choice
Two individuals are modeled in a logit regression. Individual A has a baseline predicted probability of 20% and Individual B has a baseline predicted probability of 70%. Both have the same coefficient β on variable X. Which individual has the larger marginal effect of X on their predicted probability?
A. Individual A, because people with lower baseline probabilities have more room to move upward
B. Individual B, because 70% is closer to the 50% midpoint where the logistic curve has its steepest slope
C. Both have identical marginal effects, since the coefficient β is the same for all observations
D. It depends on the direction of the coefficient — positive coefficients favor A, negative favor B
Answer: B
The marginal effect of X on probability in a logit model is β × p(1−p), where p is the predicted probability. At p = 0.20: ME = β × 0.20 × 0.80 = 0.16β. At p = 0.70: ME = β × 0.70 × 0.30 = 0.21β. So Individual B, at 70%, has the larger marginal effect. The logistic function is steepest at p = 0.5 (where ME = 0.25β) and flatter toward both extremes. What matters is distance from the midpoint, and 70% is 20 points from 50% while 20% is 30 points away, so B sits on a steeper part of the curve. Option A reflects the misconception that low-probability individuals always have the most 'room to move.' Option C is the core mistake: same coefficient does NOT mean same probability-scale marginal effect. Option D is wrong because the sign of β changes the direction of the effect, not its magnitude.
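The arithmetic above can be reproduced in a few lines. This is a minimal sketch: β is set to 1.0 as an assumed placeholder so the results read as multiples of β, and the finite-difference helper is only a sanity check on the formula, not part of any library.

```python
import math

def logistic(z):
    """Map a log-odds value z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def logit_marginal_effect(beta, p):
    """Analytic logit marginal effect: dp/dx = beta * p * (1 - p)."""
    return beta * p * (1.0 - p)

def numeric_marginal_effect(beta, p, h=1e-6):
    """Central finite-difference approximation of the same derivative."""
    z = math.log(p / (1.0 - p))  # log-odds corresponding to p
    # A change of h in x moves the log-odds by beta * h
    return (logistic(z + beta * h) - logistic(z - beta * h)) / (2.0 * h)

beta = 1.0  # assumed placeholder, so effects read as multiples of beta

effects = {p: logit_marginal_effect(beta, p) for p in (0.20, 0.50, 0.70)}
for p, me in effects.items():
    print(f"p = {p:.2f}: ME = {me:.2f} x beta")
```

The analytic and finite-difference values should agree closely, which confirms that β × p(1−p) is the slope of the logistic curve at that point.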
Question 3 True / False
Average marginal effects (AME) and marginal effects at the mean (MEM) typically produce the same estimate in logit and probit models.
T. True
F. False
Answer: False
False. AME and MEM differ whenever the marginal effect is a nonlinear function of the covariates, which is always the case in logit and probit. MEM computes the marginal effect for a single hypothetical 'average person' (plugging in the mean value of every covariate). AME computes the marginal effect separately for each individual in the sample using their actual covariate values, then averages those effects. Because the scaling factor (p(1−p) in logit, the standard normal density in probit) is nonlinear, averaging the covariates and then evaluating is not the same as evaluating and then averaging. AME is generally preferred because the 'average person' may not correspond to any real individual, and because AME reflects the actual distribution of baseline probabilities across the sample.
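The difference between "evaluate, then average" and "average, then evaluate" can be seen on simulated data. The sketch below is an illustration under stated assumptions: the intercept, coefficient, and covariate distribution are hypothetical values chosen for the example, and the coefficients are treated as known rather than estimated from data.

```python
import math
import random

def logistic(z):
    """Map a log-odds value z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)

# Hypothetical logit coefficients (assumed, not estimated)
intercept, beta = -1.0, 0.8

# Hypothetical covariate: 10,000 draws from a normal distribution
x = [random.gauss(0.0, 2.0) for _ in range(10_000)]

# AME: compute beta * p_i * (1 - p_i) for each observation, then average
p = [logistic(intercept + beta * xi) for xi in x]
ame = sum(beta * pi * (1.0 - pi) for pi in p) / len(p)

# MEM: average the covariate first, then evaluate the effect once
x_bar = sum(x) / len(x)
p_bar = logistic(intercept + beta * x_bar)
mem = beta * p_bar * (1.0 - p_bar)

print(f"AME = {ame:.4f}, MEM = {mem:.4f}")
```

In this particular setup the AME comes out below the MEM, because many observations sit far out on the flat parts of the curve while the 'average person' sits on a steeper part. The size and direction of the gap depend on the coefficients and the covariate distribution.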
Question 4 True / False
In a probit or logit model, reporting just the coefficient value without computing marginal effects is sufficient for a reader to judge the practical importance of a variable.
T. True
F. False
Answer: False
False. A logit coefficient of 2.0 versus 0.1 tells you the two variables' relative effects on the log-odds scale (per unit of each variable), but neither coefficient tells you whether the variable raises probability by 0.5 percentage points or 15 percentage points. That depends on where on the S-curve the typical observation sits. A large coefficient at an extreme predicted probability (near 0 or 1) may produce tiny probability-scale effects; a small coefficient near p = 0.5 may produce substantial effects. Without marginal effects, the reader cannot assess practical significance. This is analogous to reporting a standardized test score without the raw score: the internal comparison is meaningful, but the external interpretation requires translation.
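A quick calculation shows how a coefficient 20 times larger can still imply the smaller probability-scale effect. The pairings below are hypothetical: the coefficients 2.0 and 0.1 come from the explanation, while the baseline probabilities 0.01 and 0.50 are assumed for illustration.

```python
def logit_marginal_effect(beta, p):
    """Logit marginal effect on probability: beta * p * (1 - p)."""
    return beta * p * (1.0 - p)

# Large coefficient, but observations sit at an extreme baseline (assumed p = 0.01)
me_large_coef = logit_marginal_effect(2.0, 0.01)

# Small coefficient, but observations sit mid-curve (assumed p = 0.50)
me_small_coef = logit_marginal_effect(0.1, 0.50)

print(f"beta = 2.0 at p = 0.01: {100 * me_large_coef:.2f} pp per unit")
print(f"beta = 0.1 at p = 0.50: {100 * me_small_coef:.2f} pp per unit")
```

Under these assumed baselines the variable with the smaller coefficient has the larger marginal effect, which is why the coefficient table alone does not convey practical importance.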
Question 5 Short Answer
Explain why the same logit coefficient β produces different marginal effects on predicted probability for different individuals, and describe how the average marginal effect (AME) accounts for this.
Model answer: In a logit model, the marginal effect on probability is β × p(1−p), where p is the individual's predicted probability. Because p(1−p) varies with p — it is largest at p = 0.5 and approaches zero near p = 0 or p = 1 — the same β maps to different probability-scale effects depending on where each person sits on the logistic S-curve. AME accounts for this by computing β × p_i(1−p_i) for each individual i using their own predicted probability, then averaging these individual-specific marginal effects across the full sample. This gives a single summary effect that respects the actual distribution of baseline probabilities in the data.
The practical implication is that reporting only the AME can still be misleading if the sample is heterogeneous. In a voting example, the average marginal effect of education might be 2 percentage points, but the effect differs sharply across subgroups. With a coefficient of β = 0.2, engaged voters near the margin (p ≈ 0.5) would have a marginal effect of 0.2 × 0.25 = 5 pp, while low-income voters already near certain abstention (p ≈ 0.05) would have a marginal effect of only 0.2 × 0.05 × 0.95 ≈ 1 pp. Disaggregating by subgroup reveals the heterogeneity that the AME compresses.