A researcher estimates the direct effect of exercise (A) on cardiovascular disease (Y) by controlling for BMI (M, the proposed mediator). Exercise also causes inflammation (L), which confounds the BMI→CVD relationship. What is the key problem with simply adding M to the regression?
A. Adding M causes multicollinearity, inflating standard errors for the direct effect estimate
B. Conditioning on M opens a collider path through L, biasing both the direct and indirect effect estimates
C. The direct effect cannot be estimated without additional data on physical fitness levels
D. Adding M removes the indirect effect cleanly, leaving the direct effect correctly estimated
The correct answer is B. When L is affected by A and also confounds M→Y, M is a collider on the path A → M ← L → Y, so conditioning on M in a regression opens that non-causal path and produces collider-stratification bias. Adding L to the regression does not rescue the estimate either, because L lies on the pathway A → L → Y, which is part of the effect of A that does not pass through M. This is exposure-induced mediator-outcome confounding, one of the central scenarios where standard regression fails for mediation analysis. The common wrong intuition (option D) assumes that controlling for M is always safe for estimating the direct effect; this only holds when no variable affected by the exposure also confounds the mediator-outcome relationship.
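A small simulation can make this failure concrete. The sketch below is illustrative only: the linear data-generating model, every coefficient in it, and the helper names `ols` and `simulate` are assumptions chosen for the demonstration, not part of the question. It compares the direct effect implied by the simulated structure (the paths A → Y and A → L → Y, neither of which passes through M) with the coefficient on A from two regressions.

```python
import numpy as np


def ols(y, *regressors):
    """Return OLS coefficients [intercept, b1, b2, ...] via least squares."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate(n=200_000, seed=0):
    """Hypothetical linear model: A -> L, A -> M, L -> M, and A, M, L -> Y."""
    rng = np.random.default_rng(seed)
    A = rng.binomial(1, 0.5, size=n).astype(float)      # exposure
    L = 0.8 * A + rng.normal(size=n)                    # affected by A
    M = 0.5 * A + 0.7 * L + rng.normal(size=n)          # mediator
    Y = 0.3 * A + 0.4 * M + 0.6 * L + rng.normal(size=n)
    return A, L, M, Y


A, L, M, Y = simulate()

# Effect of A on Y not passing through M: A -> Y plus A -> L -> Y.
true_direct = 0.3 + 0.8 * 0.6

naive = ols(Y, A, M)[1]        # conditions on M, ignores L
adjusted = ols(Y, A, M, L)[1]  # also conditions on L, blocking A -> L -> Y

print(f"direct effect implied by the simulation: {true_direct:.2f}")
print(f"coefficient on A from Y ~ A + M:         {naive:.2f}")
print(f"coefficient on A from Y ~ A + M + L:     {adjusted:.2f}")
```

Under these assumed coefficients, neither regression recovers the simulated direct effect of 0.78: the model without L converges to roughly 0.48, and the model with L isolates only the A → Y arrow (0.3), dropping the A → L → Y pathway.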
Question 2 Multiple Choice
A linear mediation model estimates: M = 0.4A + ε₁ and Y = 0.3A + 0.5M + ε₂. What is the indirect effect of A on Y through M, using the product method?
A. 0.3, the direct path coefficient from A to Y
B. 0.5, the path coefficient from M to Y
C. 0.2, the product of the A→M and M→Y path coefficients
D. 0.7, the sum of the direct effect (0.3) and the A→M coefficient (0.4)
Option C is correct. Writing the model as M = α₁A + ε₁ and Y = β₁A + β₂M + ε₂, the product method estimates the indirect effect as α₁ × β₂ = 0.4 × 0.5 = 0.2. The direct effect is β₁ = 0.3, and the total effect is 0.3 + 0.2 = 0.5. Option A is the direct effect alone. Option B (0.5) is the M→Y coefficient, the effect of the mediator on the outcome, not the indirect effect (which must also account for how much A moves M). Option D adds two quantities that don't belong together. In linear models, the product method and the difference method yield the same indirect effect.
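The decomposition can be checked by substituting the mediator equation into the outcome equation. The derivation below is a sketch that assumes both equations are correctly specified linear models with errors independent of A:

```latex
\begin{align*}
Y &= 0.3\,A + 0.5\,M + \varepsilon_2 \\
  &= 0.3\,A + 0.5\,(0.4\,A + \varepsilon_1) + \varepsilon_2 \\
  &= \underbrace{0.3}_{\text{direct}}\,A
     + \underbrace{(0.4 \times 0.5)}_{\text{indirect}\,=\,0.2}\,A
     + 0.5\,\varepsilon_1 + \varepsilon_2 \\
  &= 0.5\,A + 0.5\,\varepsilon_1 + \varepsilon_2
\end{align*}
```

The coefficient on A in the last line is the total effect (0.5), which splits into the direct effect (0.3) and the indirect effect (0.2).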
Question 3 True / False
Controlling for a mediator in a standard regression model typically removes the indirect effect without introducing bias into the direct effect estimate.
T. True
F. False
Answer: False
This is false when exposure-induced mediator-outcome confounding is present, that is, when the exposure causes a variable that also confounds the mediator-outcome relationship. In that case, conditioning on the mediator opens a collider path, introducing bias into both the direct and indirect effect estimates. Standard regression can only recover valid natural direct and indirect effects when all four identification assumptions hold: no unmeasured exposure-outcome confounding, no unmeasured mediator-outcome confounding, no unmeasured exposure-mediator confounding, and no mediator-outcome confounder that is itself affected by the exposure. When the last assumption fails, valid estimation requires weighting methods or the interventional effects framework.
Question 4 True / False
In linear regression models, the product method (α₁ × β₂) and the difference method for estimating indirect effects yield the same answer.
T. True
F. False
Answer: True
In linear models, both methods give the same indirect effect estimate. The product method multiplies the A→M path by the M→Y path. The difference method subtracts the direct effect coefficient (from the model including M) from the total effect coefficient (from the model without M). The algebraic equivalence breaks down in non-linear settings — binary outcomes, survival data — where the two methods diverge and the product method is no longer a valid estimate of the natural indirect effect.
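The equivalence in the linear case can be verified numerically. The sketch below simulates data from the coefficients used in Question 2 (0.4, 0.3, 0.5); the sample size, seed, standard normal errors, and the helper name `ols` are assumptions made for illustration.

```python
import numpy as np


def ols(y, *regressors):
    """Return OLS coefficients [intercept, b1, b2, ...] via least squares."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(1)
n = 100_000
A = rng.normal(size=n)
M = 0.4 * A + rng.normal(size=n)            # mediator model
Y = 0.3 * A + 0.5 * M + rng.normal(size=n)  # outcome model

alpha1 = ols(M, A)[1]           # A -> M path
_, beta1, beta2 = ols(Y, A, M)  # direct effect and M -> Y path
total = ols(Y, A)[1]            # total effect, from the model without M

product = alpha1 * beta2        # product method
difference = total - beta1      # difference method

print(f"product method:    {product:.4f}")
print(f"difference method: {difference:.4f}")
```

With ordinary least squares fitted on the same sample, the two estimates agree up to floating-point error, and both land close to the value of 0.2 built into the simulation. The same check run with a logistic outcome model would generally not show this agreement.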
Question 5 Short Answer
Why does the presence of exposure-induced mediator-outcome confounding invalidate standard regression approaches to mediation analysis, and what does this imply about when mediation analysis is valid?
Model answer: Exposure-induced mediator-outcome confounding occurs when the exposure causes a variable (L) that also confounds the mediator-outcome relationship. Conditioning on the mediator (as required to estimate the direct effect) simultaneously conditions on a collider with respect to L, opening a non-causal path that biases the estimates. This means the standard Baron-Kenny/product method regression approach is invalid anytime the exposure has downstream effects on any M-Y confounder. Valid mediation analysis in this scenario requires weighting methods (marginal structural models) or interventional effects defined without cross-world counterfactuals.
The core issue is that the mediator is not just an intermediate variable: it can be entangled with the confounding structure in ways that make conditioning on it harmful rather than helpful. Recognizing when this applies (any time the exposure causally affects anything that also confounds M→Y) is the practical skill that separates valid from invalid mediation analyses in the applied literature. The general lesson: conditioning on a variable to block one path can simultaneously open another.