4 questions to test your understanding
The fundamental problem of causal inference is that we can never observe both potential outcomes for the same individual. Why does randomization address this problem at the population level even though it cannot do so at the individual level?
A directed acyclic graph (DAG) shows that Socioeconomic Status (SES) causes both Exercise (treatment) and Heart Disease (outcome). A researcher adjusts for SES in a regression. According to the DAG, is this sufficient to identify the causal effect of Exercise on Heart Disease?
True or false: adjusting for a collider (a variable caused by both treatment and outcome) in a regression introduces bias rather than removing it. Explain your answer.
Explain the Stable Unit Treatment Value Assumption (SUTVA) and give a biostatistical example of a setting in which it would be violated.
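
After attempting the questions, the confounding and collider items can be checked empirically with a small simulation. The sketch below is illustrative only: the linear Gaussian models, the coefficient values (for example a true Exercise effect of -0.5), the sample size, and all variable and function names are assumptions made up for this example, not part of the questions. It also assumes SES is the only common cause in the DAG.

```python
import numpy as np


def ols_coefs(y, *regressors):
    """Return OLS slope coefficients (intercept dropped) of y on the regressors."""
    X = np.column_stack([np.ones_like(y), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1:]


rng = np.random.default_rng(0)
n = 100_000

# Scenario 1 (assumed model): SES is a common cause of Exercise and Heart Disease.
TRUE_EFFECT = -0.5  # hypothetical causal effect of exercise on heart disease
ses = rng.normal(size=n)
exercise = 0.8 * ses + rng.normal(size=n)
heart = TRUE_EFFECT * exercise - 0.7 * ses + rng.normal(size=n)

naive_effect = ols_coefs(heart, exercise)[0]          # SES ignored
adjusted_effect = ols_coefs(heart, exercise, ses)[0]  # SES included

# Scenario 2 (assumed model): C is a collider, caused by both treatment T and outcome Y.
t = rng.normal(size=n)
y = 1.0 * t + rng.normal(size=n)  # hypothetical true effect of T on Y is 1.0
c = t + y + rng.normal(size=n)

unadjusted_t = ols_coefs(y, t)[0]            # collider left out
collider_adjusted_t = ols_coefs(y, t, c)[0]  # collider included

print("confounding: naive =", round(naive_effect, 3),
      "adjusted =", round(adjusted_effect, 3))
print("collider: unadjusted =", round(unadjusted_t, 3),
      "adjusted =", round(collider_adjusted_t, 3))
```

Comparing each pair of estimates against the true effect used to generate the data shows which adjustment helps and which one hurts under these assumed models. Real data offer no such ground truth, so the simulation supports the reasoning rather than replacing it.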