Questions: Instrumental Variables in Biostatistics
3 questions to test your understanding
Question 1 Multiple Choice
Mendelian randomization uses genetic variants associated with alcohol consumption to estimate the causal effect of alcohol on cardiovascular disease. Why are genetic variants considered better instruments than, say, alcohol taxation levels?
A. Genetic variants have larger effect sizes on alcohol consumption
B. Genetic variants are allocated randomly at conception (Mendel's second law), making them independent of the socioeconomic, behavioral, and environmental confounders that plague observational studies of alcohol use
C. Genetic variants directly affect cardiovascular disease, providing a more accurate estimate
D. Taxation levels are too unstable over time to be useful as instruments
Answer: B
The strength of Mendelian randomization is that genetic variants come close to satisfying the independence condition by design: alleles are allocated randomly during meiosis and fixed at conception, before any lifestyle confounders develop. They are therefore largely independent of the socioeconomic status, diet, and other behaviors that confound observational studies of alcohol. Taxation could satisfy relevance but may violate independence, because tax rates vary by region and region correlates with health outcomes through many pathways. The key caveat is that the exclusion restriction must still hold: the genetic variant must affect CVD only through alcohol, not through other biological pathways (pleiotropy). Option C gets this backwards, since a direct effect of the variant on cardiovascular disease would invalidate the instrument rather than improve the estimate.
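The contrast between a confounded observational estimate and an IV estimate can be illustrated with a small simulation. The sketch below is illustrative only: the allele frequency, the effect sizes, and the variable names (`g`, `u`, `x`, `y`) are assumptions chosen for the example, not values from any study of alcohol and CVD. It assumes the variant is independent of the confounder and affects the outcome only through the exposure, then compares the ordinary regression slope with the Wald ratio (the variant-outcome slope divided by the variant-exposure slope).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
true_effect = 0.3  # assumed causal effect of the exposure on the outcome

g = rng.binomial(2, 0.3, size=n)       # allele count, drawn independently of u
u = rng.normal(size=n)                 # unmeasured confounder
x = 0.5 * g + u + rng.normal(size=n)   # exposure: driven by variant and confounder
y = true_effect * x + 2.0 * u + rng.normal(size=n)  # outcome: no direct g -> y path


def slope(a, b):
    """OLS slope from regressing b on a (with an intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


ols_estimate = slope(x, y)                  # confounded by u
wald_estimate = slope(g, y) / slope(g, x)   # IV (Wald ratio) estimate

print(f"OLS slope:  {ols_estimate:.2f}")
print(f"Wald ratio: {wald_estimate:.2f}")
```

Under these assumptions the regression slope lands well above 0.3 because `u` drives both the exposure and the outcome, while the Wald ratio stays close to 0.3. Adding a direct term for `g` in the line that builds `y` would break the exclusion restriction and bias the Wald ratio as well.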
Question 2 True / False
An IV analysis of the effect of BMI on diabetes uses a genetic risk score for BMI as the instrument. The exclusion restriction requires that the genetic risk score affects diabetes only through BMI. If some of the genetic variants in the score directly affect insulin sensitivity (pleiotropy), the exclusion restriction is violated.
T. True
F. False
Answer: True
Pleiotropy, in which a genetic variant affects the outcome through pathways other than the exposure of interest, is the primary threat to the exclusion restriction in Mendelian randomization. If genetic variants that raise BMI also affect insulin sensitivity directly (through shared metabolic pathways rather than through BMI), the IV estimate will be biased because the instrument has a direct effect on the outcome. Methods such as MR-Egger regression and weighted median estimation can detect and partially correct for pleiotropy, but each relies on its own assumption: MR-Egger requires that the pleiotropic effects be uncorrelated with the variant-exposure associations (the InSIDE assumption), and the weighted median requires that at least half of the total weight come from valid instruments.
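A toy summary-data example can make the role of the MR-Egger intercept concrete. The sketch below is a simplified illustration, not a full implementation: it simulates hypothetical per-variant associations (`bx` with the exposure, `by` with the outcome), gives every variant the same weight instead of weighting by inverse variance, and builds in directional pleiotropy that is independent of `bx`, so the InSIDE assumption holds by construction. All numbers are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_variants = 200
true_effect = 0.4  # assumed causal effect of exposure on outcome

# Hypothetical per-variant summary statistics.
bx = rng.uniform(0.05, 0.3, size=n_variants)           # variant-exposure associations
pleiotropy = rng.normal(0.05, 0.01, size=n_variants)   # direct effects, independent of bx
by = true_effect * bx + pleiotropy + rng.normal(0, 0.005, size=n_variants)

# Equal-weight analogue of the inverse-variance weighted estimate:
# regress by on bx with the line forced through the origin.
ivw_estimate = np.sum(bx * by) / np.sum(bx ** 2)

# MR-Egger-style fit: the same regression with an intercept,
# which absorbs the average direct (pleiotropic) effect.
design = np.column_stack([np.ones(n_variants), bx])
egger_intercept, egger_slope = np.linalg.lstsq(design, by, rcond=None)[0]

print(f"Through-origin slope: {ivw_estimate:.2f}")
print(f"Egger slope:          {egger_slope:.2f}")
print(f"Egger intercept:      {egger_intercept:.3f}")
```

In this setup the through-origin slope is pushed well above 0.4 because the pleiotropic effects average 0.05 rather than zero, while the fit with an intercept recovers a slope near 0.4 and an intercept near 0.05. If the pleiotropic effects were correlated with `bx`, the Egger slope would be biased too.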
Question 3 Short Answer
IV estimates a Local Average Treatment Effect (LATE) rather than the Average Treatment Effect (ATE). Explain this limitation and who the 'local' population is in a Mendelian randomization study.
Model answer: The LATE is the causal effect for 'compliers', the subpopulation whose treatment is actually changed by the instrument. In Mendelian randomization of alcohol and CVD, the LATE is the causal effect of alcohol for people whose drinking is changed by the genetic variant. People who would drink heavily regardless of their genotype (always-takers) or abstain regardless (never-takers) contribute nothing to the estimate. Interpreting the IV estimate as a LATE also requires monotonicity: the variant must not push anyone's drinking in the opposite direction (no defiers). Because the estimate covers only compliers, it may not generalize to the entire population if the treatment effect differs across subgroups, and it may not represent the effect of a universal policy change.
The LATE interpretation is often overlooked in applied IV studies. If the genetic variant shifts moderate drinkers to drink slightly less, the LATE captures the effect of a small reduction in alcohol for moderate drinkers. It does not capture the effect of heavy drinking versus abstinence, or the effect for the entire population. As a result, the IV estimate can differ dramatically from the observational estimate even after the latter is adjusted for confounding. The result looks paradoxical but arises because the two estimates apply to different subpopulations.
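The gap between the LATE and the ATE can be seen in a stylized simulation with a binary instrument and a binary treatment. This is a deliberate simplification: real genetic instruments and alcohol intake are not binary, and the group shares and effect sizes below are hypothetical values chosen so that compliers differ from everyone else. The names `kinds`, `z`, `d`, and `y` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400_000

# Hypothetical compliance types and their shares of the population.
kinds = rng.choice(["complier", "always", "never"], size=n, p=[0.3, 0.2, 0.5])
z = rng.integers(0, 2, size=n)  # binary instrument, assigned at random

# Treatment received: only compliers follow the instrument (no defiers).
d = np.where(kinds == "always", 1, np.where(kinds == "never", 0, z))

# Assumed treatment effect for each type.
effect = np.select([kinds == "complier", kinds == "always"], [1.0, 3.0], default=0.5)
y = effect * d + rng.normal(size=n)

# Wald estimator: difference in mean outcome over difference in mean treatment.
wald = (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())

ate = effect.mean()                                  # average effect over everyone
complier_effect = effect[kinds == "complier"].mean()  # average effect among compliers

print(f"Wald estimate:   {wald:.2f}")
print(f"Complier effect: {complier_effect:.2f}")
print(f"Population ATE:  {ate:.2f}")
```

With these assumed values the Wald estimate sits near the complier effect of 1.0 rather than the population average of about 1.15, because always-takers and never-takers receive the same treatment in both instrument arms and so cancel out of the numerator.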