Questions: Group Sequential Methods for Clinical Trials
4 questions to test your understanding
Question 1 Multiple Choice
A trial with 4 planned interim analyses uses O'Brien-Fleming boundaries. At the first interim, the boundary requires p < 0.0005 to stop for efficacy. At the final analysis, the boundary requires p < 0.041. Why are the early boundaries so much more stringent?
A. Early data are less reliable and need stricter thresholds
B. O'Brien-Fleming boundaries spend very little alpha early (when estimates are imprecise) and concentrate alpha at the final analysis (when estimates are most precise), reflecting that early stopping should require overwhelming evidence
C. The boundaries are set to ensure exactly 5% of patients are stopped early
D. Stricter early boundaries reduce the sample size
Answer: B
O'Brien-Fleming boundaries are designed to be very conservative early and to leave the final threshold close to the unadjusted alpha. The logic is that interim estimates are based on partial data and have wide confidence intervals, so stopping early on imprecise evidence is risky. By requiring near-certainty (p < 0.0005) for early stopping but relaxing to p ≈ 0.041 at the final analysis, O'Brien-Fleming boundaries preserve most of the trial's power while allowing early stopping only when the evidence is compelling.
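To make the shape of these boundaries concrete, here is a minimal sketch that assumes the classical O'Brien-Fleming form, in which the critical z-value at information fraction t is the final critical value divided by the square root of t. The function name `obf_nominal_p` and the example information fractions are hypothetical. The final threshold of 0.041 is taken from the question rather than derived, since deriving it needs numerical integration that is not shown here.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def obf_nominal_p(info_fraction, final_p=0.041):
    """Two-sided nominal p-value threshold at a given information fraction.

    Assumes the O'Brien-Fleming shape z(t) = z_final / sqrt(t), where
    z_final is the critical value implied by the final threshold final_p.
    """
    if not 0.0 < info_fraction <= 1.0:
        raise ValueError("info_fraction must be in (0, 1]")
    z_final = _STD_NORMAL.inv_cdf(1.0 - final_p / 2.0)
    z_t = z_final / info_fraction ** 0.5  # stricter when t is small
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z_t))


# Hypothetical information fractions, chosen only for illustration.
for t in (0.25, 0.50, 0.75, 1.00):
    print(f"t = {t:.2f}: stop for efficacy if p < {obf_nominal_p(t):.6f}")
```

The threshold tightens sharply as the information fraction shrinks, which is the behaviour described in option B.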
Question 2 True / False
A trial has three planned interim analyses and a final analysis. The Data Safety Monitoring Board (DSMB) decides to add an unplanned interim analysis (a fifth analysis overall) after observing concerning safety signals. The alpha-spending function approach can accommodate this without invalidating the trial.
T. True
F. False
Answer: True
The alpha-spending function (Lan-DeMets approach) is designed for exactly this flexibility. It defines how alpha is 'spent' as a continuous function of the information fraction (the proportion of the total planned information, such as events or enrolled patients, accrued so far), rather than requiring a fixed number of equally spaced analyses. An unplanned interim analysis simply evaluates the spending function at the current information fraction, which gives the cumulative alpha available and hence the appropriate boundary. The guarantee assumes that the timing of the extra look is not driven by the interim efficacy results. This makes the alpha-spending approach more flexible than fixed group sequential boundaries, which require the number and timing of analyses to be specified in advance.
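The idea of evaluating a spending function at an arbitrary information fraction can be sketched as follows. This assumes the O'Brien-Fleming-type spending function of Lan and DeMets for a two-sided test, alpha(t) = 2 - 2Φ(z / sqrt(t)), with z the 1 - alpha/2 normal quantile. The function names, the planned look times, and the unplanned look at 60% information are all hypothetical. Turning the alpha increments into actual boundary values needs numerical integration over the joint distribution of the test statistics, which is not attempted here.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def obf_type_spending(info_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent by information fraction t,
    using the O'Brien-Fleming-type Lan-DeMets spending function."""
    if not 0.0 < info_fraction <= 1.0:
        raise ValueError("info_fraction must be in (0, 1]")
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z / info_fraction ** 0.5))


def alpha_increments(info_fractions, alpha=0.05):
    """Return (t, cumulative alpha, newly available alpha) for each look."""
    spent_so_far = 0.0
    rows = []
    for t in sorted(info_fractions):
        cumulative = obf_type_spending(t, alpha)
        rows.append((t, cumulative, cumulative - spent_so_far))
        spent_so_far = cumulative
    return rows


planned = [0.25, 0.50, 0.75, 1.00]   # hypothetical planned looks
with_unplanned = planned + [0.60]    # hypothetical extra look at 60% information

for t, cumulative, increment in alpha_increments(with_unplanned):
    print(f"t = {t:.2f}: cumulative alpha = {cumulative:.5f}, increment = {increment:.5f}")
```

Adding the extra look changes how the alpha is divided between looks but not the total: the cumulative spend at full information is still the overall alpha.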
Question 3 True / False
Stopping a trial early for efficacy based on group sequential boundaries guarantees that the treatment effect estimate reported from the trial is unbiased.
T. True
F. False
Answer: False
Early stopping for efficacy creates a selection bias in the treatment effect estimate: the trial stops precisely because the interim estimate was large enough to cross the boundary. This means the reported effect is systematically overestimated — the estimate that triggered stopping is, on average, larger than the true effect. This is sometimes called the 'winner's curse' or estimation bias of sequential designs. Bias-adjusted estimators (e.g., median unbiased estimates, confidence interval methods of Jennison and Turnbull) should be reported alongside the boundary-crossing test statistic.
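A small simulation can illustrate this overestimation. The sketch below makes several simplifying assumptions that are not part of the question: a single-arm, two-stage design with outcome SD 1, stage means drawn directly from their normal sampling distribution, and a hypothetical efficacy boundary of z > 2.8 at the interim. The function name and all parameter values are illustrative only.

```python
import random
from statistics import mean


def simulate_two_look_trials(true_effect=0.2, n_per_stage=50, z_boundary=2.8,
                             n_trials=20000, seed=1):
    """Simulate many two-stage trials with one efficacy look at the halfway point.

    Returns the estimates from trials that stopped early and the estimates
    reported by all trials (interim estimate if stopped, pooled otherwise).
    """
    rng = random.Random(seed)
    se_stage = 1.0 / n_per_stage ** 0.5  # SE of a stage mean when SD = 1
    early_estimates, reported = [], []
    for _ in range(n_trials):
        stage1 = rng.gauss(true_effect, se_stage)
        if stage1 / se_stage > z_boundary:
            # Boundary crossed: stop and report the interim estimate.
            early_estimates.append(stage1)
            reported.append(stage1)
        else:
            # Continue: report the pooled estimate from both equal-sized stages.
            stage2 = rng.gauss(true_effect, se_stage)
            reported.append((stage1 + stage2) / 2.0)
    return early_estimates, reported


early, everything = simulate_two_look_trials()
print(f"true effect:                     {0.2:.3f}")
print(f"mean estimate, stopped early:    {mean(early):.3f}")
print(f"mean reported estimate, overall: {mean(everything):.3f}")
print(f"share of trials stopped early:   {len(early) / len(everything):.3f}")
```

Every trial that stops early has, by construction, an estimate above the boundary, so the average among those trials exceeds the true effect. The average over all trials is pulled upward as well, though by a smaller amount.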
Question 4 Short Answer
Explain why a trial that is stopped early for futility (the treatment is unlikely to show benefit even with the full sample) is ethically justified even though it does not produce a definitive conclusion.
Model answer: If conditional power calculations at an interim analysis show that even with the full planned sample, the probability of achieving statistical significance is very low (e.g., <10%), continuing the trial will expose additional patients to a treatment that is very unlikely to be shown effective. The ethical principle of minimizing harm to research participants justifies stopping: continuing enrollment subjects patients to the risks of an experimental treatment without a reasonable prospect of generating useful evidence. Futility stopping also conserves resources that can be directed to more promising research.
Futility boundaries are typically non-binding (advisory) rather than binding (mandatory) because the decision to stop for futility involves clinical judgment beyond the statistical threshold — the treatment may have important secondary endpoints or safety profile data that justify continued enrollment even if the primary endpoint is unlikely to reach significance.
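The conditional power calculation mentioned in the model answer can be sketched as below. This assumes the standard B-value formulation for a one-sided test, with the drift estimated from the current trend unless another value is supplied. The function name, the final critical value of 1.96 (which ignores any group sequential adjustment), and the example inputs are assumptions made for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def conditional_power(z_interim, info_fraction, z_final_crit=1.96, drift=None):
    """Probability that the final z-statistic exceeds z_final_crit,
    given the interim z-statistic at information fraction t.

    If drift is None, the drift is estimated from the current trend,
    i.e. drift = B(t) / t with B(t) = z_interim * sqrt(t).
    """
    if not 0.0 < info_fraction < 1.0:
        raise ValueError("info_fraction must be strictly between 0 and 1")
    b_value = z_interim * info_fraction ** 0.5
    if drift is None:
        drift = b_value / info_fraction  # current-trend assumption
    remaining = 1.0 - info_fraction
    # The remaining increment B(1) - B(t) is normal with mean drift * remaining
    # and variance remaining.
    threshold = (z_final_crit - b_value - drift * remaining) / remaining ** 0.5
    return 1.0 - _STD_NORMAL.cdf(threshold)


# Hypothetical interim results at half the planned information.
weak = conditional_power(z_interim=0.3, info_fraction=0.5)
strong = conditional_power(z_interim=2.5, info_fraction=0.5)
print(f"weak interim trend:   conditional power = {weak:.3f}")
print(f"strong interim trend: conditional power = {strong:.3f}")
```

A value below a prespecified cutoff such as the 10% used in the model answer would prompt the DSMB to consider stopping for futility. Because futility boundaries are usually non-binding, that number informs the decision rather than dictating it.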