A new prediction model for sepsis has an AUC of 0.82, substantially better than the existing clinical score (AUC = 0.74). A decision curve analysis is run. At the threshold range used in clinical practice (5%–15%), the new model's DCA curve lies below the 'treat all' reference line. What should you conclude?
A. The new model should be adopted because its AUC is meaningfully higher
B. The new model provides no clinical benefit over simply treating all patients at this threshold range
C. The AUC comparison is more reliable than DCA for evaluating clinical utility
D. The DCA result is invalid because the threshold range is too narrow
When a model's DCA curve falls below the 'treat all' line at the clinically relevant threshold range, using the model yields less net benefit than simply treating everyone, so it adds no clinical value in that range despite its higher AUC. AUC summarizes discrimination across all thresholds simultaneously, ignoring the relative costs of false positives and false negatives. DCA evaluates whether the model improves on the simplest possible strategies (treat all, treat none) at the threshold a clinician would actually use. High AUC is not sufficient for clinical utility.
Question 2 Multiple Choice
What does the decision threshold (p_t) in decision curve analysis represent?
A. The probability cutoff at which the model's sensitivity equals its specificity
B. The minimum AUC required for the model to be considered valid
C. The disease probability at which a clinician is indifferent between treating and not treating
D. The prevalence of disease in the study population
The decision threshold encodes the clinician's implicit judgment about the relative harm of a false positive versus a false negative. At p_t = 10%, the odds p_t / (1 − p_t) are 1:9, so a clinician is willing to treat up to 9 disease-free patients to avoid missing one case; at that probability the expected harm of unnecessary treatment equals the expected harm of missing disease. This threshold is determined by clinical context (disease severity, treatment side-effects), not by statistical properties of the test. It is what makes DCA clinically grounded rather than purely statistical.
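The exchange rate implied by a threshold follows directly from its odds. Below is a minimal Python sketch of that arithmetic; the function name `false_positives_per_true_positive` is illustrative and does not come from any library.

```python
def false_positives_per_true_positive(p_t: float) -> float:
    """Unnecessary treatments accepted per case caught at threshold p_t.

    At the indifference point, harm / benefit = p_t / (1 - p_t), so one
    true positive is worth (1 - p_t) / p_t false positives.
    """
    if not 0.0 < p_t < 1.0:
        raise ValueError("p_t must be strictly between 0 and 1")
    return (1.0 - p_t) / p_t


# The 5%-15% range used in the sepsis question above.
for p_t in (0.05, 0.10, 0.15):
    ratio = false_positives_per_true_positive(p_t)
    print(f"p_t = {p_t:.2f}: about {ratio:.1f} false positives per true positive")
```

Lower thresholds tolerate more unnecessary treatment per case caught, which is why 'treat all' is hard to beat when the threshold is low.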
Question 3 True / False
A diagnostic model can have a high AUC and still provide no clinical benefit over treating everyone, depending on the decision threshold.
T. True
F. False
Answer: True
True. AUC averages performance across all possible thresholds, weighting them equally. But if a model's net benefit at clinically relevant thresholds falls below the 'treat all' line, it offers no practical advantage despite strong overall discrimination. The 'treat all' strategy — giving the intervention to every patient — performs well at low thresholds because it catches every case; a model only adds value if it reduces unnecessary treatments without missing too many cases. High AUC guarantees good discrimination on average, not clinical utility at the specific threshold that matters.
Question 4 True / False
Decision curve analysis plots sensitivity on the y-axis against 1 − specificity on the x-axis across decision thresholds.
T. True
F. False
Answer: False
False. That is the ROC curve. DCA plots net benefit on the y-axis against decision threshold (probability) on the x-axis. Net benefit = (true positives / N) − (false positives / N) × (p_t / (1 − p_t)), incorporating the clinical cost ratio of false positives relative to false negatives. This distinction is fundamental: ROC curves summarize discrimination without reference to clinical context, while DCA directly addresses whether using the test is better than not using it, given how much harm a false positive costs relative to a false negative.
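The net benefit formula can be checked numerically. The sketch below implements it in Python and applies it to a made-up cohort; the counts (1,000 patients, 200 with disease, a model with 120 true positives and 150 false positives) are hypothetical and chosen only for illustration.

```python
def net_benefit(tp: int, fp: int, n: int, p_t: float) -> float:
    """Net benefit = TP/N - (FP/N) * p_t / (1 - p_t)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_t < 1.0:
        raise ValueError("p_t must be in [0, 1)")
    return tp / n - (fp / n) * (p_t / (1.0 - p_t))


# Hypothetical cohort: 1000 patients, 200 with disease, threshold 10%.
n, p_t = 1000, 0.10

# Hypothetical model: flags 270 patients, of whom 120 truly have disease.
model_nb = net_benefit(tp=120, fp=150, n=n, p_t=p_t)

# 'Treat all': every diseased patient is a TP, every healthy one an FP.
treat_all_nb = net_benefit(tp=200, fp=800, n=n, p_t=p_t)

# 'Treat none': no one is treated, so there are no TPs and no FPs.
treat_none_nb = net_benefit(tp=0, fp=0, n=n, p_t=p_t)

print(f"model:      {model_nb:.4f}")
print(f"treat all:  {treat_all_nb:.4f}")
print(f"treat none: {treat_none_nb:.4f}")
```

With these invented counts the model's net benefit comes out below 'treat all' at a 10% threshold, because it misses 80 of the 200 cases and the threshold penalizes false positives only lightly.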
Question 5 Short Answer
Why does decision curve analysis include 'treat all' and 'treat none' as reference lines, and what happens to the 'treat all' line as the threshold increases?
Model answer: The reference lines represent the simplest possible strategies: intervene for every patient regardless of test result, or intervene for none. A test is only clinically valuable if it outperforms both; if it can't beat treating everyone or treating no one, there is no reason to use it. The 'treat all' line decreases as the threshold rises because a higher threshold implies that false positives are more costly. Treating everyone incurs that cost on every disease-free patient, so net benefit falls, reaches zero when the threshold equals the disease prevalence, and is negative beyond it.
This framing is what separates DCA from purely statistical metrics. A test that looks good on AUC might still be worse than 'treat all' at low thresholds (where the condition is dangerous and treatment is safe) or worse than 'treat none' at high thresholds (where treatment is harmful and the condition is mild). By plotting both reference lines, DCA forces a direct answer to the clinical question: does this model improve on trivial strategies at the threshold that actually governs practice?
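The shape of the 'treat all' line follows from the net benefit formula: when everyone is treated, TP/N equals the prevalence and FP/N equals one minus the prevalence. The Python sketch below tabulates it for an assumed prevalence of 20%; the function name and the prevalence value are illustrative assumptions.

```python
def treat_all_net_benefit(prevalence: float, p_t: float) -> float:
    """Net benefit of treating everyone at threshold p_t.

    With everyone treated, TP/N = prevalence and FP/N = 1 - prevalence.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be in [0, 1]")
    if not 0.0 <= p_t < 1.0:
        raise ValueError("p_t must be in [0, 1)")
    return prevalence - (1.0 - prevalence) * p_t / (1.0 - p_t)


prevalence = 0.20  # assumed for illustration
for p_t in (0.05, 0.10, 0.20, 0.30, 0.50):
    nb = treat_all_net_benefit(prevalence, p_t)
    # 'Treat none' has net benefit 0 at every threshold.
    print(f"p_t = {p_t:.2f}: treat all = {nb:+.4f}, treat none = +0.0000")
```

The line declines steadily, crosses the 'treat none' line where the threshold equals the prevalence, and is negative above it.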