Diagnostic test evaluation quantifies how well a test distinguishes between disease and non-disease. Sensitivity (true positive rate) is the probability that the test is positive given that disease is present; specificity (true negative rate) is the probability that the test is negative given that disease is absent. These are intrinsic properties of the test. Predictive values, namely the positive predictive value (PPV: probability of disease given a positive test) and the negative predictive value (NPV: probability of no disease given a negative test), depend critically on disease prevalence in the tested population. A test with 99% sensitivity and 99% specificity has a PPV of only 50% when prevalence is 1%, because at that prevalence false positives are as numerous as true positives. This prevalence dependence of predictive values, formalized by Bayes' theorem, is among the most counterintuitive and consequential results in clinical biostatistics.
Every diagnostic test makes errors. It will occasionally miss true cases (false negatives) and occasionally flag healthy people as diseased (false positives). The question is how often and, critically, how these error rates translate into clinical consequences for the patient sitting in front of you. Sensitivity and specificity quantify the test's intrinsic performance: sensitivity measures how well the test catches disease when it is present (P(positive | disease)), and specificity measures how well it clears people who do not have the disease (P(negative | no disease)). These are determined by the test's biology, threshold settings, and technical characteristics, and they do not depend on how common the disease is in the population being tested.
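As a minimal sketch of these two definitions, the snippet below computes sensitivity and specificity from the four cells of a 2x2 table. The class name `ConfusionCounts`, the function names, and the counts themselves are illustrative assumptions, not data from a real study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing a test against a reference standard."""
    true_pos: int   # diseased, test positive
    false_neg: int  # diseased, test negative
    true_neg: int   # non-diseased, test negative
    false_pos: int  # non-diseased, test positive


def sensitivity(c: ConfusionCounts) -> float:
    """P(positive | disease): share of diseased people the test catches."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """P(negative | no disease): share of non-diseased people the test clears."""
    return c.true_neg / (c.true_neg + c.false_pos)


# Illustrative counts only (an assumption, not real study data).
counts = ConfusionCounts(true_pos=95, false_neg=5, true_neg=9405, false_pos=495)
print(f"sensitivity = {sensitivity(counts):.3f}")
print(f"specificity = {specificity(counts):.3f}")
```

Note that each quantity is computed within a single disease group (diseased only, or non-diseased only), which is why neither involves prevalence.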
But the patient does not know their disease status — that is why they are being tested. The clinically relevant question is: given that the test came back positive, how likely is it that the patient actually has the disease? This is the positive predictive value (PPV), and it depends not just on the test's sensitivity and specificity but also on the prevalence of the disease in the population being tested. This dependence is counterintuitive and has profound implications for screening policy.
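The dependence on prevalence can be made explicit with Bayes' theorem. In the block below, the symbols are notational assumptions introduced here: Se is sensitivity, Sp is specificity, p is the prevalence (prior probability of disease), D denotes disease, and T+ and T- denote a positive and a negative result.

```latex
\[
\mathrm{PPV} = P(D \mid T^{+})
  = \frac{\mathrm{Se}\, p}{\mathrm{Se}\, p + (1 - \mathrm{Sp})(1 - p)},
\qquad
\mathrm{NPV} = P(\bar{D} \mid T^{-})
  = \frac{\mathrm{Sp}\,(1 - p)}{\mathrm{Sp}\,(1 - p) + (1 - \mathrm{Se})\, p}
\]
```

In the PPV expression the numerator is the rate of true positives and the second term of the denominator is the rate of false positives; as p shrinks, the false-positive term dominates and the PPV falls.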
Consider a population of 10,000 where disease prevalence is 1% (100 have disease, 9,900 do not). A test with 95% sensitivity detects 95 of the 100 true cases. With 95% specificity, the same test correctly clears 9,405 of the 9,900 non-cases but produces 495 false positives. Total positives: 95 + 495 = 590. PPV = 95/590 ≈ 16%. Even with a test that sounds excellent (95%/95%), more than 5 out of 6 positive results are wrong in a low-prevalence population. Drop prevalence to 0.1% and the PPV plummets further, to about 1.9%.
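The arithmetic above can be reproduced with a short sketch. The function name `predictive_values` is a hypothetical helper introduced for illustration; it works with proportions rather than head counts, which gives the same ratios.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence               # diseased, test positive
    false_neg = (1 - sensitivity) * prevalence        # diseased, test negative
    true_neg = specificity * (1 - prevalence)         # healthy, test negative
    false_pos = (1 - specificity) * (1 - prevalence)  # healthy, test positive
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same 95%/95% test at the two prevalences discussed in the text.
for prevalence in (0.01, 0.001):
    ppv, npv = predictive_values(0.95, 0.95, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV = {ppv:.1%}, NPV = {npv:.3%}")
```

The NPV moves in the opposite direction: as prevalence falls, a negative result becomes even more trustworthy while a positive result becomes less so.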
This is why clinical diagnosis often uses a sequential testing strategy: start with sensitive tests to rule out disease cheaply (SnNout: Sensitivity, Negative result, rules Out), then confirm positives with specific tests to rule in disease (SpPin: Specificity, Positive result, rules In). The first step maximizes NPV (few false negatives among negative results); the second step maximizes PPV among the enriched population that screened positive. Understanding this Bayesian logic, in which the meaning of a test result depends on the prior probability of disease, is essential for every clinical decision involving diagnostic testing.
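The two-step logic can be sketched as repeated Bayesian updating, where the post-test probability from the screening test becomes the prior for the confirmatory test. The helper `post_test_probability` and all test characteristics below are hypothetical, and the sketch assumes the two tests err independently given true disease status, which real tests may not.

```python
def post_test_probability(prior: float, sensitivity: float,
                          specificity: float, positive: bool) -> float:
    """Update P(disease) after one test result using Bayes' theorem."""
    if positive:
        p_result_if_diseased = sensitivity
        p_result_if_healthy = 1 - specificity
    else:
        p_result_if_diseased = 1 - sensitivity
        p_result_if_healthy = specificity
    numerator = p_result_if_diseased * prior
    return numerator / (numerator + p_result_if_healthy * (1 - prior))


# Hypothetical values chosen only for illustration.
prior = 0.01                 # prevalence before any testing
screen = (0.99, 0.90)        # sensitive screening test (sensitivity, specificity)
confirm = (0.90, 0.99)       # specific confirmatory test (sensitivity, specificity)

# SnNout: a negative result on the sensitive test makes disease very unlikely.
after_negative_screen = post_test_probability(prior, *screen, positive=False)

# A positive screen only enriches the population; it does not confirm disease.
after_positive_screen = post_test_probability(prior, *screen, positive=True)

# SpPin: a positive result on the specific test, applied to the enriched group.
after_positive_confirm = post_test_probability(
    after_positive_screen, *confirm, positive=True
)

print(f"after negative screen:       {after_negative_screen:.4%}")
print(f"after positive screen:       {after_positive_screen:.1%}")
print(f"after positive confirmation: {after_positive_confirm:.1%}")
```

Under these assumed numbers, the confirmatory test is applied at a much higher prior than the original 1%, which is what lets its positive result carry real weight.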