A disease affects 1 in 10,000 people. A test for it is 99% sensitive (correctly detects the disease) and 99% specific (correctly rules it out). You test positive. Roughly what is the probability you actually have the disease?
A. About 99%, because the test is 99% accurate
B. About 50%, because a positive result is equally likely to be true or false
C. About 1%, because false positives vastly outnumber true positives given the disease's rarity
D. About 0.01%, because only 1 in 10,000 people have the disease
Answer: C
In 10,000 people, about 1 has the disease and is almost certainly detected (true positive). Of the 9,999 healthy people, 1% (about 100) test positive (false positives). That gives about 101 positives, only 1 of which is true, so the probability is ≈ 1/101 ≈ 1%. The test is highly accurate, yet its positive predictive value is very low because the disease is so rare. The intuitive answer (option A) reflects base rate neglect: ignoring how rare the condition is.
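The counting argument above can be checked with a few lines of Python. This is a minimal sketch; the function name `positive_predictive_value` and its parameter names are illustrative choices, not part of the quiz.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Return P(disease | positive test) using Bayes' theorem."""
    # Fraction of the whole population that is sick AND tests positive.
    true_pos = sensitivity * prevalence
    # Fraction of the whole population that is healthy AND tests positive.
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


ppv = positive_predictive_value(
    prevalence=1 / 10_000, sensitivity=0.99, specificity=0.99
)
print(f"P(disease | positive) = {ppv:.4f}")
```

The exact value is 1/102, a little under 1%, which agrees with the rounded head count of 1 true positive among roughly 101 positives.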
Question 2 Multiple Choice
A prosecutor argues: 'The probability of a random person having the same DNA profile as the crime scene sample is 1 in 1,000,000. Therefore the defendant is almost certainly guilty.' What is wrong with this argument?
A. Nothing: a 1-in-a-million probability of an innocent match is overwhelming evidence of guilt
B. It confuses P(DNA match | innocent) with P(innocent | DNA match), ignoring the base rate of how many people could match
C. DNA evidence is never reliable enough to use in court
D. The argument is valid only if the defendant had no alibi
Answer: B
This is the prosecutor's fallacy. P(DNA match | innocent) = 1/1,000,000 is the probability of finding this evidence if the person is innocent, not the probability of innocence given the evidence. To compute P(innocent | DNA match), you need Bayes' theorem, which requires a base rate: how many people could plausibly have left the sample. In a city of 1 million people, roughly 1 person besides the perpetrator is expected to match by chance, so on the DNA evidence alone the probability of guilt is closer to 1 in 2. That is far lower than the 'overwhelming' 999,999/1,000,000 the argument implies.
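A rough sketch of the Bayes calculation for the city example follows. It rests on two simplifying assumptions that are not stated in the question: before the DNA evidence, each of the 1,000,000 residents is equally likely to be the source of the sample, and the true source always matches. The function name `posterior_guilt` is hypothetical.

```python
def posterior_guilt(population, match_prob_if_innocent):
    """Return P(guilty | DNA match) under a uniform prior over the population."""
    prior_guilty = 1.0 / population
    prior_innocent = 1.0 - prior_guilty
    # Assumption: the true source of the sample matches with probability 1.
    p_match = prior_guilty * 1.0 + prior_innocent * match_prob_if_innocent
    return prior_guilty / p_match


p = posterior_guilt(population=1_000_000, match_prob_if_innocent=1e-6)
print(f"P(guilty | match) = {p:.3f}")
```

Under these assumptions the result is about 0.5. Other evidence (motive, opportunity, location) would change the prior and therefore the posterior; the match probability alone does not settle the question.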
Question 3 True / False
A test that is 95% accurate will correctly diagnose 95% of people who test positive.
T. True
F. False
Answer: False
'95% accurate' typically means 95% sensitivity and/or specificity: the probability that the test correctly identifies those who have, or do not have, the condition. The probability that a positive result is correct (the positive predictive value) also depends on the base rate of the condition. When the condition is rare (say, 1 in 10,000), a test with 95% sensitivity and 95% specificity generates far more false positives than true positives, and the chance that a positive result is correct is about 0.2%, far below 95%.
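To make the rare-condition case concrete, here is a small head count in Python. It assumes a hypothetical cohort of 1,000,000 people and reads "95% accurate" as 95% sensitivity and 95% specificity; both are illustrative assumptions.

```python
population = 1_000_000            # hypothetical cohort size
prevalence = 1 / 10_000           # 1 in 10,000 has the condition
sensitivity = specificity = 0.95  # assumed reading of "95% accurate"

sick = population * prevalence                 # about 100 people
healthy = population - sick                    # about 999,900 people
true_positives = sensitivity * sick            # sick people flagged correctly
false_positives = (1 - specificity) * healthy  # healthy people flagged wrongly

ppv = true_positives / (true_positives + false_positives)
print(f"true positives:  {true_positives:.0f}")
print(f"false positives: {false_positives:.0f}")
print(f"P(condition | positive) = {ppv:.2%}")
```

About 95 true positives are swamped by roughly 50,000 false positives, so a positive result is correct only about 0.19% of the time.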
Question 4 True / False
The base rate of a condition in the relevant population affects how much weight you should give to a positive test result for that condition.
T. True
F. False
Answer: True
This is the core lesson of base rate reasoning. Bayes' theorem formalizes it: the posterior probability of a hypothesis depends on both the prior probability (the base rate) and the likelihood ratio of the evidence. A low base rate can dramatically reduce the posterior probability even when evidence is strong. Ignoring the base rate — treating only the test's accuracy as relevant — is the defining error of base rate neglect.
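The relationship between prior, likelihood ratio, and posterior is easiest to see in the odds form of Bayes' theorem. The symbols H (hypothesis) and E (evidence) are notation introduced here for illustration.

```latex
\[
\underbrace{\frac{P(H \mid E)}{P(\neg H \mid E)}}_{\text{posterior odds}}
=
\underbrace{\frac{P(H)}{P(\neg H)}}_{\text{prior odds}}
\times
\underbrace{\frac{P(E \mid H)}{P(E \mid \neg H)}}_{\text{likelihood ratio}}
\]

% Worked with the numbers from Question 1:
\[
\frac{1}{9999} \times \frac{0.99}{0.01}
= \frac{99}{9999}
= \frac{1}{101}
\quad\Longrightarrow\quad
P(H \mid E) = \frac{1}{1 + 101} = \frac{1}{102} \approx 0.98\%
\]
```

A likelihood ratio of 99 is strong evidence, but multiplying it by prior odds of 1 to 9,999 still leaves posterior odds of only 1 to 101.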
Question 5 Short Answer
Explain why a highly accurate test can still produce mostly false positives, and what factor is responsible.
Model answer: Test accuracy (sensitivity and specificity) describes how often the test is correct given whether someone actually has the condition. But when a condition is rare in the population, the vast majority of people tested are healthy. Even a small false positive rate (say 1%) applied to a large pool of healthy people generates many false positives, while the true positive rate (say 99%) applied to the tiny group with the condition generates few true positives. For a given test, the ratio of false to true positives is driven by the base rate: the rarer the condition, the more false positives outnumber true positives.
The responsible factor is the prior probability, or base rate: how common the condition is before the test is applied. Sensitivity and specificity are conditional on a person's true status, which is exactly what the test is trying to find out. When the prior is low, Bayes' theorem shows that the posterior probability of disease can remain low even after a positive test.
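The dependence on the prior can be illustrated by holding the test fixed and varying only the prevalence. This is a sketch under assumed values: the 99% sensitivity and specificity and the list of prevalences are chosen for illustration, and the helper name `ppv` is hypothetical.

```python
def ppv(prevalence, sensitivity=0.99, specificity=0.99):
    """Return P(disease | positive test) for a fixed test and a given prevalence."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp)


# Same test every time; only the base rate changes.
prevalences = [0.5, 0.1, 0.01, 0.001, 0.0001]
results = {p: ppv(p) for p in prevalences}

for p, value in results.items():
    print(f"prevalence {p:>7.2%} -> P(disease | positive) = {value:.1%}")
```

With these numbers the posterior falls from 99% at a prevalence of 50% to exactly 50% at a prevalence of 1%, and to under 1% at 1 in 10,000, even though the test itself never changes.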