A doctor knows that P(positive test | disease) = 0.95. She wants to compute P(disease | positive test). What additional information does she need?
A. Only the total number of patients tested
B. P(disease), the prior probability of having the disease, and P(positive test), the overall probability of testing positive
C. Nothing: P(disease | positive test) equals P(positive test | disease) by symmetry
D. The sample size, because Bayes' theorem only applies to large datasets
Bayes' theorem states P(B|A) = P(A|B)·P(B)/P(A). To find P(disease|positive), you need P(positive|disease) (the sensitivity, already known), P(disease) (the base rate — how common is the disease?), and P(positive) (overall probability of a positive result, computed via the law of total probability). Without the base rate, the calculation is impossible — and forgetting it is the source of the base rate fallacy.
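As a minimal sketch of the calculation described above, the following Python function takes the three required quantities and returns the posterior. The function name `posterior` and the values chosen for P(disease) and P(positive) are illustrative assumptions; only the sensitivity of 0.95 comes from the question.

```python
def posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Bayes' theorem: P(B|A) = P(A|B) * P(B) / P(A)."""
    for name, p in (("likelihood", likelihood), ("prior", prior), ("evidence", evidence)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be a probability in [0, 1]")
    if evidence == 0.0:
        raise ValueError("evidence P(A) must be positive")
    joint = likelihood * prior  # P(A and B)
    if joint > evidence:
        # P(A and B) can never exceed P(A); the inputs contradict each other.
        raise ValueError("inconsistent inputs: P(A and B) exceeds P(A)")
    return joint / evidence


sensitivity = 0.95   # P(positive | disease), given in the question
p_disease = 0.01     # assumed base rate, for illustration only
p_positive = 0.059   # assumed overall positive rate, for illustration only

p_disease_given_positive = posterior(sensitivity, p_disease, p_positive)
print(f"P(disease | positive) = {p_disease_given_positive:.3f}")
```

With these assumed inputs the posterior comes out near 0.16, well below the sensitivity of 0.95. Changing `p_disease` or `p_positive` changes the result, which is why the sensitivity alone is not enough.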
Question 2 True / False
A disease test has a 95% true positive rate: P(positive test | disease) = 0.95. Therefore, if a person tests positive, there is a 95% chance they have the disease.
T. True
F. False
Answer: False
This is the classic base rate fallacy: confusing P(positive | disease) with P(disease | positive). These are not equal. How far apart they are depends on the base rate and on the test's false positive rate, neither of which is given here. If the disease is rare (e.g., affects 0.1% of the population) and the test has even a modest false positive rate (say 5%), the people who test positive will include many more false positives than true positives, despite the 95% sensitivity. The actual posterior probability P(disease | positive) can be far lower than 95% when the prior P(disease) is small.
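The effect is easier to see with counts. The sketch below uses the 0.1% prevalence and 95% sensitivity from the explanation; the population size of 100,000 and the 5% false positive rate are assumptions added for illustration, since the question does not state them.

```python
population = 100_000          # assumed cohort size
prevalence = 0.001            # 0.1% of people have the disease (from the text)
sensitivity = 0.95            # P(positive | disease) (from the text)
false_positive_rate = 0.05    # P(positive | no disease), assumed

sick = population * prevalence
healthy = population - sick

true_positives = sick * sensitivity               # sick people who test positive
false_positives = healthy * false_positive_rate   # healthy people who test positive

# Share of positive results that belong to people who are actually sick.
p_disease_given_positive = true_positives / (true_positives + false_positives)

print(f"true positives:  {true_positives:.0f}")
print(f"false positives: {false_positives:.0f}")
print(f"P(disease | positive) = {p_disease_given_positive:.4f}")
```

Under these assumptions, roughly 95 true positives are outnumbered by roughly 4,995 false positives, so fewer than 2% of the people who test positive actually have the disease.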
Question 3 Short Answer
In Bayes' theorem P(B|A) = P(A|B)·P(B) / P(A), what role does the denominator P(A) play?
Model answer: P(A) is the total probability of observing evidence A — it accounts for all ways A could occur, whether or not B is true. It acts as a normalizing constant, ensuring the posterior P(B|A) is a valid probability between 0 and 1.
Without dividing by P(A), the numerator P(A|B)·P(B) gives the joint probability P(A∩B), not the conditional probability P(B|A). Dividing by P(A) rescales this to be relative to the event A having occurred. In practice P(A) is computed using the law of total probability: P(A) = P(A|B)·P(B) + P(A|B^c)·P(B^c).
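The normalizing role of P(A) can be checked numerically. In this sketch the function name `normalize` and the input values are hypothetical; the point is that the two joint probabilities sum to P(A), and dividing each by P(A) yields posteriors that sum to 1.

```python
def normalize(p_a_given_b: float, p_b: float, p_a_given_not_b: float):
    """Return (P(B|A), P(B^c|A), P(A)) using the law of total probability."""
    joint_b = p_a_given_b * p_b                    # P(A and B)
    joint_not_b = p_a_given_not_b * (1.0 - p_b)    # P(A and B^c)
    p_a = joint_b + joint_not_b                    # P(A), the denominator
    if p_a == 0.0:
        raise ValueError("P(A) is zero, so conditioning on A is undefined")
    return joint_b / p_a, joint_not_b / p_a, p_a


# Hypothetical inputs chosen only to illustrate the arithmetic.
post_b, post_not_b, p_a = normalize(p_a_given_b=0.9, p_b=0.2, p_a_given_not_b=0.1)

print(f"P(A)     = {p_a:.2f}")
print(f"P(B|A)   = {post_b:.4f}")
print(f"P(B^c|A) = {post_not_b:.4f}")
print(f"sum      = {post_b + post_not_b:.4f}")
```

Here the unnormalized numerators are 0.18 and 0.08, which sum to P(A) = 0.26 rather than 1. Only after dividing by P(A) do the two values form a valid probability distribution over B and its complement.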