A researcher gets p = 0.03 and concludes: 'There is only a 3% chance that the null hypothesis is true.' What is wrong with this interpretation?
A. Nothing; a p-value of 0.03 is defined as the probability that H₀ is true
B. The p-value of 0.03 means there is a 97% chance the alternative hypothesis is true
C. The p-value is P(data this extreme | H₀ true), not P(H₀ is true | this data)
D. The threshold should be 0.01 for any valid conclusion about H₀
This is the most common p-value misconception. The p-value conditions on H₀ being true and asks how extreme the data would be: it is P(data at least this extreme | H₀). To get P(H₀ | data), you would need Bayes' theorem and a prior probability for H₀, which frequentist hypothesis testing deliberately avoids. A p-value of 0.03 means: if H₀ were true, there would be a 3% chance of seeing data this extreme or more so. It says nothing directly about the probability that H₀ is true.
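The gap between P(data | H₀) and P(H₀ | data) can be made concrete with a small Bayes' theorem calculation. This is a minimal sketch under stated assumptions: the prior probabilities for H₀ and the test's power (0.8) are hypothetical numbers chosen for illustration, and the calculation conditions on the event "the test rejected at α = 0.03" rather than on an exact p-value.

```python
def prob_null_given_significant(prior_h0, alpha, power):
    """P(H0 | test rejects), by Bayes' theorem.

    prior_h0: assumed prior probability that H0 is true (hypothetical)
    alpha:    P(reject | H0 true), the significance threshold
    power:    P(reject | H0 false), assumed for illustration
    """
    # Total probability of rejecting, over both states of the world.
    p_reject = alpha * prior_h0 + power * (1.0 - prior_h0)
    return alpha * prior_h0 / p_reject


for prior_h0 in (0.5, 0.9):
    posterior = prob_null_given_significant(prior_h0, alpha=0.03, power=0.8)
    print(f"prior P(H0) = {prior_h0:.1f} -> P(H0 | significant) = {posterior:.3f}")
```

The posterior moves with the prior and the power, neither of which appears in the p-value itself, which is why the p-value alone cannot be read as P(H₀ | data).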
Question 2 Multiple Choice
A study with n = 1,000,000 participants finds a statistically significant result (p < 0.001) showing that a new drug reduces blood pressure by an average of 0.1 mmHg. What is the most accurate conclusion?
A. The drug has a large, clinically meaningful effect
B. The study is definitive proof of the drug's effectiveness
C. The result is statistically significant but the effect may be too small to be clinically relevant
D. With p < 0.001, the null hypothesis must be false
With a very large sample, even a tiny effect will produce a very small p-value — statistical significance is partly a function of sample size. A 0.1 mmHg reduction in blood pressure is almost certainly clinically meaningless (normal variation in a single reading can be 10–20 mmHg). Statistical significance tells you the effect is distinguishable from zero; it says nothing about whether the effect is large enough to matter. Effect size measures (not p-values) determine practical significance.
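The arithmetic behind this scenario can be sketched as follows. The question does not give the study design or the spread of the data, so this example assumes a two-arm trial with 500,000 participants per arm, a standard deviation of 10 mmHg in each arm, and a z-test with known variance; all of those are illustrative assumptions.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided tail probability P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


n_total = 1_000_000
n_per_arm = n_total // 2   # assumption: equal split into two arms
mean_diff = 0.1            # mmHg, from the question
sd = 10.0                  # mmHg, assumed for illustration

# Standard error of the difference between two independent means.
se = sd * math.sqrt(2.0 / n_per_arm)
z = mean_diff / se
p_value = two_sided_p_from_z(z)

# Cohen's d: the difference expressed in standard-deviation units.
cohens_d = mean_diff / sd

print(f"z = {z:.2f}, p = {p_value:.2e}, Cohen's d = {cohens_d:.3f}")
```

Under these assumptions the p-value is far below 0.001 while Cohen's d is only 0.01, a very small standardized effect.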
Question 3 True / False
A p-value of 0.03 means that, if the null hypothesis were true, data at least as extreme as those observed would occur only 3% of the time.
T. True
F. False
Answer: True
This is the correct definition of a p-value. It is the tail probability of observing data at least as extreme as what was observed, computed under the assumption that H₀ is true. A p-value of 0.03 tells you the data sit in the outer 3% of the null distribution — they are relatively unlikely under H₀, which is why small p-values prompt rejection of H₀.
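A quick simulation can illustrate the "3% of the time" reading. This is a sketch, assuming a one-sample z-test on normally distributed data with known standard deviation 1; the sample size, number of simulated studies, and seed are arbitrary choices.

```python
import math
import random


def two_sided_p_from_z(z):
    """Two-sided tail probability P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_null_p_values(n_sims=10_000, n=20, seed=0):
    """Simulate studies in which H0 (true mean = 0) really holds."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_sims):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z statistic for the sample mean, with known sigma = 1.
        z = (sum(sample) / n) * math.sqrt(n)
        p_values.append(two_sided_p_from_z(z))
    return p_values


p_values = simulate_null_p_values()
frac = sum(p <= 0.03 for p in p_values) / len(p_values)
print(f"Fraction of null studies with p <= 0.03: {frac:.3f}")
```

When H₀ is true, the fraction of simulated studies with p ≤ 0.03 should come out close to 0.03, up to simulation noise.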
Question 4 True / False
A p-value of 0.40 is evidence that the null hypothesis is true.
T. True
F. False
Answer: False
Absence of evidence is not evidence of absence. A large p-value means the data are not sufficiently extreme to reject H₀ at your chosen threshold — that is all. It does not mean H₀ is true, and it does not mean the effect is zero. A study with too small a sample may fail to achieve significance even when a real effect exists (this is low statistical power). The correct interpretation of p = 0.40 is: 'we cannot reject H₀' — not 'H₀ is confirmed.'
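The low-power point can be illustrated with a simulation in which a real effect exists but most small studies still fail to reject H₀. This is a sketch under assumed values: a true mean of 0.3 standard deviations, n = 10 per study, a one-sample z-test with known standard deviation 1, and α = 0.05.

```python
import math
import random


def two_sided_p_from_z(z):
    """Two-sided tail probability P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def fraction_not_significant(true_mean=0.3, n=10, alpha=0.05,
                             n_sims=5_000, seed=1):
    """Share of simulated studies with p > alpha although H0 is false."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(n_sims):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        # Test H0: mean = 0, with known sigma = 1.
        z = (sum(sample) / n) * math.sqrt(n)
        if two_sided_p_from_z(z) > alpha:
            misses += 1
    return misses / n_sims


miss_rate = fraction_not_significant()
print(f"Real effect present, yet {miss_rate:.0%} of studies had p > 0.05")
```

Under these assumptions the large majority of studies return a non-significant p-value even though the effect is real, so a large p-value from a small study cannot confirm H₀.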
Question 5 Short Answer
Explain why a very small p-value does not necessarily imply that a research finding is practically important.
Model answer: P-values depend on both effect size and sample size. With a large enough sample, even a trivially small effect will produce an arbitrarily small p-value, because large samples reduce sampling variability and make the test very sensitive to any departure from H₀. Statistical significance just means the data are inconsistent with H₀ — it does not indicate the size or real-world relevance of the effect. Practical importance requires effect size measures (Cohen's d, r², etc.) that quantify how large the effect is, not how detectable it is.
The p-value answers 'is this effect detectable?' not 'is this effect large?' These are different questions. A drug that reduces blood pressure by 0.1 mmHg in a study of one million patients can produce p < 0.001, yet the drug is useless clinically. A therapy that reduces depression scores by 15 points might fail to reach significance in a study of 10 patients, yet the effect could be enormous. Always pair p-values with effect sizes.
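To see how the p-value tracks sample size while the effect size stays put, the sketch below holds a standardized effect fixed and varies only n. It assumes a two-sample z-test with known, equal standard deviations, and the value d = 0.01 and the sample sizes are hypothetical.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided tail probability P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def p_value_for(d, n_per_arm):
    """p-value for a two-sample z-test when the observed effect equals d.

    With equal known SDs, z = d * sqrt(n_per_arm / 2).
    """
    z = d * math.sqrt(n_per_arm / 2.0)
    return two_sided_p_from_z(z)


d = 0.01  # fixed, very small standardized effect (hypothetical)
sample_sizes = [1_000, 100_000, 500_000, 5_000_000]
p_values = [p_value_for(d, n) for n in sample_sizes]

for n, p in zip(sample_sizes, p_values):
    print(f"n per arm = {n:>9,}  d = {d}  p = {p:.3g}")
```

The effect size is identical in every row; only the sample size changes, and the p-value shrinks with it.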