A researcher finds p = .03 for the effect of a memory training program. Which interpretation is correct?
A. There is a 3% probability that the null hypothesis is true
B. There is a 97% probability that the training program works
C. If the null hypothesis were true, the probability of observing results this extreme or more extreme is 3%
D. The training program explains 3% of the variance in memory scores
Answer: C
The p-value is defined as the probability of observing data at least as extreme as the obtained results, *assuming the null hypothesis is true*. It does not tell you the probability that the null is true, the probability the result is a fluke, or the size of the effect. Options A and B commit the 'inverse probability fallacy': confusing P(data | null) with P(null | data). Option D describes r², a measure of effect size, not a p-value.
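To make the definition concrete, here is a minimal sketch of an exact permutation test in Python. It is not the analysis the researcher in the question ran; the scores and the function name `permutation_p_value` are hypothetical, chosen only for illustration. If the null hypothesis were true, the group labels would be interchangeable, so the p-value is the share of all possible relabelings that give a mean difference at least as extreme as the one observed.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    total = sum(pooled)
    observed = abs(mean(treated) - mean(control))

    extreme = 0
    relabelings = 0
    # Under the null, every way of assigning the "treated" label is equally likely.
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        # Small tolerance so floating-point ties still count as "at least as extreme".
        if diff >= observed - 1e-12:
            extreme += 1
        relabelings += 1
    return extreme / relabelings


# Hypothetical memory scores, invented for this example.
trained = [14, 17, 15, 19, 16]
untrained = [12, 13, 15, 11, 14]
p_value = permutation_p_value(trained, untrained)
print(f"P(difference this extreme or more | null is true) = {p_value:.3f}")
```

The printed number is P(data | null), as in option C. Nothing in the calculation estimates P(null | data), which is why options A and B cannot be read off a p-value.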
Question 2 True / False
A statistically significant result (p < .05) means the effect is large enough to matter in practical or clinical terms.
T. True
F. False
Answer: False
Statistical significance depends on three things: effect size, sample size, and variability. With a large enough sample, even a trivially small effect, one too small to be practically meaningful, can produce p < .05. For example, a study of 50,000 participants might find a statistically significant difference of 0.2 points on a 100-point scale. 'Significant' in statistics means 'unlikely to arise if the null hypothesis were true,' not 'important.' Effect size measures (Cohen's d, r², η²) are needed to evaluate practical importance.
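The 50,000-participant example can be checked with a short calculation. The sketch below makes assumptions the explanation does not state: two equal groups of 25,000, a known common standard deviation of 10 points, and a simple two-sample z-test. The function name `two_sample_z` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_diff, sd, n_per_group):
    """Return (z, two-sided p, Cohen's d) for two equal groups with a known common SD."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    d = mean_diff / sd  # standardized effect size
    return z, p, d


# Same 0.2-point difference, very different sample sizes (SD of 10 is an assumption).
for n in (50, 25_000):
    z, p, d = two_sample_z(mean_diff=0.2, sd=10, n_per_group=n)
    print(f"n per group = {n:>6}: z = {z:.2f}, p = {p:.4f}, Cohen's d = {d:.2f}")
```

Under these assumptions the large study crosses p < .05 while the small one does not, yet Cohen's d is 0.02 in both cases. The p-value changed because the sample size changed, not because the effect became more important.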
Question 3 Short Answer
What is the difference between a Type I error and a Type II error in hypothesis testing, and which one does the .05 significance threshold directly control?
Think about your answer, then reveal below.
Model answer: A Type I error is rejecting the null hypothesis when it is actually true (a false positive). A Type II error is failing to reject the null when it is actually false (a false negative). The .05 significance threshold (α) directly controls the Type I error rate: by setting α = .05, researchers accept a 5% chance of rejecting the null hypothesis in cases where it is true. The Type II error rate (β) is controlled separately through study design, primarily by ensuring adequate statistical power (1 − β). Power depends on sample size, the true effect size, the variability of the measure, and α itself; of these, sample size is usually the one researchers can adjust.
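Both error rates can be estimated by simulation. The following is a rough sketch under stated assumptions, not a prescribed method: normally distributed scores with a known standard deviation of 1, 30 participants per group, a two-sided z-test, and a hypothetical true effect of 0.5 standard deviations for the case where the null is false. The function name `rejection_rate` is illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejection_rate(true_diff, n_per_group, alpha=0.05, n_sims=4000, seed=0):
    """Fraction of simulated two-group studies in which a z-test rejects the null."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sqrt(2 / n_per_group)  # SD is assumed known and equal to 1
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_diff, 1.0) for _ in range(n_per_group)]
        z = (sum(treated) / n_per_group - sum(control) / n_per_group) / standard_error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_sims


# Null true: every rejection is a false positive (Type I error).
type_i_rate = rejection_rate(true_diff=0.0, n_per_group=30)
# Null false: every non-rejection is a false negative (Type II error).
power = rejection_rate(true_diff=0.5, n_per_group=30)
type_ii_rate = 1 - power

print(f"Estimated Type I error rate: {type_i_rate:.3f}")
print(f"Estimated power: {power:.3f}, Type II error rate: {type_ii_rate:.3f}")
```

The estimated Type I error rate should land close to α whatever the sample size, because α is what fixes it. The Type II error rate, by contrast, moves when `n_per_group` or `true_diff` changes.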
This distinction matters because researchers face a trade-off: lowering α to reduce false positives (e.g., α = .01) simultaneously increases the Type II error rate unless sample size is increased. The choice of α = .05 is a convention, not a law of nature — in some fields (particle physics, clinical trials with serious consequences) much stricter thresholds are used. Understanding that the .05 threshold controls only one type of error helps researchers design studies with appropriate power rather than just aiming for p < .05.
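The trade-off can also be worked out directly. This sketch uses the normal approximation for a two-sided, two-sample z-test and assumes a standardized effect of d = 0.5; the function name `z_test_power` and the sample sizes are chosen for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size_d, n_per_group, alpha):
    """Approximate power of a two-sided, two-sample z-test for standardized effect d."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic when the true effect is d.
    shift = effect_size_d * sqrt(n_per_group / 2)
    # Probability the statistic lands in the upper or the lower rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


for n in (64, 96):
    for alpha in (0.05, 0.01):
        power = z_test_power(effect_size_d=0.5, n_per_group=n, alpha=alpha)
        print(f"n per group = {n}, alpha = {alpha}: power = {power:.2f}, beta = {1 - power:.2f}")
```

Under these assumptions, tightening α from .05 to .01 at 64 participants per group drops power from roughly 0.81 to roughly 0.60, and raising the sample to 96 per group brings it back to about 0.81.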