Type I and Type II Errors and Power

Core Idea

Type I error rate: α = P(reject H₀ | H₀ true). Type II error rate: β = P(fail to reject H₀ | H₁ true). Power = 1 − β = P(reject H₀ | H₁ true). Larger samples and larger effect sizes increase power. α and β trade off: for a fixed n and effect size, reducing α increases β.

Explainer

From the hypothesis testing framework you already know, a test works by rejecting H₀ when a test statistic falls into a rejection region. The rejection region is chosen before seeing data. But nature presents two possible realities — H₀ is true, or H₁ is true — and no matter how careful you are, there are two distinct ways a test can be wrong. A Type I error is a false positive: you reject a null hypothesis that was actually true. A Type II error is a false negative: you fail to reject a null hypothesis that was actually false. Both errors are real risks, and the framework forces you to confront the tradeoff between them explicitly.

Think of it like a medical diagnostic test. A Type I error is diagnosing a healthy patient with a disease (false alarm). A Type II error is missing a disease that's really there (missed detection). The significance level α is the probability you're willing to tolerate for the false alarm; the quantity β is the probability of the missed detection. The power of a test, 1 − β, is the probability that the test correctly detects a real effect. High-power tests are sensitive; low-power tests often miss what they're looking for.
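The two error rates can also be estimated by simulation. The sketch below is illustrative only: it assumes a one-sided z-test of H₀: μ = 0 with known σ = 1, a sample size of 25, and a true mean of 0.5 under H₁. None of these values come from the text above, and the function name `rejection_rate` is hypothetical.

```python
import random
from statistics import NormalDist


def rejection_rate(true_mean, n=25, sigma=1.0, alpha=0.05, trials=10_000, seed=0):
    """Fraction of simulated one-sided z-tests of H0: mu = 0 that reject H0.

    All defaults are assumed values chosen for illustration.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # rejection region: z > z_crit
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (sigma / n ** 0.5)  # test statistic under H0: mu = 0
        if z > z_crit:
            rejections += 1
    return rejections / trials


# H0 is true (mu = 0): every rejection is a false alarm, so this estimates alpha.
type_i_rate = rejection_rate(true_mean=0.0)

# H1 is true (mu = 0.5): every rejection is a correct detection, so this estimates power.
power = rejection_rate(true_mean=0.5)
type_ii_rate = 1 - power  # missed detections, an estimate of beta

print(f"Type I rate ~ {type_i_rate:.3f}, power ~ {power:.3f}, Type II rate ~ {type_ii_rate:.3f}")
```

Under these assumptions the estimated Type I rate should land close to the chosen α of 0.05, while the estimated power depends on the assumed true mean and sample size.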

The tradeoff becomes concrete when you think geometrically. For a fixed distribution of the test statistic under H₀, making the rejection region smaller (a stricter α) pushes the critical value farther into the tail. That unavoidably leaves more of the H₁ distribution inside the non-rejection region, which raises β and lowers power. For a given test and effect size, no adjustment of the critical value shrinks both error rates at once. The way to get both a small α and a small β (high power) is to collect more data, because larger samples make the sampling distributions narrower and easier to separate.
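A short calculation makes the geometry visible. This is a minimal sketch that assumes a one-sided z-test with known variance, where the effect size is expressed in standard deviation units; the function name `beta_one_sided_z` and the numbers used are hypothetical choices, not taken from the text.

```python
from statistics import NormalDist


def beta_one_sided_z(alpha, effect_size, n):
    """Type II error rate of a one-sided z-test (assumed setting, known variance).

    effect_size is (true mean - null mean) / sigma.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)  # critical value set by alpha under H0
    # Under H1 the test statistic is centered at effect_size * sqrt(n),
    # so beta is the H1 probability mass left below the critical value.
    return z.cdf(z_crit - effect_size * n ** 0.5)


# Fixed n: a stricter alpha moves the critical value outward and raises beta.
for alpha in (0.10, 0.05, 0.01):
    beta = beta_one_sided_z(alpha, effect_size=0.5, n=25)
    print(f"n=25   alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")

# Fixed alpha: a larger n narrows the sampling distributions and lowers beta.
for n in (25, 50, 100):
    beta = beta_one_sided_z(0.01, effect_size=0.5, n=n)
    print(f"alpha=0.01  n={n:<3}  beta={beta:.3f}  power={1 - beta:.3f}")
```

The first loop holds n fixed and tightens α; the second holds α fixed and grows n, which is the only one of the two changes that improves power without giving up Type I control.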

Effect size, meaning how far the true parameter is from the null value, also drives power. A large true difference between H₀ and H₁ is inherently easier to detect; even a modest sample gives good power. A small effect size requires a large sample to distinguish from noise. In practice, a power analysis is done before collecting data: given a desired α, a target power (commonly 0.80 or 0.90), and an estimated effect size, it calculates the minimum sample size required. This is why understanding the relationship among α, β, power, and n matters beyond exam formulas: it determines how large an experiment must be to have a realistic chance of detecting the effect it is looking for.
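As a rough sketch of what a power analysis computes, the function below uses the standard normal-approximation formula n = ((z₁₋α + z_power) / d)², rounded up. It assumes a one-sided z-test with known variance and an effect size d in standard deviation units; the name `required_sample_size` is hypothetical, and real studies often use t-based or two-sided variants that give somewhat different answers.

```python
import math
from statistics import NormalDist


def required_sample_size(effect_size, alpha=0.05, power=0.80):
    """Minimum n for a one-sided z-test (assumed setting, known variance).

    effect_size is the standardized difference (true mean - null mean) / sigma.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)  # critical value for the chosen alpha
    z_power = z.inv_cdf(power)      # quantile matching the target power
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)


print(required_sample_size(0.5))              # 25
print(required_sample_size(0.2))              # 155
print(required_sample_size(0.5, power=0.90))  # 35
```

Shrinking the assumed effect size from 0.5 to 0.2 multiplies the required sample by roughly six, which is why the effect size estimate is usually the most consequential input.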
