A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Multiple Comparisons and Type I Error Rate Control

College · Depth 109 in the knowledge graph

48 topics build on this

548 prerequisites beneath it


Core Idea

The multiple comparisons problem occurs when researchers conduct numerous statistical tests within a single study, which inflates the family-wise Type I error rate (the probability of at least one false positive) beyond the nominal alpha level. Each statistical test carries its own probability of a Type I error, so conducting many tests increases the probability that at least one will be statistically significant by chance alone. Corrections including Bonferroni, Holm, false discovery rate (FDR), and permutation testing adjust p-values or alpha levels to maintain overall Type I error control. The appropriate severity of correction depends on whether the tests are planned (confirmatory) or exploratory.

How It's Best Learned

Simulate running multiple independent statistical tests where the null hypothesis is true and observe how often at least one reaches statistical significance.
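The simulation described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed exercise: it assumes the tests are independent and that each p-value is uniformly distributed on [0, 1] when the null is true (which holds for continuous test statistics), so it draws p-values directly instead of generating data and running a test. The function name `simulate_fwer` and its parameters are hypothetical.

```python
import random

def simulate_fwer(n_tests, alpha=0.05, n_studies=20_000, seed=0):
    """Estimate how often at least one of n_tests true-null tests is 'significant'."""
    rng = random.Random(seed)
    studies_with_false_positive = 0
    for _ in range(n_studies):
        # Assumption: under a true null, each p-value is Uniform(0, 1),
        # and the tests within a study are independent.
        if any(rng.random() < alpha for _ in range(n_tests)):
            studies_with_false_positive += 1
    return studies_with_false_positive / n_studies

for k in (1, 10, 20, 50):
    print(f"{k:>2} tests per study: at least one false positive in "
          f"{simulate_fwer(k):.1%} of simulated studies")
```

Varying `n_tests` while keeping `alpha` fixed shows the family-wise rate climbing even though every individual test still has a 5% false positive rate.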

Common Misconceptions

Bonferroni correction is always appropriate (actually, it can be overly conservative when tests are correlated). Multiple comparisons corrections only apply to many p-values from the same dataset (actually, any multiple tests of related hypotheses require correction).

Explainer

From inferential statistics, you know that a Type I error — rejecting a true null hypothesis — has probability α, conventionally set at .05. This means that if the null hypothesis is genuinely true, you'll obtain a "significant" result 5% of the time purely by chance. From your work on Type I and Type II error tradeoffs, you understand that setting α defines your tolerance for false positives in a single test. The multiple comparisons problem is what happens when you apply that single-test logic across an entire family of tests — and the conditional probability calculation that drives it follows directly from the probability foundations you already have.

Suppose you run 20 independent significance tests in a single study, each at α = .05, and all null hypotheses are actually true. What is the probability that at least one test reaches significance? Use the complement rule together with the multiplication rule for independent events: the probability that no test is significant is (1 − .05)²⁰, so the probability that at least one is significant is 1 − (1 − .05)²⁰ = 1 − .95²⁰ ≈ .64. With 20 independent tests of truly null effects, you'd observe at least one "significant" result about 64% of the time, in a universe of pure noise. This inflated rate is the family-wise error rate (FWER): the probability of at least one false positive across the family of tests. It grows rapidly: 10 tests yield a FWER of roughly 40%; 50 tests yield over 92%.

Bonferroni correction is the simplest and most conservative of the common solutions: divide the nominal α by the number of tests and require each individual test to reach that stricter threshold. For 20 tests, each test must clear p < .0025. This guarantees FWER ≤ .05 across the family, but at a cost: demanding much smaller p-values for each test increases the probability of Type II errors, so real effects may be missed because they don't survive the heightened bar. Bonferroni does not assume that the tests are independent; its guarantee holds under any dependence structure. When tests are positively correlated (as they often are within a study, since they draw on the same participants), it becomes overly conservative: the actual FWER falls well below .05 because the tests are not providing independent chances at a false positive.
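The conservativeness under positive correlation can be checked with a small simulation. The sketch below is illustrative only and makes several assumptions that are not in the text: each test is a two-sided z-test, all nulls are true, and every pair of test statistics shares the same correlation `rho`, induced by a common random component. The function names are hypothetical.

```python
import math
import random

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def bonferroni_fwer(k=20, rho=0.0, alpha=0.05, n_studies=10_000, seed=1):
    """Simulated FWER when each of k true-null z-tests is tested at alpha / k."""
    rng = random.Random(seed)
    threshold = alpha / k  # Bonferroni per-test threshold, .0025 for k = 20
    studies_with_false_positive = 0
    for _ in range(n_studies):
        shared = rng.gauss(0, 1)  # component common to all tests in this study
        for _ in range(k):
            # Each z has variance 1; any two share correlation rho (assumed).
            z = math.sqrt(rho) * shared + math.sqrt(1 - rho) * rng.gauss(0, 1)
            if two_sided_p(z) < threshold:
                studies_with_false_positive += 1
                break
    return studies_with_false_positive / n_studies

print("independent tests :", bonferroni_fwer(rho=0.0))
print("correlated tests  :", bonferroni_fwer(rho=0.8))
```

With independent tests the simulated FWER should sit just under the nominal .05; with strongly correlated tests it should drop noticeably lower, which is the lost power the paragraph describes.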

The Holm procedure improves on Bonferroni by applying corrections sequentially. Rank your p-values from smallest to largest; compare the smallest to α/k, the second-smallest to α/(k−1), and so on, stopping at the first test that fails to reach its threshold. Every test before that point is declared significant; the test that failed and all tests with larger p-values are not. Holm controls FWER at the same level as Bonferroni but is never less powerful, because every threshold after the first is more lenient than α/k, so you recover some statistical power without sacrificing error control.

For exploratory work where you are willing to tolerate a small proportion of false discoveries in exchange for more power to detect true ones, the false discovery rate (FDR) approach shifts the target: instead of controlling the probability of any false positive, it controls the expected proportion of significant findings that are false. The Benjamini-Hochberg procedure implements this and is standard in neuroimaging and genomics, where thousands of simultaneous tests make FWER control nearly impossible without destroying power entirely.
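Both procedures are short enough to write out by hand. The sketch below is a plain-Python illustration, not a replacement for a vetted statistics library. The function names and the five example p-values are hypothetical, chosen so that the three corrections give different answers.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject each test whose p-value is at or below alpha / k."""
    k = len(p_values)
    return [p <= alpha / k for p in p_values]

def holm(p_values, alpha=0.05):
    """Holm step-down: thresholds alpha/k, alpha/(k-1), ...; stop at first failure."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    reject = [False] * k
    for rank, i in enumerate(order):  # rank 0 is the smallest p-value
        if p_values[i] <= alpha / (k - rank):
            reject[i] = True
        else:
            break  # this test and every larger p-value are retained
    return reject

def benjamini_hochberg(p_values, alpha=0.05):
    """BH step-up: reject the r smallest p-values, r = largest rank with p <= r/k * alpha."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / k * alpha:
            cutoff = rank
    reject = [False] * k
    for i in order[:cutoff]:
        reject[i] = True
    return reject

# Hypothetical p-values from a family of five tests.
p = [0.001, 0.012, 0.021, 0.035, 0.300]
print("Bonferroni:", sum(bonferroni(p)), "rejections")
print("Holm      :", sum(holm(p)), "rejections")
print("BH (FDR)  :", sum(benjamini_hochberg(p)), "rejections")
```

On these example values Bonferroni rejects one hypothesis, Holm two, and Benjamini-Hochberg four, which mirrors the ordering of strictness described above. Each function returns its decisions in the original order of the input, so the p-values do not need to be pre-sorted.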

The underlying principle is that the right correction depends on your inferential goals and the structure of your tests. Pre-registered, theoretically motivated tests of specific hypotheses warrant less severe correction than post-hoc mining of a dataset for any significant association. When a researcher runs 50 correlations, finds 3 that survive α = .05, and reports only those 3, no correction applied to those 3 p-values can fix the problem — the issue is selective reporting, which makes the reported results uninterpretable regardless of what correction is applied. Multiple comparisons control is a statistical procedure that assumes honest reporting of the full family; it cannot substitute for transparency about how many tests were actually conducted.


Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Conditional Distributions → Bivariate Normal Distribution → Normal Distribution → Standard Normal Distribution and Z-Scores → Hypothesis Testing Fundamentals → Experimental Research Design → Control and Experimental Groups → Random Assignment → Confounding Variables and Internal Validity → Blinding and Demand Characteristics → Validity in Psychological Measurement → Inferential Statistics in Psychology → Effect Size and Statistical Power → Effect Size Reporting and Practical Interpretation → Type I and Type II Error Trade-offs in Decision Making → Multiple Comparisons Problem and Correction Methods → Multiple Comparisons and Type I Error Rate Control

Longest path: 110 steps · 548 total prerequisite topics

Prerequisites (6)

Inferential Statistics in Psychology (hard) · Conditional Probability (hard) · Multiple Comparisons Problem and Correction Methods (hard) · Type I and Type II Error Trade-offs in Decision Making (hard) · Effect Size and Statistical Power (soft) · Statistical Conclusion Validity and Assumptions of Statistical Tests (soft)

Leads To (1)

Exploratory and Confirmatory Analysis Strategies and Their Distinct Roles (soft)