Statistical Power, Effect Size, and Sample Size Planning

College · Depth 91 in the knowledge graph
Unlocks 32 downstream topics
statistical-power effect-size sample-size design-planning

Core Idea

Statistical power is the probability of detecting a true effect. It increases with sample size, effect size magnitude, and alpha level. Effect size quantifies the magnitude of an effect independent of sample size. A-priori power analysis plans sample size to achieve adequate power (typically 0.80). Underpowered studies risk Type II error (missing true effects); overpowered studies waste resources.

How It's Best Learned

Use power analysis software (G*Power) to compute required sample sizes for typical effect sizes and power levels. Review published papers reporting effect sizes and power. Discuss why small-sample studies are common in psychology and their implications.

Explainer

You've already encountered the idea that statistical significance depends on both the size of an effect and the precision of your estimate. Statistical power and effect size formalize this relationship and turn it into a design tool. Power is the probability that your study will detect a true effect when one exists; in other words, it is 1 − β, the probability of *not* making a Type II error (false negative). Power depends on three things: the size of the effect you're trying to detect, the sample size you collect, and the significance threshold (alpha) you set. The sample size and alpha are design choices under your control; the true effect size is not, although you do choose the smallest effect you want your study to be able to detect.
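
To make the three ingredients concrete, here is a minimal sketch that approximates power for a two-sided comparison of two equal-sized groups. It assumes a normal (z) approximation rather than the exact t-based calculation a tool like G*Power performs, and the function name `approx_power` is a hypothetical helper introduced only for illustration.

```python
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test.

    Assumption: normal approximation with equal group sizes and equal
    variances. d is the standardized mean difference (Cohen's d).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)        # rejection cutoff for |z|
    shift = d * (n_per_group / 2) ** 0.5     # expected z statistic if d is the true effect
    # Probability that the test statistic lands in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


if __name__ == "__main__":
    for d, n, alpha in [(0.5, 64, 0.05), (0.5, 128, 0.05),
                        (0.2, 64, 0.05), (0.5, 64, 0.01)]:
        print(f"d={d}, n per group={n}, alpha={alpha}: "
              f"power ~ {approx_power(d, n, alpha):.2f}")
```

Changing one argument at a time shows the pattern described above: power rises with a larger effect, a larger sample, or a more lenient alpha, and falls when any of them shrinks.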

Effect size is the metric that links statistical results to scientific meaning. It quantifies the magnitude of a difference or relationship in standardized units, so it does not depend on the measurement scale or the sample size. Common effect size metrics include Cohen's d (for mean differences: a d of 0.5 means the group means are half a standard deviation apart), r (the correlation coefficient, which is itself an effect size), and η² (the proportion of variance explained in ANOVA). Cohen's benchmarks for d (small = 0.2, medium = 0.5, large = 0.8) are rough calibrations, not laws. What counts as a meaningful effect depends on the domain: a d of 0.2 might be clinically important for an intervention against a serious disease but trivial for an attitude measure. Effect size connects your result to the world outside the p-value, which is why many journals and reporting standards now expect it alongside significance tests.
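
As an illustration of what "half a standard deviation apart" means in practice, the sketch below computes Cohen's d using the pooled standard deviation of two groups. The function name `cohens_d` and the scores are made up for this example; they are not data from any study.

```python
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each group's sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5


if __name__ == "__main__":
    # Hypothetical scores, invented purely for illustration.
    treatment = [5.0, 6.0, 7.0, 8.0, 9.0]
    control = [4.0, 5.0, 6.0, 7.0, 8.0]
    print(f"d = {cohens_d(treatment, control):.2f}")

    # Rescaling both groups (for example, changing units) leaves d unchanged.
    rescaled_t = [10 * x for x in treatment]
    rescaled_c = [10 * x for x in control]
    print(f"d after rescaling = {cohens_d(rescaled_t, rescaled_c):.2f}")
```

The second call shows why d is described as standardized: multiplying every score by 10 changes the raw mean difference but not the effect size.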

A-priori power analysis is the practice of calculating the required sample size *before* collecting data, given your target power (typically .80), your chosen alpha (typically .05), and your expected effect size. The mechanics work like this: power increases as sample size increases, because larger samples reduce sampling error, making it easier to distinguish real effects from noise. If you expect a small effect (d = 0.2), you need a much larger sample to reliably detect it than if you expect a large effect (d = 0.8). Underpowered studies, conventionally those with power below .80, do more than miss true effects: they also produce unstable effect size estimates, because estimates from small samples vary widely. A study with 30% power that happens to find p < .05 has probably overestimated the effect, because only unusually large sample estimates cross the significance threshold, and such a result often fails to replicate.
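
The sample size calculation can be sketched with the standard normal-approximation formula for two equal groups, n per group ≈ 2 · ((z for 1 − α/2) + (z for the target power))² / d². This is an approximation, not the exact t-based routine used by G*Power, which typically returns a slightly larger n. The function name `n_per_group` is a hypothetical helper.

```python
import math
from statistics import NormalDist


def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate per-group sample size for a two-sided, two-sample test.

    Assumption: normal approximation, equal group sizes, equal variances.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # cutoff for the significance test
    z_power = z.inv_cdf(power)           # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {n_per_group(d)} participants per group")
```

Under this approximation the three benchmark effects need roughly 393, 63, and 25 participants per group. Because required n scales with 1/d², halving the expected effect roughly quadruples the sample you need.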

The replication crisis in psychology was partly caused by the widespread use of underpowered studies with flexible stopping rules, such as collecting data until p < .05 emerged. Understanding power helps you see exactly why this is problematic: if you stop the first time you cross the significance threshold, you've created an implicit multiple-comparison problem (the more often you look, the higher the false positive rate), and you've exploited sampling variability rather than estimated a true effect. The remedy is to commit to a sample size before you start, justify it with a power analysis, and pre-register your hypotheses. Power analysis is not a bureaucratic requirement; it is the tool that connects the precision of your measurement to the scientific claims you're entitled to make.
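
The cost of flexible stopping can be checked with a small simulation. The sketch below assumes the null hypothesis is true (no effect at all), uses a simple one-sample z-test with known standard deviation 1, and compares testing once at the planned sample size with peeking after every 10 observations. All names and settings (`simulate_peeking`, 2,000 simulated studies, looks every 10 observations) are illustrative choices, not a description of any particular study.

```python
import random
from statistics import NormalDist, mean


def is_significant(data, n, z_crit):
    """Two-sided z-test on the first n observations (known sd = 1, null mean = 0)."""
    z_stat = mean(data[:n]) * n ** 0.5
    return abs(z_stat) > z_crit


def simulate_peeking(n_sims=2000, max_n=100, look_every=10, alpha=0.05, seed=1):
    """Return (false positive rate with one test, rate when stopping at first p < alpha)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_sims):
        # The null is true: every observation is pure noise around zero.
        data = [rng.gauss(0.0, 1.0) for _ in range(max_n)]
        # Fixed design: a single test at the planned sample size.
        if is_significant(data, max_n, z_crit):
            fixed_hits += 1
        # Flexible design: declare success if any interim look is significant.
        looks = range(look_every, max_n + 1, look_every)
        if any(is_significant(data, n, z_crit) for n in looks):
            peeking_hits += 1
    return fixed_hits / n_sims, peeking_hits / n_sims


if __name__ == "__main__":
    fixed_rate, peeking_rate = simulate_peeking()
    print(f"False positive rate, single test at n=100: {fixed_rate:.3f}")
    print(f"False positive rate, stop at first p < .05: {peeking_rate:.3f}")
```

With a single planned test the false positive rate should stay near the nominal 5%, while the peeking strategy should come out noticeably higher, since every extra look is another chance for noise to cross the threshold.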
