A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Chi-Square Test

College · Depth 100 in the knowledge graph

1,097 topics build on this

511 prerequisites beneath it

Core Idea

The chi-square test assesses whether observed frequencies in categories differ significantly from the frequencies expected under a null hypothesis. A goodness-of-fit test compares observed category frequencies to theoretical (expected) frequencies. A test of independence asks whether two categorical variables in a contingency table are independent. The test statistic is χ² = Σ(Observed − Expected)²/Expected, which approximately follows a chi-square distribution when the null hypothesis is true and expected frequencies are sufficiently large (a common rule of thumb is ≥ 5 in every category).

How It's Best Learned

Set up null hypotheses for goodness-of-fit scenarios (coin fairness, six-sided die). Create contingency tables and test independence. Verify that expected frequencies meet assumptions.

Common Misconceptions

Using chi-square with expected frequencies < 5. Confusing goodness-of-fit with independence tests. Forgetting that a small p-value is evidence against the null hypothesis, not confirmation of any particular alternative. Thinking the chi-square test indicates the direction of an association (it does not).

Explainer

From hypothesis testing, you know the general structure: state H₀, compute a test statistic designed to be large when H₀ is wrong, compare to a null distribution, and reject if the result is unlikely under H₀. The chi-square test applies this structure to categorical data — outcomes that fall into labeled buckets rather than on a numerical scale. The test statistic χ² = Σ(O − E)²/E accumulates evidence by comparing observed counts O to expected counts E in each category. Each term (O−E)²/E is zero when observations match expectations perfectly and grows as the discrepancy increases. The total χ² measures the overall gap between what you saw and what H₀ predicts.
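As a minimal sketch of how the statistic accumulates, the snippet below computes each (O − E)²/E term and then their sum. The function name `chi_square_terms` and the coin-flip counts (60 heads and 40 tails in 100 flips) are made up for illustration and do not come from the text above.

```python
def chi_square_terms(observed, expected):
    """Return the per-category contributions (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Hypothetical data: 100 coin flips, H0 says the coin is fair (50 / 50).
observed = [60, 40]
expected = [50, 50]

terms = chi_square_terms(observed, expected)
chi2 = sum(terms)

print(terms)  # [2.0, 2.0] -> each term is 10^2 / 50
print(chi2)   # 4.0

# A perfect match between observed and expected contributes nothing.
perfect = chi_square_terms([50, 50], [50, 50])
print(sum(perfect))  # 0.0
```

Keeping the per-category terms separate from the total makes it easy to see later which categories contribute most to χ².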

The goodness-of-fit test asks whether your data came from a specific distribution. Example: you roll a six-sided die 120 times. Under H₀ (fair die), you expect E = 20 for each face. If your observed counts are 15, 22, 18, 25, 17, 23, compute χ² = (15−20)²/20 + (22−20)²/20 + ... for all six faces. The degrees of freedom are k−1 = 5 (you lose one degree of freedom because the counts must sum to 120). Compare χ² to a chi-square distribution with 5 degrees of freedom. A large value is evidence that the die is unfair; a small value means the data are consistent with fairness.
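The die example can be worked through in a few lines. This is a sketch using only the counts given in the text. The 5% significance level and the matching critical value of about 11.07 for 5 degrees of freedom (a standard chi-square table value) are assumptions added here, since the text does not fix a significance level.

```python
observed = [15, 22, 18, 25, 17, 23]   # counts for faces 1..6 from the example
n_rolls = sum(observed)               # 120
k = len(observed)                     # 6 categories
expected = [n_rolls / k] * k          # 20 per face under H0 (fair die)

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1

# Assumption: alpha = 0.05; 11.07 is the tabulated upper 5% point of chi-square(5).
CRITICAL_5_PERCENT_DF5 = 11.07
reject_h0 = chi2 > CRITICAL_5_PERCENT_DF5

print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {reject_h0}")
# chi2 = 3.80, df = 5, reject H0: False
```

With χ² = 3.8 well under the critical value, these counts give no evidence against a fair die at that level.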

The test of independence asks whether two categorical variables are related. Suppose you survey 200 people and record gender (M/F) and preference (Product A/B/C). You arrange data in a 2×3 contingency table. Under H₀ (independence), the expected count in each cell is (row total × column total)/grand total — the count you would expect if gender and preference had nothing to do with each other. Compute χ² summing (O−E)²/E over all 6 cells, with degrees of freedom (r−1)(c−1) = (2−1)(3−1) = 2. The same test statistic, different null hypothesis and degrees of freedom.
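The expected-count rule can be sketched as follows. The cell counts in `table` are hypothetical (chosen only so that they sum to 200 as in the survey description), and the closed-form p-value used here is valid only for 2 degrees of freedom; for other table sizes a statistics library or a table lookup would be needed.

```python
import math

# Hypothetical 2x3 table: rows = gender (M, F), columns = preference (A, B, C).
table = [
    [30, 40, 30],
    [20, 40, 40],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count under independence: (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)

# For df = 2 only, the chi-square upper-tail probability is exactly exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)

print(expected)  # [[25.0, 40.0, 35.0], [25.0, 40.0, 35.0]]
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
# chi2 = 3.429, df = 2, p = 0.180
```

For these made-up counts the p-value is well above 0.05, so the data would not contradict independence of gender and preference.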

One critical assumption underlies both tests: as a rule of thumb, expected counts in every cell should be at least 5. When expected counts are small, the chi-square approximation to the null distribution breaks down, p-values become unreliable, and you need alternatives such as Fisher's exact test. Also note that chi-square tests are always one-tailed (you only reject for large χ²) and do not indicate the direction of association: a significant result says that a difference exists, not which categories drive it or in which direction. For that, examine the individual (O−E)²/E terms and the signs of O−E after rejecting H₀.
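Both checks can be sketched in code. The helper names `small_expected_cells` and `pearson_residuals` are hypothetical, and the signed residual (O − E)/√E (often called a Pearson residual) is a standard companion to the (O − E)²/E terms that the text does not name; it is added here as an assumption about how one might read off direction.

```python
import math

MIN_EXPECTED = 5  # rule-of-thumb threshold for expected counts


def small_expected_cells(expected, threshold=MIN_EXPECTED):
    """Return indices of categories whose expected count is below the threshold."""
    return [i for i, e in enumerate(expected) if e < threshold]


def pearson_residuals(observed, expected):
    """Signed residuals (O - E) / sqrt(E); squaring each gives (O - E)^2 / E."""
    return [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]


# Die counts from the goodness-of-fit example: every expected count is 20.
observed = [15, 22, 18, 25, 17, 23]
expected = [20.0] * 6
print(small_expected_cells(expected))  # [] -> no category falls below 5

residuals = pearson_residuals(observed, expected)
for face, r in enumerate(residuals, start=1):
    if r > 0:
        direction = "above expected"
    elif r < 0:
        direction = "below expected"
    else:
        direction = "equal to expected"
    print(f"face {face}: residual {r:+.2f} ({direction})")

# Hypothetical small sample: 12 rolls give an expected count of 2 per face.
print(small_expected_cells([12 / 6] * 6))  # [0, 1, 2, 3, 4, 5]
```

The squared residuals sum back to χ², so the largest residuals in absolute value mark the categories that contribute most, and their signs show whether each category was over- or under-represented.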

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Independence of Events → Sampling Distributions → Standard Error of Estimators → Hypothesis Testing: Framework and Logic → P-values and Statistical Significance → Effect Size and Practical Significance → Hypothesis Testing: Framework and Logic → Chi-Square Test

Longest path: 101 steps · 511 total prerequisite topics

Prerequisites (4)

Hypothesis Testing Fundamentals (hard) · Chi-Square Distribution: Theory and Tests (hard) · Hypothesis Testing: Framework and Logic (hard) · Frequency Distributions and Contingency Tables (soft)

Leads To (5)

Chi-Square Analysis in Genetic Data (hard) · Error Analysis and Statistics in Analytical Chemistry (soft) · Genetic Mapping and Linkage (soft) · IRT Model Comparison and Fit Evaluation (soft) · Mendelian Genetics (soft)