A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Chi-Square Test for Independence

College · Depth 100 in the knowledge graph

486 prerequisites beneath it

Core Idea

Tests independence of two categorical variables. χ² = Σ (Observed − Expected)² / Expected, with (rows − 1)(cols − 1) df. Expected counts are computed under independence. Rule of thumb: all expected counts ≥ 5. A large χ² indicates association.

Explainer

The chi-square test for independence asks a specific question about a contingency table: are two categorical variables statistically independent, or does knowing one variable's category tell you something about the other? For example, is a person's smoking status (yes/no) related to their disease outcome (sick/well)? Independence, the null hypothesis in your hypothesis testing framework, has a precise probabilistic meaning: P(A and B) = P(A) · P(B) for every category A of one variable and every category B of the other. The test constructs a statistic that measures how far the observed data deviate from what independence would predict.

The expected counts under independence are computed using a key formula: for a cell in row i and column j of an r × c table, the expected count is E_{ij} = (row i total) × (column j total) / (grand total). This formula follows directly from the independence definition. If smoking and disease are independent, the probability of being a smoker who is well is P(smoker) × P(well). Estimating each probability from the margins as (row total)/n and (column total)/n, then multiplying their product by the grand total n, gives the expected count. Compare this to the observed count O_{ij} (what you actually see) for every cell. If the two variables are truly independent, observed and expected counts should differ only by sampling variation.
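The expected-count formula can be sketched in a few lines of plain Python. The 2 × 2 table below is made up for illustration (rows are smoker / non-smoker, columns are sick / well), and the helper name `expected_counts` is a hypothetical choice, not part of any library.

```python
# Hypothetical 2 x 2 table (illustrative counts, not real data).
# Rows: smoker, non-smoker. Columns: sick, well.
observed = [
    [30, 70],
    [20, 180],
]


def expected_counts(table):
    """Expected count for each cell under independence:
    E_ij = (row i total) * (column j total) / (grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

For the smoker/sick cell the expected count is 100 × 50 / 300 ≈ 16.67, against an observed count of 30. The expected table keeps the same row and column totals as the observed one, which is a quick sanity check on the computation.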

The test statistic aggregates these cell-by-cell discrepancies: χ² = Σ (O_{ij} − E_{ij})² / E_{ij}, summed over all r × c cells. The denominator E_{ij} standardizes the squared difference: a discrepancy of 5 contributes 25/10 = 2.5 in a cell with expected count 10, but only 25/1000 = 0.025 in a cell with expected count 1000. Large values of χ² signal systematic association between the variables. Under the null hypothesis of independence, this statistic follows approximately a chi-square distribution (your prerequisite) with (r − 1)(c − 1) degrees of freedom. The degrees of freedom count how many cells are free to vary: once the marginal totals are fixed, specifying (r − 1)(c − 1) cells determines the entire table.
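As a rough sketch of the statistic and its degrees of freedom, the snippet below works through a made-up 2 × 2 table in plain Python. The function name `chi_square_independence` is hypothetical. The p-value line relies on a shortcut that holds only for 1 degree of freedom, where the upper tail of the chi-square distribution equals erfc(√(χ²/2)); larger tables need a general chi-square tail function.

```python
import math

# Hypothetical 2 x 2 table (illustrative counts, not real data).
observed = [
    [30, 70],
    [20, 180],
]


def chi_square_independence(table):
    """Return (chi2, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected under independence
            chi2 += (o - e) ** 2 / e               # standardized squared gap
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


chi2, df = chi_square_independence(observed)

# Shortcut valid only when df == 1: P(X > chi2) = erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2)) if df == 1 else None

print(chi2, df, p_value)
```

In practice a library routine such as `scipy.stats.chi2_contingency` would usually be used instead. Note that it applies Yates' continuity correction to 2 × 2 tables by default, so its statistic will differ from this uncorrected sketch unless `correction=False` is passed.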

Two practical requirements matter. First, as a rule of thumb, all expected counts should be at least 5. Below this, the chi-square approximation deteriorates and Fisher's exact test is preferred. Second, the chi-square test detects association but says nothing about its direction or magnitude. A statistically significant result means that discrepancies this large would be unlikely if the variables were independent; with a large enough sample, a table can give a significant chi-square for a very weak practical association. For effect size, pair the test with Cramér's V: V = √(χ² / (n · min(r − 1, c − 1))), which ranges from 0 (no association) to 1 (perfect association). The test gives the p-value; Cramér's V gives the strength.
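The gap between significance and strength can be illustrated with a made-up table in which the sample is large but the association is slight. This is a minimal sketch in plain Python: the counts and the helper name `chi_square_and_expected` are hypothetical, and the p-value shortcut again assumes 1 degree of freedom.

```python
import math

# Hypothetical 2 x 2 table with a large sample and a slight imbalance.
observed = [
    [5100, 4900],
    [4900, 5100],
]


def chi_square_and_expected(table):
    """Return (chi2, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return chi2, expected


chi2, expected = chi_square_and_expected(observed)
n = sum(sum(row) for row in observed)
r, c = len(observed), len(observed[0])

# Rule-of-thumb check: every expected count should be at least 5.
approximation_ok = min(min(row) for row in expected) >= 5

# Cramér's V = sqrt(chi2 / (n * min(r - 1, c - 1))).
cramers_v = math.sqrt(chi2 / (n * min(r - 1, c - 1)))

# Shortcut valid only because df = (r - 1)(c - 1) = 1 for this table.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(chi2, cramers_v, p_value, approximation_ok)
```

Here χ² = 8 on 1 degree of freedom, which gives a p-value below 0.01, yet Cramér's V is only 0.02. The test flags the association as unlikely under independence while the effect size shows it is very weak.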

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Independence of Events → Sampling Distributions → Standard Error of Estimators → Hypothesis Testing: Framework and Logic → P-values and Statistical Significance → Effect Size and Practical Significance → Hypothesis Testing: Framework and Logic → Chi-Square Test for Independence

Longest path: 101 steps · 486 total prerequisite topics

Prerequisites (3)

Chi-Square Distribution: Theory and Tests (hard)
Hypothesis Testing: Framework and Logic (hard)
Independence and Mutually Exclusive Events (soft)

Leads To (0)

No topics depend on this one yet.