A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Bayes' Theorem and Statistical Inference

College · Depth 94 in the knowledge graph

258 topics build on this

436 prerequisites beneath it

Core Idea

Bayes' theorem: P(B_i|A) = P(A|B_i)P(B_i) / ∑_j P(A|B_j)P(B_j), where the events B_j form a partition of the sample space and P(A) > 0. It updates a prior belief P(B_i) to a posterior belief P(B_i|A) given evidence A. This formula is foundational for statistical inference, machine learning, and decision-making under uncertainty.
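The formula can be sketched as a small function that takes one prior and one likelihood per hypothesis. This is a minimal illustration, not a library API: the name `posterior` and the three-machine defect example (with its numbers) are assumptions made up for this sketch.

```python
from typing import List, Sequence


def posterior(priors: Sequence[float], likelihoods: Sequence[float]) -> List[float]:
    """Return P(B_i | A) for every hypothesis B_i in a partition.

    priors[i]      = P(B_i), expected to sum to 1
    likelihoods[i] = P(A | B_i)
    """
    if len(priors) != len(likelihoods):
        raise ValueError("need one likelihood per prior")
    # Numerators of Bayes' theorem: P(A | B_i) * P(B_i)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    # Denominator: P(A), by the law of total probability
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("the evidence has probability zero under every hypothesis")
    return [j / evidence for j in joint]


# Hypothetical example: three machines make 50%, 30%, and 20% of all parts,
# with defect rates of 1%, 2%, and 3%. Given a defective part, which machine made it?
machine_share = [0.5, 0.3, 0.2]
defect_rate = [0.01, 0.02, 0.03]
post = posterior(machine_share, defect_rate)
print([round(p, 3) for p in post])
```

Although the first machine makes half of all parts, its low defect rate means a defective part is slightly more likely to have come from either of the other two.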

Explainer

From conditional probability, you know P(B|A) = P(A ∩ B)/P(A), and from the law of total probability, you know how to expand P(A) over a partition. Bayes' theorem combines these two facts into a formula for *inverting* a conditional probability: if you know P(A|B), it tells you how to find P(B|A). The algebra is straightforward — P(A ∩ B) = P(A|B)P(B) = P(B|A)P(A) — but the conceptual shift is profound.

Here is the core intuition with a medical example. Suppose a disease affects 1% of the population, and a test for it is 95% sensitive (it correctly flags 95% of sick patients) and 95% specific (it correctly clears 95% of healthy patients). You test positive. What is the probability you actually have the disease? Most people guess about 95%, but Bayes' theorem gives a very different answer. Let B = "you have the disease" and A = "you test positive." Then:

P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|not-B)P(not-B)]

= (0.95)(0.01) / [(0.95)(0.01) + (0.05)(0.99)]

= 0.0095 / 0.0590 ≈ 0.161, or about 16%

Even with a highly accurate test, the positive predictive value is only about 16% because the disease is rare. The low prior P(B) = 0.01 dominates: false positives from the healthy 99% of the population (0.05 × 0.99 = 0.0495) outnumber true positives from the sick 1% (0.95 × 0.01 = 0.0095) by roughly five to one. This example illustrates the fundamental structure: the prior P(B) encodes your pre-evidence belief; the likelihood P(A|B) encodes how probable the evidence is if the hypothesis is true; and the posterior P(B|A) is what you should believe *after* seeing the evidence.
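The calculation above can be checked numerically. The sketch below uses the example's own numbers (1% prevalence, 95% sensitivity, 95% specificity); the function name `positive_predictive_value` and the extra prevalence values in the loop are assumptions added only to show how strongly the prior drives the result.

```python
def positive_predictive_value(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Return P(disease | positive test) using Bayes' theorem."""
    true_positive = sensitivity * prevalence                    # P(A|B) P(B)
    false_positive = (1.0 - specificity) * (1.0 - prevalence)   # P(A|not-B) P(not-B)
    return true_positive / (true_positive + false_positive)


# The example from the text: a rare disease and a 95%/95% test.
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.95, specificity=0.95)
print(f"PPV at 1% prevalence: {ppv:.3f}")  # prints 0.161

# Hypothetical prevalences, to show the effect of the prior on the posterior.
for prevalence in (0.001, 0.01, 0.1, 0.5):
    value = positive_predictive_value(prevalence, 0.95, 0.95)
    print(f"prevalence={prevalence:<5} -> PPV={value:.3f}")
```

With the same test, the posterior rises steeply as the condition becomes more common, and it equals the sensitivity only when the prior is 50%.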

For statistical inference, the same logic applies with parameters instead of disease states. Suppose θ is a parameter (say, the bias of a coin) and x is observed data (say, 7 heads in 10 flips). Bayes' theorem gives: P(θ|x) ∝ P(x|θ) · P(θ). The posterior distribution over θ is proportional to the likelihood times the prior; the omitted constant is the evidence P(x), which does not depend on θ. This is the foundation of Bayesian statistics: instead of estimating a single point value for θ, you maintain and update an entire probability distribution over θ. Each new observation shifts the posterior, concentrating it around parameter values consistent with the data. The more data you observe, the less the prior matters and the more the likelihood dominates. In the limit, posteriors that start from different reasonable priors converge to the same answer, and under standard regularity conditions Bayesian and frequentist estimates agree asymptotically.
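For the coin example, one convenient way to carry out the update is a Beta prior, which is conjugate to the binomial likelihood: a Beta(α, β) prior combined with h heads and t tails gives a Beta(α + h, β + t) posterior. The sketch below assumes this conjugate setup. The two priors, the helper names `beta_update` and `beta_mean`, and the larger 700-in-1000 sample are all hypothetical choices made for illustration.

```python
from typing import Tuple


def beta_update(alpha: float, beta: float, heads: int, tails: int) -> Tuple[float, float]:
    """Conjugate update: Beta(alpha, beta) prior -> Beta(alpha + heads, beta + tails) posterior."""
    return alpha + heads, beta + tails


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


flat_prior = (1.0, 1.0)         # assumption: uniform prior over the coin's bias
skeptical_prior = (10.0, 10.0)  # assumption: prior concentrated near a fair coin

small_sample = (7, 3)       # 7 heads in 10 flips, as in the text
large_sample = (700, 300)   # hypothetical: same proportion, 100 times more data

flat_small = beta_update(*flat_prior, *small_sample)            # Beta(8, 4)
skeptical_small = beta_update(*skeptical_prior, *small_sample)  # Beta(17, 13)
flat_large = beta_update(*flat_prior, *large_sample)
skeptical_large = beta_update(*skeptical_prior, *large_sample)

gap_small = abs(beta_mean(*flat_small) - beta_mean(*skeptical_small))
gap_large = abs(beta_mean(*flat_large) - beta_mean(*skeptical_large))

print(f"posterior means after 10 flips:   {beta_mean(*flat_small):.3f} vs {beta_mean(*skeptical_small):.3f}")
print(f"posterior means after 1000 flips: {beta_mean(*flat_large):.3f} vs {beta_mean(*skeptical_large):.3f}")
print(f"gap between priors shrinks from {gap_small:.3f} to {gap_large:.4f}")
```

After 10 flips the two priors still disagree noticeably, while after 1000 flips both posterior means sit close to the observed frequency of 0.7, which is the sense in which the likelihood comes to dominate the prior.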

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Law of Total Probability → Bayes' Theorem and Statistical Inference

Longest path: 95 steps · 436 total prerequisite topics

Prerequisites (2)

Conditional Probability (hard) · Law of Total Probability (hard)

Leads To (5)

Bayesian Methods in Psychometric Modeling (hard) · Bayesian Networks and Inference (hard) · Bayesian Optimization (soft) · Bayesian Statistics: Prior, Posterior, Credible Intervals (hard) · Rational Expectations in Macroeconomics (soft)