A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Multivariate Normal Distribution

Graduate · Depth 100 in the knowledge graph

80 topics build on this

656 prerequisites beneath it

Core Idea

A random vector X ~ N(μ, Σ) has characteristic function φ(t) = exp(it'μ − ½t'Σt). The MVN is closed under linear transformations and marginals. A joint distribution is MVN if and only if every linear combination of its components is univariate normal. The MVN is fundamental in statistical inference because, by the multivariate central limit theorem, the sample mean vector is approximately MVN for large samples whenever the underlying covariance is finite.

Explainer

You know the univariate normal N(μ, σ²): a bell-shaped distribution centered at μ with spread controlled by σ². The multivariate normal distribution (MVN) extends this to random vectors X = (X₁, ..., Xₙ)'. The cleanest definition: X is MVN if, for every fixed vector a, the linear combination a'X = a₁X₁ + ... + aₙXₙ is univariate normal (a constant counts as a normal with zero variance). This says the MVN is "normal in every direction": no matter how you project the joint distribution onto a line, you get a normal curve.

The MVN is parameterized by a mean vector μ ∈ ℝⁿ (where the distribution is centered) and a covariance matrix Σ ∈ ℝⁿˣⁿ (which must be symmetric and positive semidefinite). The diagonal entries are variances: Σᵢᵢ = Var(Xᵢ). The off-diagonal entries are covariances: Σᵢⱼ = Cov(Xᵢ, Xⱼ). When Σ is diagonal, the components are independent normals; for jointly normal components, zero covariance implies independence, which is not true of distributions in general. A positive Σᵢⱼ means Xᵢ and Xⱼ tend to move together; a negative Σᵢⱼ means they tend to move in opposite directions.
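As a small illustration of how Σ is read, the sketch below uses plain Python and made-up numbers for Σ. It rescales covariances into correlations and evaluates the quadratic form a'Σa, which equals Var(a'X). The function names are illustrative, not taken from any library.

```python
import math

def quadratic_form(a, Sigma):
    """Return a' Sigma a, which equals Var(a'X) when Cov(X) = Sigma."""
    n = len(a)
    return sum(a[i] * Sigma[i][j] * a[j] for i in range(n) for j in range(n))

def correlation_matrix(Sigma):
    """Rescale covariances: rho_ij = Sigma_ij / sqrt(Sigma_ii * Sigma_jj)."""
    n = len(Sigma)
    sd = [math.sqrt(Sigma[i][i]) for i in range(n)]
    return [[Sigma[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

# Hypothetical covariance matrix: Var(X1) = 4, Var(X2) = 1, Cov(X1, X2) = 1.2
Sigma = [[4.0, 1.2],
         [1.2, 1.0]]

rho = correlation_matrix(Sigma)                  # off-diagonal is about 0.6
var_sum = quadratic_form([1.0, 1.0], Sigma)      # Var(X1 + X2), about 7.4
var_diff = quadratic_form([1.0, -1.0], Sigma)    # Var(X1 - X2), about 2.6
```

Positive semidefiniteness is exactly the requirement that this quadratic form is nonnegative for every a, since a variance cannot be negative.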

From your joint distributions work, you know that marginals are obtained by integrating out other variables — often a painful computation. For the MVN, marginals are trivial: if X ~ N(μ, Σ) and you split X into subvectors X = (X₁, X₂)', then X₁ ~ N(μ₁, Σ₁₁) where μ₁ is the corresponding subvector of μ and Σ₁₁ is the corresponding submatrix of Σ. You just read off the relevant pieces. No integration required. This is a major computational advantage of the MVN.
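The "read off the relevant pieces" step can be sketched in a few lines of plain Python. The helper name `marginal` and the numbers in μ and Σ are hypothetical, chosen only for illustration.

```python
def marginal(mu, Sigma, keep):
    """Parameters of the marginal of N(mu, Sigma) on the coordinates in `keep`.

    No integration is needed: select the matching entries of mu and the
    matching rows and columns of Sigma.
    """
    mu_sub = [mu[i] for i in keep]
    Sigma_sub = [[Sigma[i][j] for j in keep] for i in keep]
    return mu_sub, Sigma_sub

# Hypothetical 3-dimensional example
mu = [1.0, -2.0, 0.5]
Sigma = [[2.0, 0.3, 0.1],
         [0.3, 1.0, -0.4],
         [0.1, -0.4, 3.0]]

# Marginal of the first and third components (0-based indices 0 and 2)
mu_13, Sigma_13 = marginal(mu, Sigma, keep=[0, 2])
# mu_13 is [1.0, 0.5]; Sigma_13 is [[2.0, 0.1], [0.1, 3.0]]
```

The kept coordinates do not have to be contiguous; any subset of components of an MVN vector is again MVN with the selected mean entries and covariance block.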

The closure under linear transformations (from your linear transformations prerequisite) is equally powerful: if X ~ N(μ, Σ) and A is a fixed matrix, then AX ~ N(Aμ, AΣA'). This single fact explains why the sample mean X̄ = (1/n)1'X is normal when the data are iid normal: stacking the n observations into a vector X, the sample mean is a linear transformation of that vector. More generally, any quantity computed as a linear function of normally distributed data inherits normality. The characteristic function φ(t) = exp(it'μ − ½t'Σt) encodes the entire distribution and makes this closure easy to prove: φ_{AX}(t) = E[exp(it'AX)] = φ_X(A't) = exp(it'Aμ − ½t'AΣA't), which is the characteristic function of N(Aμ, AΣA'). It also shows the MVN is completely determined by its first two moments, the mean and covariance: log φ(t) is quadratic in t, so all cumulants of order three and higher vanish.
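The rule AX ~ N(Aμ, AΣA') can be checked numerically on the sample-mean case. The sketch below uses plain Python lists; the helper names and the values n = 4, mean 3, variance 2 are assumptions made for illustration. With iid N(m, σ²) data, Σ = σ²I and A = (1/n)1', so the result should be mean m and variance σ²/n.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def linear_transform(A, mu, Sigma):
    """Parameters of AX when X ~ N(mu, Sigma): mean A mu, covariance A Sigma A'."""
    mean = [sum(a * m for a, m in zip(row, mu)) for row in A]
    cov = matmul(matmul(A, Sigma), transpose(A))
    return mean, cov

# Hypothetical setup: n iid N(m, sigma2) observations stacked into one vector
n, m, sigma2 = 4, 3.0, 2.0
mu = [m] * n
Sigma = [[sigma2 if i == j else 0.0 for j in range(n)] for i in range(n)]

# The sample mean is the 1 x n linear map A = (1/n) 1'
A = [[1.0 / n] * n]
mean, cov = linear_transform(A, mu, Sigma)
# mean is [m] and cov is [[sigma2 / n]], i.e. [3.0] and [[0.5]] here
```

The same `linear_transform` helper covers marginals as a special case: choosing A as a matrix of selected rows of the identity picks out the corresponding subvector.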

Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Fundamental Theorem of Calculus Part 1 → Fundamental Theorem of Calculus Part 2 → U-Substitution → Partial Fraction Decomposition for Integration → Improper Integrals - Convergence → Integral Test → P-Series → Comparison Test → Limit Comparison Test → Series Convergence Test Strategy → Power Series → Radius and Interval of Convergence → Taylor Series → Moment Generating Functions → Characteristic Functions → Multivariate Normal Distribution

Longest path: 101 steps · 656 total prerequisite topics

Prerequisites (4)

Joint Distributions and Marginals (Rigorous) [hard] · Characteristic Functions [soft] · Linear Transformations [soft] · Bivariate Normal Distribution [soft]

Leads To (1)

Central Limit Theorem (Rigorous via Characteristic Functions) [soft]