
A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Generalized Method of Moments (GMM)

Graduate · Depth 119 in the knowledge graph

12 topics build on this

660 prerequisites beneath it



Core Idea

GMM estimates θ from moment conditions E[f(Yᵢ, θ)] = 0 by minimizing a quadratic form in the sample moments. It nests OLS, IV, and MLE as special cases and, with the optimal weighting matrix, is asymptotically efficient among estimators that use the same correctly specified moment conditions. The Hansen J-test checks the overidentifying restrictions.

Explainer

You've encountered several estimation strategies already: OLS minimizes squared residuals, MLE maximizes the likelihood of the observed data, and IV uses instruments to isolate exogenous variation. GMM unifies all of these in a single framework built around moment conditions. A moment condition is a population statement of the form E[f(Yᵢ, θ)] = 0, where f is a function of the data and the parameters and the expectation is zero when evaluated at the true θ. OLS, for example, rests on E[Xᵢ(Yᵢ - Xᵢ'β)] = 0, the orthogonality of regressors and errors. IV replaces this with the instrument orthogonality condition E[Zᵢ(Yᵢ - Xᵢ'β)] = 0, which can hold even when the regressors are correlated with the error. MLE fits the same mold: its moment condition sets the expected score (the derivative of the log-likelihood with respect to θ) to zero. All three are special cases of GMM.
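To make the OLS moment condition concrete, here is a minimal sketch on simulated data. The data-generating process, the seed, and the helper name `sample_moments` are assumptions chosen for illustration. It shows that the OLS formula is exactly what you get by setting the sample analog of E[Xᵢ(Yᵢ - Xᵢ'β)] to zero.

```python
import numpy as np

# Hypothetical simulated data: intercept plus one exogenous regressor.
rng = np.random.default_rng(0)
n = 5_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(size=n)

def sample_moments(beta, X, y):
    """Sample analog of E[X_i (Y_i - X_i' beta)]: one entry per regressor."""
    return X.T @ (y - X @ beta) / len(y)

# Solving the sample moment equations X'(y - Xb) = 0 gives the OLS formula.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

print("moments at OLS estimate:", sample_moments(beta_ols, X, y))
print("moments at true beta:   ", sample_moments(beta_true, X, y))
```

At the OLS estimate the sample moments vanish up to floating-point error. At the true β they are only close to zero, because the condition holds exactly in the population rather than in any one sample.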

The GMM estimator replaces the population expectation E[f(Yᵢ, θ)] with its sample analog g(θ) = (1/n)Σf(Yᵢ, θ), then chooses θ to bring this vector of sample moments as close to zero as possible. With exactly as many moment conditions as parameters (the just-identified case), you can generally set the sample moments exactly to zero and solve directly; in the linear model with instruments Zᵢ, this yields the standard IV estimator. With more moment conditions than parameters (the overidentified case), no θ will generally satisfy all of them at once in a finite sample, so you minimize the quadratic form g(θ)'Wg(θ), where W is a positive definite weighting matrix. When W is diagonal, this reduces to a weighted sum of squared moments.
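The difference between the just-identified and overidentified cases can be sketched for a linear model with one endogenous regressor. The simulated design and the names `g`, `objective`, and `linear_gmm` are illustrative assumptions. The closed form used here applies only because the moments are linear in β; a nonlinear model would need a numerical optimizer.

```python
import numpy as np

# Hypothetical data: x is endogenous (it depends on the error u), Z holds three instruments.
rng = np.random.default_rng(1)
n = 5_000
Z = rng.normal(size=(n, 3))
u = rng.normal(size=n)
x = Z @ np.array([1.0, 0.5, 0.5]) + 0.8 * u + rng.normal(size=n)
X = x[:, None]
y = 2.0 * x + u  # true beta = 2

def g(beta, X, Z, y):
    """Vector of sample moments (1/n) * sum_i z_i (y_i - x_i' beta)."""
    return Z.T @ (y - X @ beta) / len(y)

def objective(beta, X, Z, y, W):
    """GMM objective g(beta)' W g(beta)."""
    m = g(beta, X, Z, y)
    return m @ W @ m

def linear_gmm(X, Z, y, W):
    """Closed-form minimizer of g'Wg when the moments are linear in beta."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ y))

# Just-identified: one instrument, one parameter, so the moment can be set to zero.
Z1 = Z[:, :1]
beta_just = linear_gmm(X, Z1, y, np.eye(1))
q_just = objective(beta_just, X, Z1, y, np.eye(1))

# Overidentified: three instruments, one parameter, so the minimum is above zero.
W = np.eye(3)
beta_over = linear_gmm(X, Z, y, W)
q_over = objective(beta_over, X, Z, y, W)

beta_ols = (x @ y) / (x @ x)  # ignores endogeneity, shown for comparison
print(f"OLS: {beta_ols:.3f}  just-identified: {beta_just[0]:.3f}  overidentified: {beta_over[0]:.3f}")
print(f"objective at minimum: just-identified {q_just:.2e}, overidentified {q_over:.2e}")
```

In this design OLS is pulled away from 2 because x is correlated with u, while both GMM estimates should land near it. The identity matrix is used for W purely for simplicity.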

The choice of W does not affect consistency, but it matters a great deal for efficiency. The optimal weighting matrix is the inverse of the variance matrix of the moment conditions: intuitively, moments that are noisy get less weight and moments that are measured precisely get more. Because that variance depends on the unknown θ, the optimal W is not available up front, which is why two-step GMM is the standard implementation: estimate θ with an initial W (often the identity matrix), compute the sample variance of the moments at those estimates, invert it to get an estimate of the optimal W, and re-estimate. The resulting two-step estimator is asymptotically efficient among all GMM estimators that use those moment conditions.
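The two-step procedure can be sketched as follows for a linear model. This is a minimal illustration under stated assumptions: the data are simulated with a heteroskedastic error, observations are independent (so the moment variance is a simple average of outer products), and the helper names `linear_gmm` and `moment_variance` are hypothetical.

```python
import numpy as np

# Hypothetical data: endogenous x, three instruments, heteroskedastic error.
rng = np.random.default_rng(2)
n = 5_000
Z = rng.normal(size=(n, 3))
u = rng.normal(size=n) * (1.0 + np.abs(Z[:, 0]))
x = Z @ np.array([1.0, 0.5, 0.5]) + 0.8 * u + rng.normal(size=n)
X = x[:, None]
y = 2.0 * x + u  # true beta = 2

def linear_gmm(X, Z, y, W):
    """Closed-form minimizer of g'Wg for moments z_i (y_i - x_i' beta)."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ y))

def moment_variance(beta, X, Z, y):
    """Sample variance of the moment contributions z_i * u_i (independent observations assumed)."""
    Zu = Z * (y - X @ beta)[:, None]
    return Zu.T @ Zu / len(y)

# Step 1: consistent but inefficient estimate with an identity weighting matrix.
beta_1 = linear_gmm(X, Z, y, np.eye(Z.shape[1]))

# Step 2: estimate the moment variance at beta_1, invert it, and re-estimate.
S_hat = moment_variance(beta_1, X, Z, y)
W_opt = np.linalg.inv(S_hat)
beta_2 = linear_gmm(X, Z, y, W_opt)

# Asymptotic standard error of the efficient estimator: sqrt(diag((G' S^-1 G)^-1 / n)).
G = Z.T @ X / n
se_2 = np.sqrt(np.diag(np.linalg.inv(G.T @ W_opt @ G) / n))

print(f"step 1: {beta_1[0]:.4f}   step 2: {beta_2[0]:.4f}   se: {se_2[0]:.4f}")
```

With dependent data (time series or clustered panels) the simple outer-product average in `moment_variance` would need to be replaced by a variance estimator that accounts for the dependence.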

Overidentification creates a testable restriction: if the model is correctly specified, all the moment conditions should hold simultaneously. The Hansen J-statistic, n times the GMM objective evaluated at the efficient estimates, measures how far the sample moments remain from zero. Under the null that every moment condition holds, J is asymptotically chi-squared with degrees of freedom equal to the number of overidentifying restrictions (moment conditions minus parameters). A large J relative to that distribution suggests at least one moment condition fails, meaning some instruments may be invalid or the functional form is wrong, although the test does not say which. Passing the J-test is necessary but not sufficient for validity, because it only checks whether the moment conditions agree with one another; failing it is a clear signal of misspecification. In practice, GMM is particularly useful in rational expectations models, where theory delivers moment conditions directly, and in dynamic panel models, where the Arellano-Bond estimator uses lagged levels as instruments in the first-differenced equation.
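As a rough illustration of the J-test, the sketch below computes J after two-step GMM on two simulated datasets, one with valid instruments and one where the third instrument also affects the outcome directly. The function names, the simulated design, and the size of the direct effect are assumptions. The p-value uses the closed form exp(-J/2), which is the chi-squared survival function only for 2 degrees of freedom; other cases would need a general routine such as `scipy.stats.chi2.sf`.

```python
import math
import numpy as np

def two_step_gmm_with_j(X, Z, y):
    """Two-step linear GMM plus Hansen's J; returns (beta, J, degrees of freedom)."""
    n, n_moments = Z.shape

    def solve(W):
        # Closed-form minimizer of g(beta)' W g(beta) for linear moments.
        XZ = X.T @ Z
        return np.linalg.solve(XZ @ W @ XZ.T, XZ @ W @ (Z.T @ y))

    beta_1 = solve(np.eye(n_moments))
    Zu = Z * (y - X @ beta_1)[:, None]
    W_opt = np.linalg.inv(Zu.T @ Zu / n)   # inverse of estimated moment variance
    beta_2 = solve(W_opt)
    g = Z.T @ (y - X @ beta_2) / n         # sample moments at the efficient estimate
    J = n * (g @ W_opt @ g)
    return beta_2, J, n_moments - X.shape[1]

def simulate(direct_effect, seed=3, n=5_000):
    """Hypothetical data; a nonzero direct_effect makes the third instrument invalid."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(n, 3))
    u = rng.normal(size=n)
    x = Z @ np.array([1.0, 0.5, 0.5]) + 0.8 * u + rng.normal(size=n)
    y = 2.0 * x + direct_effect * Z[:, 2] + u
    return x[:, None], Z, y

for label, effect in [("valid instruments", 0.0), ("one invalid instrument", 0.5)]:
    X, Z, y = simulate(effect)
    beta, J, df = two_step_gmm_with_j(X, Z, y)
    p_value = math.exp(-J / 2)  # exact chi-squared tail probability only when df == 2
    print(f"{label}: beta = {beta[0]:.3f}, J = {J:.2f}, df = {df}, p = {p_value:.4f}")
```

In the invalid-instrument case no single β can zero out all three moments even in the population, so J grows with the sample size and the test rejects. With valid instruments J should behave like a draw from a chi-squared distribution with 2 degrees of freedom.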


Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Independence of Events → Sampling Distributions → Standard Error of Estimators → Hypothesis Testing: Framework and Logic → P-values and Statistical Significance → Effect Size and Practical Significance → Hypothesis Testing: Framework and Logic → Z-Tests and T-Tests for Means → One-Sample Z-Test for Means → One-Sample and Two-Sample T-Tests → Inference in Linear Regression → Prediction Intervals in Regression → Linear Regression Basics → Residuals and Goodness of Fit (R²) → Simple (Bivariate) OLS Regression → Classical OLS Assumptions (Gauss-Markov) → Multiple Regression → Interpreting Regression Coefficients → Hypothesis Testing in Regression → F-Test and Joint Significance → R-Squared and Model Fit → Omitted Variable Bias → Causal Inference and the Identification Problem → Potential Outcomes and the Rubin Causal Model → Selection Bias → Instrumental Variables → Generalized Method of Moments (GMM)

Longest path: 120 steps · 660 total prerequisite topics

Prerequisites (5)

Instrumental Variables (hard)
Probability Axioms (hard)
Linear Transformations (hard)
Maximum Likelihood Estimation (soft)
Expected Value: Theory and Properties (soft)

Leads To (1)

Dynamic Panel Models: Arellano-Bond Estimator (hard)