A topic in the Open Knowledge Graph — a free, open map of 15,290 topics and the order to learn them in.

Generalized Least Squares (GLS) for Non-Spherical Errors

College · Depth 114 in the knowledge graph

16 topics build on this

587 prerequisites beneath it


Core Idea

GLS weights the regression by the inverse of the error variance-covariance matrix (equivalently, it transforms the model by an inverse square root of that matrix), restoring efficiency when errors are heteroskedastic or serially correlated. When the covariance structure is known, GLS recovers the BLUE property; when it is unknown and must be estimated from residuals, the procedure is feasible GLS (FGLS).

Explainer

You know from the OLS assumptions that the Gauss-Markov theorem requires spherical errors: errors that are homoskedastic (constant variance) and uncorrelated with each other. When these conditions fail, because errors are heteroskedastic or serially correlated, OLS is no longer the Best Linear Unbiased Estimator. It is still unbiased, but it is inefficient: some other linear unbiased estimator has smaller variance, and the conventional OLS standard errors are no longer valid. GLS is that more efficient estimator.

The core idea is a transformation. Suppose the error variance-covariance matrix is Ω rather than σ²I. OLS minimizes the unweighted sum of squared residuals, treating each observation equally. But if some observations have much higher error variance than others, they are noisier signals about the true relationship and should count for less. GLS formalizes this: it pre-multiplies the regression equation by a matrix P satisfying P'P = Ω⁻¹, for example the inverse of the Cholesky factor of Ω or the symmetric inverse square root Ω^(-1/2). When Ω is diagonal, this simply divides each observation by its error standard deviation, so observations with high variance get down-weighted and observations with low variance get up-weighted; when Ω has off-diagonal terms, the transformation also removes the correlation between errors. The transformed errors are spherical, so OLS applied to the transformed data is BLUE.
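
The equivalence between the transformation view and the closed-form estimator can be checked numerically. The sketch below is only an illustration under stated assumptions: Ω is treated as known, the data are simulated, and the function names (gls_closed_form, gls_by_whitening) and all parameter values are hypothetical choices made for this example. It uses the inverse Cholesky factor as the transformation matrix.

```python
import numpy as np

def gls_closed_form(X, y, omega):
    """GLS via the closed form (X' Ω⁻¹ X)⁻¹ X' Ω⁻¹ y."""
    omega_inv_X = np.linalg.solve(omega, X)  # Ω⁻¹ X without forming Ω⁻¹
    omega_inv_y = np.linalg.solve(omega, y)  # Ω⁻¹ y
    return np.linalg.solve(X.T @ omega_inv_X, X.T @ omega_inv_y)

def gls_by_whitening(X, y, omega):
    """GLS as OLS on data pre-multiplied by the inverse Cholesky factor."""
    chol = np.linalg.cholesky(omega)     # lower triangular, Ω = chol @ chol.T
    X_star = np.linalg.solve(chol, X)    # chol⁻¹ X
    y_star = np.linalg.solve(chol, y)    # chol⁻¹ y
    beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    return beta

# Simulated data; every value here is an assumption made for illustration.
rng = np.random.default_rng(0)
n = 60
x = rng.uniform(0.0, 2.0, size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 2.0])

# Non-spherical Ω: error standard deviations grow with x, and neighbouring
# errors are correlated with an AR(1)-style pattern.
sd = 0.5 + x
lags = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
omega = sd[:, None] * (0.6 ** lags) * sd[None, :]

# Draw errors with covariance Ω and build the outcome.
errors = np.linalg.cholesky(omega) @ rng.standard_normal(n)
y = X @ beta_true + errors

beta_closed = gls_closed_form(X, y, omega)
beta_whitened = gls_by_whitening(X, y, omega)
print("closed form:", beta_closed)
print("whitened OLS:", beta_whitened)
```

The two estimates should agree up to floating-point error, since OLS on the transformed data is algebraically the same estimator as the closed form.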

In matrix terms, the GLS estimator is β̂_GLS = (X'Ω⁻¹X)⁻¹X'Ω⁻¹y. It collapses to OLS when Ω = σ²I: the factors of σ² cancel and you recover the standard formula (X'X)⁻¹X'y. GLS reduces to weighted least squares when Ω is diagonal (only the variances differ across observations) and to a correlated-errors transformation when Ω has off-diagonal terms (for example, serial correlation). For AR(1) serial correlation, the Prais-Winsten and Cochrane-Orcutt procedures implement feasible GLS by first estimating the autocorrelation parameter ρ and then applying the transformation that removes it; Cochrane-Orcutt drops the first observation, while Prais-Winsten keeps it after rescaling.
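
For the serial correlation case, the transformation has a simple explicit form. The sketch below assumes AR(1) errors u_t = ρ·u_{t-1} + e_t with a known ρ (in practice ρ would be estimated first), and builds the Prais-Winsten matrix: the first row is scaled by sqrt(1 − ρ²) and every later row subtracts ρ times the previous observation. The function names and the values of n and ρ are hypothetical choices for this illustration.

```python
import numpy as np

def ar1_covariance(n, rho, sigma_e=1.0):
    """Covariance of stationary AR(1) errors: sigma_e² / (1 - ρ²) * ρ^|i-j|."""
    lags = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    return sigma_e**2 / (1.0 - rho**2) * rho ** lags

def prais_winsten_matrix(n, rho):
    """Transformation P with P Ω P' = I for AR(1) errors (sigma_e = 1)."""
    P = np.eye(n)
    P[0, 0] = np.sqrt(1.0 - rho**2)   # rescale the first observation
    rows = np.arange(1, n)
    P[rows, rows - 1] = -rho          # row t computes u_t - ρ u_{t-1}
    return P

# Illustrative values (assumptions, not taken from the text).
n, rho = 40, 0.7
omega = ar1_covariance(n, rho)
P = prais_winsten_matrix(n, rho)
whitened_cov = P @ omega @ P.T        # should be the identity matrix

# Simulated regression with AR(1) errors.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.5, -1.0]) + np.linalg.cholesky(omega) @ rng.standard_normal(n)

# Closed-form GLS versus OLS on the Prais-Winsten-transformed data.
beta_gls = np.linalg.solve(X.T @ np.linalg.solve(omega, X),
                           X.T @ np.linalg.solve(omega, y))
beta_pw, *_ = np.linalg.lstsq(P @ X, P @ y, rcond=None)
print(beta_gls, beta_pw)
```

Deleting the first row of P gives the Cochrane-Orcutt version of the transformation, which discards the information in the first observation.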

The practical complication is that Ω is almost never known in advance. You must estimate it, typically from OLS residuals under an assumed model for the covariance structure, giving Feasible GLS (FGLS). This two-step procedure is consistent and asymptotically as efficient as GLS when that model is correctly specified, but it is no longer BLUE in finite samples because the first step introduces estimation error. FGLS is often contrasted with the alternative of using OLS with robust standard errors (Huber-White for heteroskedasticity, Newey-West for serial correlation): robust standard errors leave the point estimates alone but correct the inference, while FGLS changes both the estimates and the standard errors. In large samples the two approaches often give similar point estimates, and FGLS can be more efficient; when the form of Ω is uncertain, OLS with robust standard errors is frequently preferred because it does not require specifying that form.
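
The two alternatives can be put side by side in a short sketch. It is not a definitive implementation: it assumes pure heteroskedasticity with a multiplicative variance model Var(u_i) = exp(z_i'γ), which is one common but by no means the only choice, and it uses the simplest Huber-White (HC0) formula for the robust standard errors. The function names, the variance model, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fgls_heteroskedastic(X, y, Z):
    """Two-step FGLS under the assumed model Var(u_i) = exp(z_i' gamma)."""
    resid = y - X @ ols(X, y)             # step 1: OLS residuals
    gamma = ols(Z, np.log(resid**2))      # fit the log-variance model
    var_hat = np.exp(Z @ gamma)           # estimated variances (up to scale)
    w = 1.0 / np.sqrt(var_hat)            # weights = inverse std. deviations
    return ols(X * w[:, None], y * w)     # step 2: OLS on reweighted data

def hc0_standard_errors(X, y):
    """Huber-White (HC0) standard errors for the OLS coefficients."""
    resid = y - X @ ols(X, y)
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * resid[:, None] ** 2)   # X' diag(e²) X
    return np.sqrt(np.diag(bread @ meat @ bread))

# Simulated heteroskedastic data; all values are illustrative assumptions.
rng = np.random.default_rng(42)
n = 2000
x = rng.uniform(0.0, 3.0, size=n)
X = np.column_stack([np.ones(n), x])
Z = X                                     # variance depends on the same regressor
beta_true = np.array([1.0, 2.0])
u = np.sqrt(np.exp(-1.0 + 1.0 * x)) * rng.standard_normal(n)
y = X @ beta_true + u

beta_ols = ols(X, y)
beta_fgls = fgls_heteroskedastic(X, y, Z)
se_robust = hc0_standard_errors(X, y)
print("OLS estimates: ", beta_ols)
print("OLS robust SEs:", se_robust)
print("FGLS estimates:", beta_fgls)
```

The log-residual regression recovers the variances only up to a constant factor, which is harmless here because rescaling all weights by the same constant leaves the weighted estimates unchanged.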


Prerequisite Chain

Understanding Zero → The Number Zero → Counting to Five → Counting to 10 → Counting to 20 → Counting a Set of Objects Up to 20 → Cardinality: The Last Number Counted → Matching Numerals to Quantities → Subitizing Small Quantities → Addition Within 10 → Number Bonds to 10 → Addition Within 20 → Doubles and Near Doubles → Doubles Facts Within 10 → Near Doubles Facts Within 20 → Mental Math Strategies for Addition → Mental Math: Adding and Subtracting Tens → Addition Within 100 → Repeated Addition as Multiplication → Multiplication as Equal Groups → Multiplication: Arrays → Basic Multiplication Facts (0s, 1s, 2s, 5s, 10s) → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Multiplication and Division Fact Families → Relationship Between Multiplication and Division → Division Facts as Inverse of Multiplication → Remainders and Quotients in Division → Division Word Problems → Multi-Step Word Problems → Solving Multi-Step Word Problems → Multiplication Word Problems → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Integers and the Number Line → Comparing and Ordering Integers → Absolute Value → Adding Integers → Subtracting Integers → Multiplying Integers → Dividing Integers → Unit Rates → Proportions → Percent Concept → Converting Between Fractions, Decimals, and Percents → Operations with Rational Numbers → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → 
Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Sine, Cosine, and Tangent Ratios → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Probability Density Functions → Expected Value → Weak Law of Large Numbers → Probability Axioms and Rules → Conditional Probability → Independence of Events → Sampling Distributions → Standard Error of Estimators → Hypothesis Testing: Framework and Logic → P-values and Statistical Significance → Effect Size and Practical Significance → Hypothesis Testing: Framework and Logic → Z-Tests and T-Tests for Means → One-Sample Z-Test for Means → One-Sample and Two-Sample T-Tests → Inference in Linear Regression → Prediction Intervals in Regression → Linear Regression Basics → Residuals and Goodness of Fit (R²) → Simple (Bivariate) OLS Regression → Classical OLS Assumptions (Gauss-Markov) → Multiple Regression → Interpreting Regression Coefficients → Hypothesis Testing in Regression → F-Test and Joint Significance → White Test and Detection of Heteroskedasticity → Generalized Least Squares (GLS) for Non-Spherical Errors

Longest path: 115 steps · 587 total prerequisite topics

Prerequisites (4)

White Test and Detection of Heteroskedasticity (hard)
Classical OLS Assumptions (Gauss-Markov) (hard)
Linear Transformations (hard)
Matrix Operations (hard)

Leads To (2)

Feasible GLS (FGLS) with Estimated Covariance Structure (hard)
Weighted Least Squares (WLS) (hard)