Generalized Least Squares (GLS) for Non-Spherical Errors

College · Depth 82 in the knowledge graph
Unlocks 14 downstream topics
estimation heteroskedasticity gls

Core Idea

GLS weights the regression by the inverse of the error variance-covariance matrix, restoring efficiency when errors are heteroskedastic or serially correlated. When the covariance structure is known, GLS is BLUE; when it is unknown and must be estimated from residuals, the procedure is called feasible GLS (FGLS).

Explainer

You know from the OLS assumptions that the Gauss-Markov theorem requires spherical errors: errors that are homoskedastic (constant variance) and uncorrelated with each other. When these conditions fail, because errors are heteroskedastic or serially correlated, OLS is no longer the Best Linear Unbiased Estimator. It is still unbiased (as long as the regressors remain exogenous), but it is inefficient: some other linear unbiased estimator has smaller variance, and the conventional OLS standard errors are no longer valid. GLS is that better estimator.
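The efficiency loss can be made concrete with a small numerical sketch. It assumes an illustrative design (one regressor, error variance proportional to x², both chosen only for this example) and compares two exact covariance matrices: the OLS "sandwich" (X'X)⁻¹X'ΩX(X'X)⁻¹ and the covariance of the efficient estimator, (X'Ω⁻¹X)⁻¹. The variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])        # intercept plus one regressor

# Assumption for illustration: the error variance grows with x squared.
omega_diag = x ** 2                          # diagonal of Omega
Omega = np.diag(omega_diag)

# Exact sampling covariance of OLS when errors have covariance Omega.
XtX_inv = np.linalg.inv(X.T @ X)
var_ols = XtX_inv @ X.T @ Omega @ X @ XtX_inv

# Exact sampling covariance of the efficient (GLS) estimator.
var_gls = np.linalg.inv(X.T @ np.diag(1.0 / omega_diag) @ X)

# The gap var_ols - var_gls should be positive semidefinite:
# no linear combination of coefficients is estimated better by OLS.
gap_eigenvalues = np.linalg.eigvalsh(var_ols - var_gls)
slope_variance_ratio = var_ols[1, 1] / var_gls[1, 1]

print("smallest eigenvalue of the gap:", gap_eigenvalues.min())
print("OLS / GLS slope variance ratio:", slope_variance_ratio)
```

A ratio above 1 means OLS needs more data than the efficient estimator to reach the same precision for the slope; how much more depends entirely on the assumed variance pattern.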

The core idea is a transformation. Suppose the error variance-covariance matrix is Ω rather than σ²I. OLS minimizes the sum of squared residuals, treating each observation equally. But if some observations have much higher error variance than others, they are noisier signals about the true relationship and should count for less. GLS formalizes this: it pre-multiplies the regression equation by a matrix P satisfying P'P = Ω⁻¹, for example the symmetric inverse square root Ω^(-1/2) or the inverse of the Cholesky factor of Ω. When Ω is diagonal, this simply divides each observation by its error standard deviation, so high-variance observations are down-weighted and low-variance observations are up-weighted. When Ω has off-diagonal terms, the transformation also removes the correlation between errors. Either way, the transformed errors have covariance PΩP' = I, so they are spherical and OLS applied to the transformed data is BLUE.
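The transformation can be sketched in a few lines of NumPy. This is a minimal illustration, assuming Ω is known and positive definite; the simulated Ω, the coefficient values, and all variable names are made up for the example. It whitens the data with the inverse Cholesky factor and checks that OLS on the transformed data matches the direct GLS calculation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([1.0, 2.0])             # illustrative coefficients

# Assumed known covariance: unequal variances plus some correlation.
A = rng.normal(size=(n, n))
Omega = A @ A.T / n + np.diag(rng.uniform(0.5, 3.0, size=n))

L = np.linalg.cholesky(Omega)                # Omega = L @ L.T
y = X @ beta_true + L @ rng.normal(size=n)   # errors with covariance Omega

# Whitening with P = L^{-1}: solve L z = v instead of forming the inverse.
y_star = np.linalg.solve(L, y)
X_star = np.linalg.solve(L, X)

# Check that the transformed error covariance P Omega P' is the identity.
P = np.linalg.inv(L)
whitened_cov = P @ Omega @ P.T

# Route 1: plain OLS on the transformed data.
beta_transformed_ols, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)

# Route 2: weighted normal equations using Omega^{-1} directly.
Omega_inv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)

print(np.allclose(whitened_cov, np.eye(n)))
print(np.allclose(beta_transformed_ols, beta_gls))
```

Any other P with P'P = Ω⁻¹ would give the same coefficients; the Cholesky route is used here only because it is cheap and numerically stable.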

In matrix terms, the GLS estimator is β̂_GLS = (X'Ω⁻¹X)⁻¹X'Ω⁻¹y. This collapses to OLS when Ω = σ²I: the σ² factors cancel and you recover the standard formula (X'X)⁻¹X'y. GLS reduces to weighted least squares when Ω is diagonal (only variances differ across observations), and to a correlated-errors transformation when Ω has off-diagonal terms (serial correlation). For the serial correlation case with AR(1) errors, the Prais-Winsten and Cochrane-Orcutt procedures implement feasible GLS: they first estimate the autocorrelation parameter ρ and then apply the quasi-differencing transformation that removes it. Cochrane-Orcutt drops the first observation, while Prais-Winsten keeps it with a rescaling.
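For AR(1) errors the transformation has a simple closed form, sketched below. To keep the example short it assumes ρ is known; in practice ρ would be estimated from OLS residuals first. The helper `prais_winsten_transform` and the simulated data are hypothetical names and values introduced for illustration. The check at the end confirms that OLS on the quasi-differenced data agrees with the matrix formula using the full AR(1) covariance matrix.

```python
import numpy as np

def prais_winsten_transform(Z, rho):
    """Quasi-difference a vector or matrix, rescaling the first row."""
    Z = np.asarray(Z, dtype=float)
    out = np.empty_like(Z)
    out[0] = np.sqrt(1.0 - rho ** 2) * Z[0]   # first observation is kept
    out[1:] = Z[1:] - rho * Z[:-1]            # z_t - rho * z_{t-1}
    return out

rng = np.random.default_rng(2)
n, rho = 120, 0.6                             # rho assumed known here
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# Simulate stationary AR(1) errors: u_t = rho * u_{t-1} + e_t.
e = rng.normal(size=n)
u = np.empty(n)
u[0] = e[0] / np.sqrt(1.0 - rho ** 2)
for t in range(1, n):
    u[t] = rho * u[t - 1] + e[t]
y = X @ np.array([1.0, 2.0]) + u

# Route 1: OLS on the transformed data.
X_star = prais_winsten_transform(X, rho)
y_star = prais_winsten_transform(y, rho)
beta_pw, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)

# Route 2: GLS formula with the AR(1) covariance rho^|i-j| / (1 - rho^2).
idx = np.arange(n)
Omega = rho ** np.abs(idx[:, None] - idx[None, :]) / (1.0 - rho ** 2)
Omega_inv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)

print(np.allclose(beta_pw, beta_gls))
```

Dropping the first transformed row instead of rescaling it would give the Cochrane-Orcutt variant, which no longer matches the full GLS formula exactly but differs little when n is large.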

The practical complication is that Ω is almost never known in advance. You must estimate it from OLS residuals, giving Feasible GLS (FGLS). This two-step procedure is consistent under standard conditions and asymptotically efficient when the model for Ω is correctly specified, but it is no longer exactly BLUE in finite samples, because the first step introduces estimation error. FGLS is often contrasted with the alternative of using OLS with robust standard errors (Huber-White for heteroskedasticity, Newey-West for heteroskedasticity and serial correlation): robust standard errors leave the point estimates alone but correct the inference, while FGLS changes both estimates and standard errors. In large samples the two approaches often give similar point estimates, and FGLS can be more efficient when the covariance model is right. In small samples, or when the form of Ω is uncertain, OLS with robust standard errors is frequently preferred because it makes weaker assumptions.
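The two approaches can be compared side by side in a short sketch. It assumes a heteroskedastic design in which the error standard deviation is proportional to x, and it assumes the analyst models the log variance as linear in log(x); both are illustrative choices, not the only way to run FGLS. The robust standard errors are the basic HC0 (Huber-White) form, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 2.0])                 # illustrative coefficients
y = X @ beta_true + rng.normal(size=n) * x       # assumed: error sd proportional to x

# Step 1: OLS point estimates and residuals.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_ols

# Alternative A: keep the OLS estimates, fix the inference with HC0 errors.
XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (X * resid[:, None] ** 2)
se_ols_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

# Alternative B, step 2: estimate the variance function from the residuals.
# Assumed skedastic model: log(variance) is linear in log(x).
Z = np.column_stack([np.ones(n), np.log(x)])
gamma, *_ = np.linalg.lstsq(Z, np.log(resid ** 2), rcond=None)
var_hat = np.exp(Z @ gamma)                      # fitted variances, up to scale

# Alternative B, step 3: weighted least squares with the estimated weights.
w = 1.0 / np.sqrt(var_hat)
Xw, yw = X * w[:, None], y * w
beta_fgls, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
resid_w = yw - Xw @ beta_fgls
s2 = resid_w @ resid_w / (n - X.shape[1])        # absorbs the unknown scale
se_fgls = np.sqrt(s2 * np.diag(np.linalg.inv(Xw.T @ Xw)))

print("OLS slope, robust SE:", beta_ols[1], se_ols_robust[1])
print("FGLS slope, SE:      ", beta_fgls[1], se_fgls[1])
```

The log-residual regression recovers the variances only up to a constant factor, which does not affect the weighted estimates. The FGLS standard errors shown are valid only if the assumed variance model is correct, which is the trade-off the paragraph above describes.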

Prerequisite Chain

Counting to 10 → Counting to 20 → Understanding Zero → The Number Zero → Counting to Five → One-to-One Correspondence → Combining Small Groups Within 5 → Addition Within 10 → Addition Within 20 → Two-Digit Addition Without Regrouping → Two-Digit Addition with Regrouping → Addition Within 100 → Repeated Addition as Multiplication → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Two-Digit by One-Digit Division → Division with Remainders → Remainders and Quotients in Division → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Reading and Writing Decimals → Comparing and Ordering Decimals → Adding and Subtracting Decimals → Multiplying Decimals → Dividing Decimals → Dividing Fractions → Mixed Number Arithmetic → Order of Operations → Integer Order of Operations → Variable Expressions → Combining Like Terms → One-Step Equations → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Normal Distribution → Central Limit Theorem → Confidence Intervals for Means → Z-Tests and T-Tests for Means → One-Sample Z-Test for Means → One-Sample and Two-Sample T-Tests → Hypothesis Testing in Regression → Specification Error: RESET Test → White Test and Detection of Heteroskedasticity → Generalized Least Squares (GLS) for Non-Spherical Errors

Longest path: 83 steps · 425 total prerequisite topics
