Count Data Models: Poisson and Negative Binomial Regression

Core Idea

Poisson regression models count outcomes by linking the log of the conditional mean to covariates, under the restriction that the conditional variance equals the conditional mean. Negative binomial regression relaxes this restriction, allowing the variance to exceed the mean (overdispersion).

Explainer

Your prerequisite, maximum likelihood estimation, gives you the machinery to fit models in which the distribution of the outcome is specified explicitly. Now consider a type of outcome that fits the classical OLS assumptions poorly: counts. How many hospital visits did a patient have last year? How many patents did a firm file? These outcomes are non-negative integers, they cluster near zero, and their variance tends to grow with the mean. Applying OLS to such data can produce nonsensical predictions (including negative counts), and because the errors are heteroskedastic, the conventional standard errors are invalid.

Poisson regression is the natural starting point. It assumes the outcome Y follows a Poisson distribution with conditional mean λ = exp(Xβ). The log link (equivalently, the exponential mean function) ensures that predicted counts are always positive, a necessary constraint. You can read the coefficients as effects on log(λ): a one-unit increase in x changes log(λ) by β and therefore multiplies the expected count by exp(β). This is the count-data analog of the log-linear interpretation you may have seen in OLS with logged outcomes. Estimation proceeds by maximizing the Poisson log-likelihood, which you already know how to do.
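To make the estimation step concrete, here is a minimal sketch of Poisson maximum likelihood with one covariate, fitted by Newton-Raphson on simulated data. The helper names (`sample_poisson`, `fit_poisson`), the true coefficients, and the sample size are all assumptions made for illustration; in applied work you would normally call a statistics package rather than hand-code the optimizer.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def fit_poisson(x, y, iters=25):
    """Maximize the Poisson log-likelihood for lambda_i = exp(b0 + b1 * x_i) by Newton-Raphson."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only fit
    for _ in range(iters):
        lam = [math.exp(b0 + b1 * xi) for xi in x]
        # Score (gradient of the log-likelihood): sum of (y_i - lambda_i) * [1, x_i]
        g0 = sum(yi - li for yi, li in zip(y, lam))
        g1 = sum((yi - li) * xi for xi, yi, li in zip(x, y, lam))
        # Negative Hessian: sum of lambda_i * [1, x_i][1, x_i]'
        h00 = sum(lam)
        h01 = sum(li * xi for xi, li in zip(x, lam))
        h11 = sum(li * xi * xi for xi, li in zip(x, lam))
        det = h00 * h11 - h01 * h01
        # Newton step using the closed-form inverse of the 2x2 matrix
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


rng = random.Random(0)
true_b0, true_b1 = 0.5, 0.3  # assumed values for the simulation
x = [rng.uniform(-2.0, 2.0) for _ in range(5000)]
y = [sample_poisson(math.exp(true_b0 + true_b1 * xi), rng) for xi in x]

b0_hat, b1_hat = fit_poisson(x, y)
rate_ratio = math.exp(b1_hat)  # multiplicative effect of a one-unit increase in x
print(f"b0 = {b0_hat:.3f}, b1 = {b1_hat:.3f}, exp(b1) = {rate_ratio:.3f}")
```

The quantity `rate_ratio` is the exp(β) interpretation from the paragraph above: the factor by which the expected count changes when x rises by one unit.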

The Poisson model imposes one distinctive restriction: the conditional mean equals the conditional variance (equidispersion). In practice, count data are often overdispersed: the observed variance exceeds the mean. Think of emergency room visits: most people have zero or one visit per year, but a small, chronically ill population has very many, inflating the variance far above the mean. If you fit Poisson to overdispersed data, the coefficient estimates remain consistent as long as the conditional mean is correctly specified, but the default maximum likelihood standard errors are too small and the t-statistics are inflated, leading to false significance.
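A quick way to see overdispersion is to compare the sample variance with the sample mean. The sketch below simulates the emergency-room story with made-up numbers (95% of people averaging 0.5 visits per year and 5% averaging 8; both figures are assumptions, not estimates from any dataset). It is an unconditional check with no covariates, so treat it as a first look rather than a formal test.

```python
import math
import random
import statistics


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


rng = random.Random(1)
# Hypothetical population: a small chronically ill group with a much higher visit rate.
rates = [8.0 if rng.random() < 0.05 else 0.5 for _ in range(20000)]
visits = [sample_poisson(rate, rng) for rate in rates]

mean_visits = statistics.fmean(visits)
var_visits = statistics.variance(visits)
dispersion_ratio = var_visits / mean_visits  # close to 1 under equidispersion
print(f"mean = {mean_visits:.2f}, variance = {var_visits:.2f}, ratio = {dispersion_ratio:.2f}")
```

Each individual here is exactly Poisson given their own rate; the excess variance comes entirely from mixing two rates in one sample.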

Negative binomial regression relaxes equidispersion by introducing an extra dispersion parameter α. When α = 0, the negative binomial collapses to Poisson, so Poisson is a restricted special case that you can formally test. The NB model can be derived by treating each observation as drawn from a Poisson distribution whose own mean varies across individuals according to a gamma distribution. The intuition is that individuals have unobserved heterogeneity in their base rate of the count outcome, and this unobserved variation inflates the variance. In practice, testing whether the negative binomial significantly improves on Poisson is one of the first diagnostics to run on any count dataset.
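The gamma-mixture derivation can be written out in a few lines. The sketch below assumes the common NB2 parameterization, in which a multiplicative heterogeneity term ν_i (a symbol introduced here for illustration) has mean 1 and variance α; other parameterizations lead to a different variance function.

```latex
\begin{align*}
Y_i \mid \nu_i &\sim \mathrm{Poisson}(\lambda_i \nu_i), \qquad \lambda_i = \exp(X_i\beta), \\
\nu_i &\sim \mathrm{Gamma}\!\left(\text{shape } 1/\alpha,\ \text{scale } \alpha\right)
  \;\Rightarrow\; \mathrm{E}[\nu_i] = 1,\ \mathrm{Var}(\nu_i) = \alpha, \\
\mathrm{E}[Y_i] &= \mathrm{E}\big[\mathrm{E}[Y_i \mid \nu_i]\big] = \lambda_i, \\
\mathrm{Var}(Y_i) &= \mathrm{E}\big[\mathrm{Var}(Y_i \mid \nu_i)\big]
  + \mathrm{Var}\big(\mathrm{E}[Y_i \mid \nu_i]\big)
  = \lambda_i + \alpha \lambda_i^{2}.
\end{align*}
```

Setting α = 0 removes the second term and recovers the Poisson variance, which is the restriction being tested.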

A further extension worth knowing is the zero-inflated count model, which handles data with far more zeros than a Poisson or negative binomial distribution with the same mean would predict. This arises when zeros come from two distinct processes: for example, lifelong non-smokers who can never have a smoking-related diagnosis, versus smokers who happen to have zero incidents this period. Zero-inflated models combine a binary component (is the outcome structurally zero?) with a count component (if not a structural zero, how many events, possibly still zero?), letting each process have its own covariates.
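The two-process structure is easiest to see in the probability mass function. The following is a minimal sketch of a zero-inflated Poisson without covariates; the function names and the values `lam = 2.0` and `pi = 0.3` are illustrative assumptions. In a full model, `pi` would typically depend on covariates through a binary model such as a logit, and `lam` through exp(Xβ).

```python
import math


def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """P(Y = k) when Y is a structural zero with probability pi, else Poisson(lam)."""
    count_part = (1.0 - pi) * poisson_pmf(k, lam)
    # Zeros arise from both processes; positive counts only from the count process.
    return pi + count_part if k == 0 else count_part


lam, pi = 2.0, 0.3  # illustrative values, not estimates
p_zero_poisson = poisson_pmf(0, lam)
p_zero_zip = zip_pmf(0, lam, pi)
zip_total = sum(zip_pmf(k, lam, pi) for k in range(60))
zip_mean = sum(k * zip_pmf(k, lam, pi) for k in range(60))
print(f"P(0) Poisson = {p_zero_poisson:.3f}, P(0) ZIP = {p_zero_zip:.3f}, ZIP mean = {zip_mean:.3f}")
```

Note that the count component still contributes zeros of its own, which is what distinguishes a zero-inflated model from one where every zero is structural.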
