Count Data Regression: Poisson and Negative Binomial Models

Graduate · Depth 74 in the knowledge graph
Unlocks 2 downstream topics
count-data poisson negative-binomial overdispersion

Core Idea

Count outcomes such as protest events or arrests are often overdispersed, meaning more variable than the Poisson distribution allows. Negative binomial regression accommodates overdispersion, and zero-inflated models address excess zeros. Choosing among these models keeps standard errors honest and guards against spurious findings.

Explainer

You already know that ordinary linear regression assumes a continuous outcome with approximately normal, constant-variance errors. But many social science outcomes are counts: the number of protests in a country in a year, arrests per month, bills introduced in a legislative session, or casualties in a conflict. Counts are non-negative integers, and their distribution tends to be highly skewed, with many observations near zero and a long tail of large values. Fitting OLS to count data can produce nonsensical predictions (negative counts) and incorrect standard errors, because the variance of a count typically grows with its mean. The solution is a family of regression models built specifically for count data.
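
The negative-prediction problem is easy to reproduce. The sketch below fits a straight line by the closed-form OLS formulas to a small made-up dataset (the `x` and `y` values are hypothetical, chosen only for illustration) and evaluates the fitted line at the low end of the predictor.

```python
# Hypothetical toy data (not from the source): x is a predictor, y is an event count.
x = [0, 1, 2, 3, 4, 5]
y = [0, 0, 1, 1, 4, 9]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Closed-form OLS slope and intercept for a single predictor.
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
intercept = mean_y - slope * mean_x


def ols_predict(x_new):
    """Fitted value from the straight-line fit."""
    return intercept + slope * x_new


# A count cannot be negative, but nothing stops the fitted line from going below zero.
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"OLS prediction at x=0: {ols_predict(0):.3f}")
```

With data shaped like this (flat near zero, then rising quickly), the line has to start below zero to track the larger counts, which is the behavior a log-link count model avoids.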

The starting point is Poisson regression. The Poisson distribution, which you've studied, has one defining property: its mean equals its variance. Poisson regression models the log of the expected count as a linear function of the predictors, which guarantees positive predicted counts and gives the coefficients a natural interpretation: each coefficient is a log incidence rate ratio, and the exponentiated coefficient is a multiplicative effect on the expected count. If a coefficient is 0.5, then e^0.5 ≈ 1.65, meaning a one-unit increase in that predictor multiplies the expected count by about 1.65, holding the other predictors fixed.
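
To make the log link and the rate-ratio reading concrete, here is a minimal sketch that fits a one-predictor Poisson regression by Newton-Raphson using only the standard library. The data, the function name `fit_poisson`, and the fixed iteration count are assumptions made for illustration; in practice you would use a statistics package rather than hand-rolled code.

```python
import math

# Hypothetical toy data (not from the source): one predictor, one count outcome.
x = [0, 1, 2, 3, 4, 5]
y = [1, 1, 3, 4, 8, 12]


def fit_poisson(x, y, iterations=25):
    """Fit log E[y] = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix [[a, b], [b, c]] = X'WX with weights mu.
        a = sum(mu)
        b = sum(xi * mi for xi, mi in zip(x, mu))
        c = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = a * c - b * b
        # Newton step: multiply the score by the inverse of the 2x2 matrix.
        b0 += (c * g0 - b * g1) / det
        b1 += (a * g1 - b * g0) / det
    return b0, b1


b0, b1 = fit_poisson(x, y)
rate_ratio = math.exp(b1)  # multiplicative effect of a one-unit increase in x


def expected_count(x_new):
    """Predicted mean count; always positive because of the log link."""
    return math.exp(b0 + b1 * x_new)


print(f"b0={b0:.3f}, b1={b1:.3f}, rate ratio={rate_ratio:.3f}")
print(f"expected_count(3) / expected_count(2) = {expected_count(3) / expected_count(2):.3f}")
```

The ratio of predicted counts at any two adjacent values of `x` equals `exp(b1)`, which is what "multiplicative effect on the expected count" means.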

The critical limitation of Poisson is its mean-equals-variance constraint. Real count data are usually overdispersed: the variance exceeds the mean, often dramatically. This happens when outcomes are clustered (protests cluster in time and space), when unobserved heterogeneity exists across units, or when events follow a contagion process. If you force Poisson on overdispersed data, the variance is misspecified: standard errors are underestimated, test statistics are inflated, and you will declare spurious significance. Negative binomial regression relaxes the constraint by adding a dispersion parameter that captures extra-Poisson variation. Think of it as a Poisson model where each observation has its own underlying rate drawn from a gamma distribution; the resulting mixture is the negative binomial. In the common NB2 parameterization the variance is μ + αμ², so α = 0 recovers the Poisson and larger α means more overdispersion. In practice, the negative binomial is usually the preferred model when overdispersion tests flag a problem.
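
The gamma-mixture story can be checked by simulation. The sketch below draws plain Poisson counts and gamma-mixed Poisson counts with the same mean and compares their variance-to-mean ratios. The mean, the dispersion value, the sample size, and the helper names are all illustrative assumptions; the simple Poisson sampler is only suitable for small rates like the ones used here.

```python
import math
import random
import statistics


def draw_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def draw_negative_binomial(mean, alpha, rng):
    """Poisson draw whose rate is gamma distributed with the given mean
    and variance alpha * mean**2 (shape 1/alpha, scale alpha * mean)."""
    rate = rng.gammavariate(1.0 / alpha, alpha * mean)
    return draw_poisson(rate, rng)


def dispersion_ratio(draws):
    """Sample variance divided by sample mean; close to 1 for Poisson data."""
    return statistics.variance(draws) / statistics.mean(draws)


rng = random.Random(0)  # fixed seed so the run is reproducible
mean, alpha, n = 4.0, 1.0, 20000  # illustrative values, not from the source

poisson_draws = [draw_poisson(mean, rng) for _ in range(n)]
nb_draws = [draw_negative_binomial(mean, alpha, rng) for _ in range(n)]

print(f"Poisson variance/mean: {dispersion_ratio(poisson_draws):.2f}")
print(f"Gamma-Poisson variance/mean: {dispersion_ratio(nb_draws):.2f}")
print(f"Theoretical mixture variance/mean: {(mean + alpha * mean**2) / mean:.2f}")
```

A variance-to-mean ratio well above 1 in real data (after accounting for predictors) is the informal signal that formal overdispersion tests then assess.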

A further complication is excess zeros: the observed count is zero far more often than a Poisson or negative binomial model predicts. This arises when two distinct processes generate the data. One process determines whether a unit can experience events at all (a logistic-type "always-zero" mechanism), and a second process governs how many events occur among the units that can (a count mechanism, which can itself produce zeros). Zero-inflated Poisson and zero-inflated negative binomial models estimate both processes simultaneously. Model selection among these options typically uses the Vuong test, AIC/BIC comparison, and rootograms (graphical comparisons of observed vs. predicted count frequencies) to diagnose where the plain Poisson fails.
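
A small worked comparison shows how the two-process idea and an AIC comparison fit together. The sketch below uses a hypothetical frequency table with many zeros, fits a plain Poisson (whose maximum-likelihood rate is the sample mean) and an intercept-only zero-inflated Poisson via a simple EM loop, then compares AIC and the observed vs. expected frequencies that a rootogram would plot. The data, the starting values, and the iteration count are assumptions for illustration, not a recommended fitting routine.

```python
import math

# Hypothetical frequency table (count value -> number of units), not from the source.
freq = {0: 62, 1: 6, 2: 9, 3: 9, 4: 7, 5: 4, 6: 2, 7: 1}
n = sum(freq.values())
total = sum(k * c for k, c in freq.items())


def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: structural zero with probability pi, else Poisson(lam)."""
    return (pi if k == 0 else 0.0) + (1.0 - pi) * poisson_pmf(k, lam)


def log_lik(pmf):
    return sum(c * math.log(pmf(k)) for k, c in freq.items())


# Plain Poisson: the maximum-likelihood rate is the sample mean.
lam_pois = total / n

# ZIP by EM: split observed zeros into structural zeros and Poisson zeros.
pi_zip, lam_zip = 0.5, lam_pois  # arbitrary starting values
for _ in range(500):
    structural = freq[0] * pi_zip / zip_pmf(0, lam_zip, pi_zip)  # E-step
    pi_zip = structural / n                                      # M-step
    lam_zip = total / (n - structural)

ll_pois = log_lik(lambda k: poisson_pmf(k, lam_pois))
ll_zip = log_lik(lambda k: zip_pmf(k, lam_zip, pi_zip))
aic_pois = 2 * 1 - 2 * ll_pois  # one parameter: lambda
aic_zip = 2 * 2 - 2 * ll_zip    # two parameters: lambda and pi

print(f"Poisson AIC={aic_pois:.1f}, ZIP AIC={aic_zip:.1f}")
# Observed vs. expected frequencies, the raw material of a rootogram.
for k in sorted(freq):
    print(k, freq[k],
          round(n * poisson_pmf(k, lam_pois), 1),
          round(n * zip_pmf(k, lam_zip, pi_zip), 1))
```

The plain Poisson is forced to pick one rate that is too high for the zeros and too low for the positive counts; the zero-inflated model separates the two, and the lower AIC reflects that improvement net of the extra parameter.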
