A data scientist models customer support tickets per hour using Poisson(λ = 10). After collecting data, the sample mean is 10 but the sample variance is 35. What does this indicate?
A. The Poisson model may be inadequate: the data shows overdispersion
B. The model is fine; variance slightly exceeding the mean is normal sampling variation
C. A larger sample would resolve the discrepancy
D. The model is correct; variance should exceed the mean in real data
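Answer: A
A key property of the Poisson distribution is mean = variance = λ. When the observed variance substantially exceeds the mean (here 35 versus 10), the data are overdispersed. This often signals that events are not independent (e.g., one ticket triggers others) or that the rate is not constant (bursty load). A negative binomial model, which has an extra parameter for the variance, usually fits such data better. Option B is wrong: a variance 3.5× the mean is not "slightly" larger, and it indicates a systematic departure from Poisson assumptions rather than sampling noise. Option C is wrong because more data sharpens the estimates of the mean and variance but does not bring them closer together. Option D is wrong because a correct Poisson model requires variance = mean.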
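To make the numbers in this question concrete, here is a minimal sketch that computes the variance-to-mean ratio and matches a negative binomial to the reported moments. The function name `negative_binomial_from_moments` and the (r, p) parametrization are assumptions chosen for illustration; other texts and libraries parametrize the negative binomial differently.

```python
def negative_binomial_from_moments(mean, variance):
    """Method-of-moments estimates (r, p) for a negative binomial.

    Assumed parametrization: mean = r(1 - p)/p and variance = r(1 - p)/p**2.
    This only makes sense when variance > mean (overdispersion).
    """
    if variance <= mean:
        raise ValueError("negative binomial requires variance > mean")
    p = mean / variance                    # from variance = mean / p
    r = mean * mean / (variance - mean)    # from variance = mean + mean**2 / r
    return r, p


# Summary statistics taken from the question.
sample_mean, sample_variance = 10.0, 35.0

dispersion_ratio = sample_variance / sample_mean
r, p = negative_binomial_from_moments(sample_mean, sample_variance)

print(f"variance / mean = {dispersion_ratio:.1f}")   # 3.5; a Poisson model implies 1
print(f"moment-matched negative binomial: r = {r:.2f}, p = {p:.4f}")
```

Moment matching is only a quick starting point; a maximum-likelihood fit on the raw counts would normally be preferred when the data are available.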
Question 2 Multiple Choice
Which of the following situations best fits a Poisson model?
A. Number of emails arriving at a server per minute at constant, steady traffic
B. Number of heads in 100 coin flips
C. Number of students who pass an exam out of 30 enrolled
D. Number of goals scored by a team, given they score more often after the first goal
Answer: A
A Poisson model describes counts of events that occur independently, at a constant average rate, within a fixed interval. Option A fits: independent arrivals, constant rate, countable events per minute. Options B and C are binomial (a fixed number n of trials, each a success or failure). Option D violates independence: if the team scores more often after the first goal, goals cluster, and that clustering produces overdispersion.
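The effect of the clustering in option D can be illustrated with a small simulation. This is a toy model, not a claim about real football data: the per-minute scoring probabilities (0.02 and 0.06), the 90-minute match, the 5000 simulated matches, and the helper names `simulate_match` and `dispersion_ratio` are all assumptions made up for the sketch.

```python
import random
from statistics import mean, variance


def simulate_match(base_p, boosted_p, minutes=90, rng=random):
    """Count goals in one match. The per-minute scoring probability is
    base_p until the first goal and boosted_p afterwards."""
    goals = 0
    for _ in range(minutes):
        p = base_p if goals == 0 else boosted_p
        if rng.random() < p:
            goals += 1
    return goals


def dispersion_ratio(counts):
    """Sample variance divided by sample mean (about 1 for Poisson-like data)."""
    return variance(counts) / mean(counts)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Constant rate: scoring chance never changes (close to Poisson).
steady = [simulate_match(0.02, 0.02, rng=rng) for _ in range(5000)]
# Clustering: scoring chance triples after the first goal.
clustered = [simulate_match(0.02, 0.06, rng=rng) for _ in range(5000)]

print(f"steady    variance/mean: {dispersion_ratio(steady):.2f}")
print(f"clustered variance/mean: {dispersion_ratio(clustered):.2f}")
```

Under these assumptions the steady series should give a ratio close to 1, while the clustered series gives a ratio well above 1, which is the overdispersion the explanation refers to.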
Question 3 True / False
For a Poisson random variable with parameter λ = 4, the variance equals 2 (the square root of the mean).
T. True
F. False
Answer: False
For any Poisson distribution, variance = λ, not √λ. With λ = 4, the variance is 4. The standard deviation is √λ = 2, but variance and standard deviation are different things. The equal-mean-and-variance property (both = λ) is a diagnostic signature of the Poisson distribution.
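The distinction between variance and standard deviation can be checked numerically from the probability mass function. The sketch below assumes a truncation at k = 59, which is far enough into the tail for λ = 4 that the omitted probability is negligible; the helper name `poisson_pmf` is chosen for illustration.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


lam = 4
ks = range(60)  # truncation point is an assumption; the tail beyond it is negligible

mean_k = sum(k * poisson_pmf(k, lam) for k in ks)
var_k = sum((k - mean_k) ** 2 * poisson_pmf(k, lam) for k in ks)
sd_k = math.sqrt(var_k)

print(f"mean = {mean_k:.6f}")                 # equals lambda
print(f"variance = {var_k:.6f}")              # also equals lambda
print(f"standard deviation = {sd_k:.6f}")     # equals sqrt(lambda)
```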
Question 4 True / False
The Poisson distribution arises as a limit of the binomial when n is large and p is small, with λ = np held constant.
T. True
F. False
Answer: True
Divide a fixed interval into n tiny sub-intervals, each with probability p = λ/n of an event. The total count follows Binomial(n, p). As n → ∞ with np = λ fixed, this converges to Poisson(λ). This derivation reveals the three Poisson conditions: many independent opportunities, each with small probability, at a fixed average rate.
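The limit described above can be observed numerically. The following sketch compares Binomial(n, λ/n) with Poisson(λ) for growing n; the choice λ = 4, the comparison range k = 0..20, and the helper names are assumptions for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); math.comb returns 0 when k > n."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_gap(n, lam, k_max=20):
    """Largest absolute difference between the two pmfs over k = 0..k_max,
    with p = lam / n so that n * p stays fixed at lam."""
    p = lam / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(k_max + 1)
    )


lam = 4.0
for n in (10, 100, 1000, 10000):
    print(f"n = {n:>5}: max |binomial - poisson| = {max_gap(n, lam):.2e}")
```

The printed gap should shrink as n grows, which is the convergence the derivation describes.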
Question 5 Short Answer
What does it mean for count data to be 'overdispersed,' and why does overdispersion suggest the Poisson model is inappropriate?
Model answer: Overdispersion means the sample variance substantially exceeds the sample mean. The Poisson distribution requires variance = mean = λ, a consequence of events occurring independently at a constant rate, so that they do not cluster. Overdispersion usually signals that one of these assumptions fails: either events are dependent, with one event making another more likely (aftershocks, contagious disease cases), or the rate varies from one interval to the next. Using a Poisson model anyway underestimates variability, making confidence intervals and prediction intervals too narrow. A negative binomial model is typically used instead, because its extra parameter allows variance > mean.
The mean = variance property is not just a curiosity — it is a testable model assumption. Checking sample mean ≈ sample variance is a quick diagnostic for whether Poisson applies. Overdispersion (variance >> mean) is extremely common in real count data, which is why negative binomial models appear so frequently in practice.
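The quick diagnostic can be written in a few lines. The sketch below computes the dispersion index (sample variance over sample mean) and the classical index-of-dispersion statistic (n − 1)·s²/x̄, which is approximately chi-square with n − 1 degrees of freedom under a Poisson model. The z-score uses a normal approximation to that chi-square, which is crude for small n and is included only to avoid a dependency on a statistics library. The counts in `hourly_counts` are hypothetical, invented purely for illustration, and the function name `dispersion_check` is likewise an assumption.

```python
import math
from statistics import mean, variance


def dispersion_check(counts):
    """Return (dispersion index, dispersion statistic, rough z-score).

    index     = sample variance / sample mean   (about 1 under Poisson)
    statistic = (n - 1) * index                 (approx. chi-square, n - 1 df)
    z         = normal approximation to that chi-square (rough for small n)
    """
    n = len(counts)
    m = mean(counts)
    if n < 2 or m == 0:
        raise ValueError("need at least two counts and a nonzero mean")
    index = variance(counts) / m
    statistic = (n - 1) * index
    z = (statistic - (n - 1)) / math.sqrt(2 * (n - 1))
    return index, statistic, z


# Hypothetical hourly ticket counts (made up for this example; mean is 10).
hourly_counts = [3, 18, 7, 2, 21, 9, 4, 16, 12, 8]

index, statistic, z = dispersion_check(hourly_counts)
print(f"dispersion index = {index:.2f}")
print(f"dispersion statistic = {statistic:.1f} on {len(hourly_counts) - 1} df")
print(f"rough z-score = {z:.1f}")
```

An index far above 1, together with a large positive z-score, would suggest trying a negative binomial model; with real data, an exact chi-square p-value from a statistics library would be the more careful check.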