A researcher has panel data on 500 firms over 5 years and estimates how lagged profits predict current profits using the within (fixed effects) estimator. A colleague warns the estimates will be biased. Why?
A. The within estimator requires T → ∞ to be consistent with a lagged dependent variable; with T = 5, the demeaned lagged dependent variable is mechanically correlated with the demeaned error, producing bias that does not vanish as N grows
B. FE estimation cannot handle lagged dependent variables at all; the model is misspecified regardless of sample size
C. With 500 firms, the within estimator has too many fixed effects to estimate consistently
D. The bias vanishes as N → ∞, so 500 firms is sufficient to eliminate the problem
Answer: A
This is Nickell bias: when T is fixed and small, demeaning to remove αᵢ creates a mechanical correlation between the demeaned lagged dependent variable and the demeaned error. The firm-level mean of the lagged dependent variable depends on every error through period T−1, and the firm-level mean of the error contains those same shocks, so the two demeaned terms are correlated by construction. This correlation does not shrink as N increases; it is a fixed-T problem. For large N the bias is approximately −(1+α)/(T−1), so with T = 5 and α near 1 it is roughly −0.5, which is severe. Adding more firms (larger N) does not help; only a longer panel does.
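The fixed-T nature of the bias can be checked with a small Monte Carlo. The sketch below is illustrative only: the data-generating process (standard normal fixed effects and errors, α = 0.5, five usable observations per firm) and the helper names `simulate_panel` and `within_estimate` are assumptions made for this example, not part of the question.

```python
import numpy as np

def simulate_panel(n_firms, n_periods, alpha, rng, burn_in=50):
    """Simulate y_it = alpha * y_i,t-1 + a_i + e_it; return an (N, T+1) array of levels."""
    effects = rng.normal(size=n_firms)           # firm fixed effects a_i
    y = effects / (1.0 - alpha)                  # start at each firm's long-run mean
    history = []
    for _ in range(burn_in + n_periods + 1):
        y = alpha * y + effects + rng.normal(size=n_firms)
        history.append(y)
    return np.column_stack(history[burn_in:])    # drop the burn-in periods

def within_estimate(panel):
    """Within (fixed effects) estimate of alpha from an (N, T+1) array of levels."""
    y_now, y_lag = panel[:, 1:], panel[:, :-1]
    y_now_dm = y_now - y_now.mean(axis=1, keepdims=True)   # demean by firm
    y_lag_dm = y_lag - y_lag.mean(axis=1, keepdims=True)
    return (y_lag_dm * y_now_dm).sum() / (y_lag_dm ** 2).sum()

rng = np.random.default_rng(0)
true_alpha, n_periods, n_reps = 0.5, 5, 200

# Average the within estimate over replications for several panel widths N.
estimates = {
    n_firms: np.mean([
        within_estimate(simulate_panel(n_firms, n_periods, true_alpha, rng))
        for _ in range(n_reps)
    ])
    for n_firms in (100, 500, 2000)
}

for n_firms, estimate in estimates.items():
    print(f"N={n_firms:5d}  mean within estimate = {estimate:.3f}")
print(f"true alpha = {true_alpha}, "
      f"approximate bias = {-(1 + true_alpha) / (n_periods - 1):.3f}")
```

If the reasoning above holds, the average estimate should sit well below the true α and stay put as N increases from 100 to 2000. The −(1+α)/(T−1) formula is only an approximation, so the simulated gap need not match it exactly.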
Question 2 Multiple Choice
A researcher applies Arellano-Bond estimation, and the AR(2) test on the first-differenced residuals strongly rejects the null of no second-order serial correlation. What does this imply?
A. Nothing; the AR(2) test is a goodness-of-fit diagnostic, not a validity test
B. The instruments are invalid: AR(2) in differenced residuals implies AR(1) in the original errors, meaning Yᵢₜ₋₂ is correlated with Δεᵢₜ and cannot serve as a valid instrument
C. The model needs more lags as instruments to absorb the additional serial correlation
D. The estimator should switch to pooled OLS because the panel structure is inappropriate
Answer: B
Instrument validity in Arellano-Bond rests on the assumption that the original errors εᵢₜ are not serially correlated. If εᵢₜ follows an AR(1) process, then εᵢₜ₋₁ is correlated with εᵢₜ₋₂, and because Yᵢₜ₋₂ depends on εᵢₜ₋₂, it is correlated with Δεᵢₜ = εᵢₜ − εᵢₜ₋₁. Second-order correlation in the first-differenced residuals (Δεᵢₜ correlated with Δεᵢₜ₋₂) is the diagnostic fingerprint of serial correlation in the original errors, so rejecting the null of no AR(2) invalidates the standard instrument set. First-order correlation in the differenced residuals, by contrast, is expected by construction and is not a problem.
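A short worked calculation makes the failure concrete. It assumes, purely for illustration, a stationary AR(1) error with coefficient ρ and innovation uᵢₜ (symbols not used in the question), no regressors X, and a fixed effect that is uncorrelated with the errors.

```latex
\begin{align*}
\varepsilon_{it} &= \rho\,\varepsilon_{i,t-1} + u_{it}, \qquad |\rho| < 1, \quad
  \operatorname{Var}(\varepsilon_{it}) = \sigma_\varepsilon^2 \\[4pt]
\operatorname{Cov}(\varepsilon_{i,t-2},\, \Delta\varepsilon_{it})
  &= \operatorname{Cov}(\varepsilon_{i,t-2},\, \varepsilon_{it})
   - \operatorname{Cov}(\varepsilon_{i,t-2},\, \varepsilon_{i,t-1}) \\
  &= \rho^{2}\sigma_\varepsilon^2 - \rho\,\sigma_\varepsilon^2
   = -\rho(1-\rho)\,\sigma_\varepsilon^2 \\[4pt]
\operatorname{Cov}(Y_{i,t-2},\, \Delta\varepsilon_{it})
  &= \sum_{j \ge 0} \alpha^{j}\,
     \operatorname{Cov}(\varepsilon_{i,t-2-j},\, \Delta\varepsilon_{it})
   = -\frac{\rho(1-\rho)\,\sigma_\varepsilon^2}{1-\alpha\rho}
\end{align*}
```

The covariance is zero only when ρ = 0, which is exactly the no-serial-correlation condition the instruments rely on.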
Question 3 True / False
The Arellano-Bond estimator addresses Nickell bias by applying fixed effects (within) estimation after first-differencing to cleanly remove the individual fixed effects αᵢ.
T. True
F. False
Answer: False
This conflates two distinct estimators. The within estimator (fixed effects) is precisely the estimator that *creates* Nickell bias when a lagged dependent variable is present — it demeans the data, but the demeaned lagged DV remains correlated with the demeaned error. Arellano-Bond uses *first-differencing* to remove αᵢ and then applies GMM, using lagged levels of Y as instruments for the endogenous differenced lagged DV. The key is the instrumental variables step — first-differencing alone is necessary but not sufficient.
Question 4 True / False
In an Arellano-Bond model, using more lag levels as instruments is generally better because it incorporates more information from the data.
T. True
F. False
Answer: False
The instrument count grows quadratically in T, creating instrument proliferation. A large instrument set relative to the number of groups N causes two problems: the instruments overfit the endogenous variable, which pulls the estimate back toward the biased non-instrumented one, and the Hansen test of overidentifying restrictions is weakened, producing implausibly high p-values that fail to flag invalid instruments. Practitioners commonly restrict the lag depth (for example, to the first two or three available lags) or collapse the instrument matrix to keep the count manageable. A common rule of thumb is that the instrument count should not exceed N.
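The quadratic growth is easy to tabulate. The sketch below counts only the lagged-level instruments for the differenced lagged dependent variable, assuming levels are observed in periods 1..T, so the differenced equation for period t (t = 3..T) can use Y from periods 1..t−2. The function name `ab_instrument_count` and its `max_lags` argument are hypothetical, introduced for this example; instruments for other regressors are ignored.

```python
def ab_instrument_count(n_periods, max_lags=None):
    """Count lagged-level instruments for the differenced lagged dependent variable.

    Assumes levels are observed in periods 1..n_periods, so the differenced
    equation for period t (t = 3..n_periods) can use y_1, ..., y_{t-2}.
    max_lags caps how many of those lags are used per period (None = all).
    """
    total = 0
    for t in range(3, n_periods + 1):
        available = t - 2                      # y_{t-2}, y_{t-3}, ..., y_1
        total += available if max_lags is None else min(available, max_lags)
    return total

for n_periods in (5, 10, 20):
    full = ab_instrument_count(n_periods)
    capped = ab_instrument_count(n_periods, max_lags=2)
    print(f"T={n_periods:2d}  all lags: {full:3d}  capped at 2 lags: {capped:3d}")
```

With all lags the count is (T−1)(T−2)/2, while capping the lag depth makes it grow only linearly in T, which is why the cap keeps the count below N in moderately long panels.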
Question 5 Short Answer
Why does first-differencing eliminate the Nickell bias problem, and what new endogeneity problem does it create that requires instrumental variables?
Model answer: First-differencing eliminates αᵢ because the fixed effect appears in both Yᵢₜ and Yᵢₜ₋₁; subtracting gives ΔYᵢₜ = αΔYᵢₜ₋₁ + ΔX'ᵢₜβ + Δεᵢₜ with no αᵢ. But this creates a new problem: ΔYᵢₜ₋₁ = Yᵢₜ₋₁ − Yᵢₜ₋₂ is correlated with Δεᵢₜ = εᵢₜ − εᵢₜ₋₁ because both share εᵢₜ₋₁ (with opposite signs). So ΔYᵢₜ₋₁ is endogenous in the differenced equation. The solution is to instrument it: Yᵢₜ₋₂ (and earlier lags) are correlated with ΔYᵢₜ₋₁ but uncorrelated with Δεᵢₜ, as long as the original errors are not serially correlated — making them valid internal instruments available within the dataset.
The elegance of Arellano-Bond is that it solves both problems using only the data already available: differencing removes the fixed effect, and the lags that were already collected provide the instruments. No external instruments are required. The tradeoff is that the approach is designed for large-N, small-T panels, and instrument validity depends critically on the absence of serial correlation in the original errors.
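The difference-then-instrument logic can be illustrated with the simpler Anderson-Hsiao estimator, which uses Yᵢₜ₋₂ as a single instrument rather than the full Arellano-Bond GMM instrument set. This is a minimal sketch under assumed conditions (no regressors X, serially uncorrelated standard normal errors, α = 0.5, N = 5000), and the helper names are hypothetical.

```python
import numpy as np

def simulate_panel(n_firms, n_periods, alpha, rng, burn_in=50):
    """Simulate y_it = alpha * y_i,t-1 + a_i + e_it; return an (N, T) array of levels."""
    effects = rng.normal(size=n_firms)           # firm fixed effects a_i
    y = effects / (1.0 - alpha)
    history = []
    for _ in range(burn_in + n_periods):
        y = alpha * y + effects + rng.normal(size=n_firms)
        history.append(y)
    return np.column_stack(history[burn_in:])

def first_difference_ols(panel):
    """OLS of Δy_t on Δy_{t-1}: removes a_i but ignores the endogeneity."""
    d_y = panel[:, 2:] - panel[:, 1:-1]          # Δy_t
    d_y_lag = panel[:, 1:-1] - panel[:, :-2]     # Δy_{t-1}
    return (d_y_lag * d_y).sum() / (d_y_lag ** 2).sum()

def anderson_hsiao(panel):
    """Simple IV estimate: instrument Δy_{t-1} with the level y_{t-2}."""
    d_y = panel[:, 2:] - panel[:, 1:-1]          # Δy_t
    d_y_lag = panel[:, 1:-1] - panel[:, :-2]     # Δy_{t-1}
    instrument = panel[:, :-2]                   # y_{t-2}
    return (instrument * d_y).sum() / (instrument * d_y_lag).sum()

rng = np.random.default_rng(1)
true_alpha = 0.5
panel = simulate_panel(n_firms=5000, n_periods=6, alpha=true_alpha, rng=rng)

ols_fd = first_difference_ols(panel)
iv_fd = anderson_hsiao(panel)
print(f"true alpha           = {true_alpha}")
print(f"first-difference OLS = {ols_fd:.3f}")
print(f"Anderson-Hsiao IV    = {iv_fd:.3f}")
```

Under these assumptions, OLS on the differenced equation should be badly biased downward because ΔYᵢₜ₋₁ and Δεᵢₜ share εᵢₜ₋₁, while the IV estimate should land near the true α. Arellano-Bond extends the same idea by using all available lags in a GMM framework, which is more efficient but brings the instrument-count concerns discussed earlier.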