Questions: Ability Parameter Estimation and Theta Estimation Methods
5 questions to test your understanding
Question 1 Multiple Choice
A student answers every item correctly on a 20-item adaptive test. Which of the following best describes what happens when MLE is used to estimate their ability?
A. MLE produces the highest possible theta estimate, since a perfect score unambiguously indicates maximum ability
B. MLE is undefined, because the likelihood keeps increasing as theta increases and never attains a maximum at any finite theta
C. MLE produces a theta of +3, which is the conventional upper bound for ability estimates
D. MLE and EAP produce the same estimate for perfect scores
MLE finds the theta that maximizes the likelihood of the observed response pattern. For a perfect score, every item was answered correctly, and the probability of each correct response increases monotonically with theta. The joint likelihood therefore never attains a maximum: it keeps rising toward its upper limit of 1 as theta → +∞, so there is no finite MLE. This is why operational testing systems use EAP (which imposes a prior and returns a finite value) or WLE (which corrects bias without a prior) for extreme scores.
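The monotone likelihood described above can be checked numerically. The following is a minimal sketch, assuming a 2PL model and a made-up 20-item bank (discrimination 1.0, difficulties evenly spaced from -2 to 2); the function names and item parameters are illustrative and do not come from any particular testing system.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern at a given theta."""
    total = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

# Hypothetical item bank: 20 items, a = 1.0, difficulties spread over [-2, 2].
items = [(1.0, -2.0 + 4.0 * i / 19) for i in range(20)]
perfect = [1] * 20

thetas = [-4.0 + 0.5 * k for k in range(25)]  # -4.0, -3.5, ..., 8.0
ll_perfect = [log_likelihood(t, perfect, items) for t in thetas]

# For a perfect score the log-likelihood rises at every grid step and stays
# below 0 (likelihood below 1), so no grid point is an interior maximum.
always_rising = all(lo < hi for lo, hi in zip(ll_perfect, ll_perfect[1:]))
print(always_rising, max(ll_perfect) < 0.0)
```

Extending the grid further to the right only moves the largest value to the new right edge, which is the numerical symptom of a missing finite maximum.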
Question 2 Multiple Choice
Two examinees take the same test. Examinee A has ability near the mean (θ ≈ 0); Examinee B has extreme high ability (θ ≈ +3). Whose theta estimate has a smaller standard error, and why?
A. Examinee B's, because extreme ability means all items are easy, removing ambiguity
B. Both have the same standard error, since they took the same test; this is what the test's single reliability coefficient captures
C. Examinee A's, because more items are well-targeted near θ ≈ 0, providing more information at that ability level
D. It depends entirely on how many items the examinee got correct, not on where they fall on the scale
In IRT, measurement precision is theta-dependent. Items provide the most information near their difficulty parameter, so items clustered around the mean difficulty provide high information for middle-ability examinees and low information for extreme-ability examinees. This is a fundamental departure from classical test theory, which summarizes precision with a single reliability coefficient applied uniformly across the score range. The item information function (the next topic) formalizes this relationship.
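The comparison between the two examinees can be sketched with the standard 2PL information formula, I(θ) = a² P(θ)(1 − P(θ)) per item, and the asymptotic standard error SE(θ) = 1/√(test information). The item bank below (20 items, discrimination 1.0, difficulties between -1 and 1) is a hypothetical example chosen to mimic a test targeted at the middle of the scale.

```python
import math

def item_information(theta, a, b):
    """2PL item information: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def standard_error(theta, items):
    """Asymptotic standard error of the theta estimate: 1 / sqrt(test information)."""
    test_info = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(test_info)

# Hypothetical test targeted at average ability: difficulties in [-1, 1].
items = [(1.0, -1.0 + 2.0 * i / 19) for i in range(20)]

se_examinee_a = standard_error(0.0, items)  # ability near the mean
se_examinee_b = standard_error(3.0, items)  # extreme high ability

print(f"SE at theta = 0: {se_examinee_a:.3f}")
print(f"SE at theta = 3: {se_examinee_b:.3f}")
```

Under these assumed parameters the standard error at θ = 3 is roughly twice the standard error at θ = 0, because every item sits far below Examinee B's ability and contributes little information there.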
Question 3 True / False
In IRT ability estimation, measurement precision (standard error of the theta estimate) is the same for most examinees who take the same test, just as classical test theory's single reliability coefficient applies uniformly across the score range.
T. True
F. False
Answer: False
This is exactly what IRT improves upon. In IRT, the standard error of estimation varies across the ability scale — it is smallest where item information is concentrated (typically near the mean of item difficulties) and largest at the extremes where few items are well-targeted. Two people with very different theta values taking the same test are measured with different precision. Classical test theory's single reliability coefficient obscures this by averaging across all ability levels, which is one reason IRT is preferred for adaptive testing where different examinees see different item sets.
Question 4 True / False
EAP (Expected A Posteriori) estimation produces biased theta estimates for examinees with truly extreme abilities, pulling their estimates toward the center of the distribution.
T. True
F. False
Answer: True
EAP multiplies the likelihood by a prior distribution (typically a standard normal) and takes the mean of the resulting posterior. For most examinees, the prior adds stability. But for truly extreme examinees (those near ±3 or beyond) the prior pulls the estimate toward the center even when the data clearly point to the extreme. This shrinkage bias is the price of the stability EAP provides. Researchers working with high-ability or low-ability subgroups should be aware that EAP systematically shrinks estimates at both ends of the scale: too low for very high abilities and too high for very low abilities.
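A small numerical sketch can make the shrinkage concrete. It assumes a 2PL model, a standard normal prior, simple grid quadrature on [-6, 6], and the same kind of made-up 20-item bank (discrimination 1.0, difficulties from -2 to 2); the grid-search MLE is included only for comparison, and all names are illustrative.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def likelihood(theta, responses, items):
    """Likelihood of a 0/1 response pattern at a given theta."""
    value = 1.0
    for u, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        value *= p if u == 1 else 1.0 - p
    return value

def eap(responses, items, grid):
    """Posterior mean of theta under a standard normal prior (grid quadrature)."""
    weights = [likelihood(t, responses, items) * math.exp(-0.5 * t * t) for t in grid]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

def grid_mle(responses, items, grid):
    """Grid point with the largest likelihood (crude stand-in for an MLE routine)."""
    return max(grid, key=lambda t: likelihood(t, responses, items))

# Hypothetical item bank and quadrature grid.
items = [(1.0, -2.0 + 4.0 * i / 19) for i in range(20)]
grid = [-6.0 + 0.01 * k for k in range(1201)]

perfect = [1] * 20
high = [1] * 18 + [0] * 2  # misses only the two hardest items

eap_perfect = eap(perfect, items, grid)  # finite even though the MLE is not
eap_high = eap(high, items, grid)
mle_high = grid_mle(high, items, grid)

print(f"18/20 correct: MLE ~ {mle_high:.2f}, EAP ~ {eap_high:.2f}")
print(f"20/20 correct: EAP ~ {eap_perfect:.2f}")
```

For the 18/20 pattern the EAP estimate lands between 0 and the MLE, which is the shrinkage toward the prior mean; for the perfect score the EAP is finite but would move if a different prior were assumed.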
Question 5 Short Answer
Explain why MLE breaks down at perfect and zero scores, and describe one approach that handles this limitation.
Model answer: MLE finds the theta maximizing the likelihood of the observed responses. For a perfect score, every item was answered correctly, and since the probability of a correct response increases with theta for all items, the joint likelihood increases monotonically with theta: there is no finite maximum, so MLE is undefined. For a zero score, the likelihood decreases monotonically with theta, so it keeps growing as theta → −∞ and again has no finite maximum. EAP handles this by introducing a population prior (usually a standard normal): because the prior assigns rapidly decreasing density to extreme theta values, the posterior is a proper distribution with a finite mean even for extreme response patterns. WLE handles it by correcting the first-order bias in MLE, which also stabilizes boundary behavior, without importing distributional assumptions.
The core issue is that the likelihood becomes monotone for perfect and zero scores: it never 'turns over' to create a peak. Both EAP (via Bayesian shrinkage) and WLE (via bias correction) change the estimation criterion in ways that guarantee finite estimates, at the cost of either distributional assumptions (EAP) or a small residual bias (WLE). Neither is perfect; the choice depends on whether the application tolerates shrinkage toward the mean.
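To illustrate how a bias-corrected criterion yields finite estimates at the boundaries, here is a minimal sketch of Warm's weighted likelihood estimator, assuming a 2PL model. The estimating equation adds the term J(θ)/(2I(θ)) to the ordinary likelihood score, where I is the test information and, for the 2PL, J equals the derivative of I. The item bank, the bisection bounds of ±10, and the function names are assumptions made for this example only.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def wle_score(theta, responses, items):
    """Warm's weighted-likelihood estimating function for the 2PL model."""
    score = info = j_term = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        pq = p * (1.0 - p)
        score += a * (u - p)                     # ordinary likelihood score
        info += a * a * pq                       # test information I(theta)
        j_term += a ** 3 * pq * (1.0 - 2.0 * p)  # J(theta), equal to I'(theta) for the 2PL
    return score + j_term / (2.0 * info)

def wle(responses, items, lo=-10.0, hi=10.0, iterations=100):
    """Bisection root search; assumes the function is positive at lo and negative at hi."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if wle_score(mid, responses, items) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical item bank: 20 items, a = 1.0, difficulties spread over [-2, 2].
items = [(1.0, -2.0 + 4.0 * i / 19) for i in range(20)]

wle_perfect = wle([1] * 20, items)
wle_zero = wle([0] * 20, items)

print(f"WLE for a perfect score: {wle_perfect:.2f}")
print(f"WLE for a zero score:    {wle_zero:.2f}")
```

With a perfect score the ordinary score term stays positive for every theta, but the correction term tends to about -a/2 as theta grows, so the sum crosses zero at a finite value. Because the assumed difficulties are symmetric around 0, the two estimates come out as mirror images of each other.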