Two items have identical difficulty (b = 0). Item A has discrimination a = 2.0; Item B has discrimination a = 0.3. A test designer needs to measure ability near θ = 0 as precisely as possible. Which item should they choose, and why?
A. Item B, because its shallow ICC provides useful information across a wider range of ability levels
B. Item A, because its steep ICC sharply differentiates examinees near θ = 0, and item information scales with a²
C. Neither; when b is the same, both items contribute equally to precision
D. Item B, because high-discrimination items only work well for examinees far from the difficulty value
Answer: B
Item information is I(θ) = a² × P(θ)(1 − P(θ)), which peaks at θ = b, where P(θ) = 0.5 and P(θ)(1 − P(θ)) = 0.25. At θ = 0 (where b = 0 for both items), Item A provides (2.0)² × 0.25 = 1.0 units of information, while Item B provides (0.3)² × 0.25 ≈ 0.0225, about 44 times less. A high-discrimination item separates examinees sharply: a small increase in ability produces a large increase in probability of success. A low-discrimination item (flat ICC) gives almost the same probability regardless of ability, providing little diagnostic signal.
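The comparison above can be checked numerically. The following is a minimal sketch, assuming the logistic 2PL form with no extra scaling constant; the function names `p_2pl` and `item_information` are illustrative and not taken from any particular library.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a correct response under the logistic 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Item information a^2 * P(theta) * (1 - P(theta))."""
    p = p_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

# Both items have b = 0; evaluate at the target ability theta = 0.
info_a = item_information(0.0, a=2.0, b=0.0)  # Item A, high discrimination
info_b = item_information(0.0, a=0.3, b=0.0)  # Item B, low discrimination
ratio = info_a / info_b

print(f"Item A information: {info_a:.4f}")
print(f"Item B information: {info_b:.4f}")
print(f"Ratio A/B:          {ratio:.1f}")
```

Because both items sit at their peak (P = 0.5), the ratio reduces to (2.0/0.3)², which is roughly 44.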
Question 2 Multiple Choice
An item has discrimination a = 0.2 (very low) and difficulty b = 1.0. An examinee at θ = 3.0 (well above the difficulty) and another at θ = −1.0 (well below) both attempt this item. What would the 2PL model predict about their probabilities of success?
A. The θ = 3.0 examinee has probability near 1.0; the θ = −1.0 examinee has probability near 0.0, so the item sharply differentiates them
B. Both examinees have similar probabilities of success, differing by less than 0.2, because the ICC is nearly flat with a = 0.2
C. The item gives a 50% probability to everyone since it cannot discriminate
D. Only examinees at θ = b = 1.0 can be measured; the item provides no information elsewhere
Answer: B
With a = 0.2, the ICC is nearly flat. P(θ = 3) = 1/(1 + exp(−0.2 × (3 − 1))) = 1/(1 + exp(−0.4)) ≈ 0.60. P(θ = −1) = 1/(1 + exp(−0.2 × (−1 − 1))) = 1/(1 + exp(0.4)) ≈ 0.40. The difference is about 0.197, just under 20 percentage points across a 4-unit ability span, so this item barely distinguishes high from low ability. This is what low discrimination means: the ICC slope is shallow, and the item provides very little information about who is better or worse. The misconception is assuming that 'above difficulty' always means 'near-certain success.'
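The arithmetic above can be reproduced with a short script. This is a sketch under the same logistic 2PL assumption; `p_2pl` is an illustrative helper name, and the a = 2.0 comparison item is a hypothetical contrast that does not appear in the question.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a correct response under the logistic 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

b = 1.0

# The item from the question: a = 0.2.
p_high = p_2pl(3.0, 0.2, b)
p_low = p_2pl(-1.0, 0.2, b)
gap = p_high - p_low

# Hypothetical contrast: same difficulty, but a = 2.0.
gap_steep = p_2pl(3.0, 2.0, b) - p_2pl(-1.0, 2.0, b)

print(f"a = 0.2: P(3) = {p_high:.3f}, P(-1) = {p_low:.3f}, gap = {gap:.3f}")
print(f"a = 2.0: gap = {gap_steep:.3f}")
```

The gap for a = 0.2 stays below 0.2, while the hypothetical a = 2.0 item separates the same two examinees almost completely.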
Question 3 True / False
An item with low discrimination (a ≈ 0.2) provides nearly the same probability of correct response for examinees across a wide range of ability levels.
T. True
F. False
Answer: True
The discrimination parameter controls the slope of the ICC. When a is small, the logistic curve is nearly horizontal — even large differences in ability produce small differences in probability of correct response. This is precisely what makes low-discrimination items poor for measuring ability: they cannot tell apart examinees who are very different in underlying ability. The 2PL model captures this variation in slope that the Rasch model (where all items share the same slope) ignores.
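One way to see how a controls the slope is to estimate the ICC's steepness at θ = b numerically. The sketch below assumes the logistic 2PL form, for which the slope at θ = b is a/4; `icc_slope` is an illustrative helper using a central difference, and the three a values are chosen only for illustration.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a correct response under the logistic 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def icc_slope(theta, a, b, h=1e-5):
    """Central-difference estimate of dP/dtheta at theta."""
    return (p_2pl(theta + h, a, b) - p_2pl(theta - h, a, b)) / (2.0 * h)

# Slope at theta = b for a few illustrative discrimination values.
slopes = {a: icc_slope(0.0, a, b=0.0) for a in (0.2, 1.0, 2.0)}

for a, slope in slopes.items():
    print(f"a = {a}: slope at theta = b is about {slope:.4f} (a/4 = {a / 4:.4f})")
```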
Question 4 True / False
When all discrimination parameters in a 2PL model are constrained to be equal (a = 1 for all items), the model is equivalent to the Rasch model.
T. True
F. False
Answer: True
The Rasch model's ICC is P(X=1|θ) = 1/(1 + exp(−(θ − b))), which matches the 2PL formula P(X=1|θ) = 1/(1 + exp(−a(θ − b))) when a = 1 for all items. The Rasch model is therefore a special case of the 2PL. This is why the Rasch model's assumption of equal discrimination is often checked empirically: if items actually vary in discrimination, constraining a = 1 misfits the data and can bias ability estimates.
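The equivalence can be confirmed by evaluating both formulas on a grid. This is a minimal sketch; the function names and the particular θ and b values are illustrative assumptions, not part of the question.

```python
import math

def p_2pl(theta, a, b):
    """2PL item characteristic curve."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def p_rasch(theta, b):
    """Rasch item characteristic curve (no discrimination parameter)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

thetas = [-3.0, -1.0, 0.0, 0.5, 2.0]
difficulties = [-1.0, 0.0, 1.5]

# Largest disagreement between 2PL with a = 1 and Rasch over the grid.
max_diff = max(
    abs(p_2pl(t, 1.0, b) - p_rasch(t, b))
    for t in thetas
    for b in difficulties
)

# With a different from 1 the two curves no longer coincide.
diff_a2 = abs(p_2pl(2.0, 2.0, 0.0) - p_rasch(2.0, 0.0))

print(f"max |2PL(a=1) - Rasch| = {max_diff:.2e}")
print(f"|2PL(a=2) - Rasch| at theta = 2, b = 0: {diff_a2:.3f}")
```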
Question 5 Short Answer
Why does item discrimination matter more than item difficulty when evaluating an item's contribution to test precision?
Model answer: Item information — the contribution to ability estimation precision — scales with a², not linearly. Doubling discrimination quadruples information; halving it reduces information to one-quarter. Difficulty (b) only shifts where on the ability scale information is concentrated; it does not change how much information the item provides. A perfectly placed item (b matches the target population) is nearly useless if discrimination is very low, while a high-discrimination item provides concentrated, reliable measurement near its difficulty value.
The item information function is I(θ) = a² × P(θ)(1 − P(θ)). The P(θ)(1 − P(θ)) term peaks at 0.25 when θ = b, so the peak information is a² × 0.25. An item with a = 2 therefore contributes 4 times as much information at its peak as an item with a = 1 (1.0 versus 0.25). This is why test designers prioritize items with high discrimination: near their difficulty value, they provide the most precise ability estimates per item administered.
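A quick grid search illustrates both claims: the peak sits at θ = b, and its height scales with a². The sketch below assumes the logistic 2PL form; `peak_on_grid`, the grid bounds, and the difficulty b = 0.5 are hypothetical choices made for illustration.

```python
import math

def item_information(theta, a, b):
    """Item information a^2 * P(theta) * (1 - P(theta)) under the 2PL."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a ** 2 * p * (1.0 - p)

def peak_on_grid(a, b, lo=-4.0, hi=4.0, steps=801):
    """Return (theta, information) at the grid point with maximum information."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    theta_star = max(grid, key=lambda t: item_information(t, a, b))
    return theta_star, item_information(theta_star, a, b)

# Two items with the same difficulty and different discrimination.
theta_1, peak_1 = peak_on_grid(a=1.0, b=0.5)
theta_2, peak_2 = peak_on_grid(a=2.0, b=0.5)

print(f"a = 1: peak at theta = {theta_1:.2f}, information = {peak_1:.3f}")
print(f"a = 2: peak at theta = {theta_2:.2f}, information = {peak_2:.3f}")
print(f"peak ratio = {peak_2 / peak_1:.1f}")
```

Changing b in this sketch moves the location of the peak but leaves its height unchanged, which is the point the model answer makes about difficulty.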