Questions: Item Response Theory: Assumptions and Fundamentals
5 questions to test your understanding
Question 1 Multiple Choice
Two test items share a reading passage about climate change. A researcher finds they are highly correlated in raw data. Does this necessarily violate local independence?
A. No. Local independence only requires independence in raw data, and high correlation is acceptable.
B. No. Local independence allows raw correlations as long as the correlation is fully explained by the latent ability θ; but if a passage-specific factor drives additional correlation beyond θ, local independence IS violated.
C. Yes. Local independence requires all items to be uncorrelated in raw data, and high correlations always violate it.
D. Yes. Items sharing content always violate local independence regardless of θ.
Answer: B
Local independence is a *conditional* statement: given θ, item responses must be independent. Raw (marginal) correlations between items are expected and acceptable; they arise because people with higher θ tend to get multiple items right. The violation occurs when knowing one item's response gives extra information about another *above and beyond* what θ already tells you. A shared reading passage can create a passage-specific skill component that adds correlation beyond θ, and when it does, local independence is violated. Option C is the most common misconception: it confuses unconditional correlation with conditional dependence.
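The distinction between raw and conditional correlation can be checked with a small simulation. The sketch below is illustrative only: the `passage` factor, the item difficulties, and the narrow θ band used to approximate "conditioning on θ" are all assumptions made for this example, not part of the quiz.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

theta = rng.normal(size=n)    # latent ability
passage = rng.normal(size=n)  # hypothetical passage-specific factor, independent of theta

def simulate(logit):
    """Draw 0/1 responses from a logistic model with the given logits."""
    prob = 1.0 / (1.0 + np.exp(-logit))
    return (rng.random(n) < prob).astype(float)

# Items 1 and 2 depend on theta only (locally independent by construction).
x1 = simulate(theta - 0.0)
x2 = simulate(theta - 0.5)

# Items 3 and 4 also share the passage factor (local dependence by construction).
x3 = simulate(theta + passage - 0.0)
x4 = simulate(theta + passage - 0.5)

def conditional_corr(a, b, lo=-0.1, hi=0.1):
    """Correlation between two items among people in a narrow band of theta."""
    band = (theta > lo) & (theta < hi)
    return np.corrcoef(a[band], b[band])[0, 1]

raw_12 = np.corrcoef(x1, x2)[0, 1]
raw_34 = np.corrcoef(x3, x4)[0, 1]
cond_12 = conditional_corr(x1, x2)
cond_34 = conditional_corr(x3, x4)

print(f"theta-only pair:     raw r = {raw_12:.3f}, r given theta = {cond_12:.3f}")
print(f"shared-passage pair: raw r = {raw_34:.3f}, r given theta = {cond_34:.3f}")
```

Both pairs are correlated in the raw data. Only the shared-passage pair should stay clearly correlated once θ is held approximately fixed, which is the pattern option B describes.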
Question 2 Multiple Choice
What is the central practical payoff of IRT's stronger assumptions compared to classical test theory?
A. IRT produces higher reliability coefficients than CTT for the same test
B. IRT item difficulty and discrimination parameters are invariant across samples, and ability estimates are invariant across which items are used
C. IRT eliminates the need for large sample sizes when calibrating tests
D. IRT automatically detects and corrects for test bias without additional analysis
Answer: B
Parameter invariance is the key payoff that motivates IRT's stronger assumptions. In CTT, item statistics (difficulty, discrimination) change when you change the sample or the test. In IRT, item parameters estimated in one sample apply to another (provided the assumptions hold and the estimates are on a common scale), and a person's ability estimate is the same, up to measurement error, whether it comes from easy items, hard items, or a mix. This invariance is what enables computerized adaptive testing, item banking, and IRT-based test equating, applications that CTT supports poorly because its statistics are test- and sample-dependent.
Question 3 True / False
Local independence in IRT means that item responses are expected to be uncorrelated in the raw data: items measuring the same construct should show near-zero correlations.
T. True
F. False
Answer: False
Local independence is a conditional independence assumption, not an unconditional one. It states that P(X_i, X_j | θ) = P(X_i | θ) × P(X_j | θ): given a person's ability level θ, knowing their answer to one item provides no additional information about their answer to another. Items measuring the same construct will naturally be correlated in raw data — people with higher ability tend to get more items right. That raw correlation is entirely expected. Local independence is violated only when correlation remains even after conditioning on θ, as when items share a stimulus (passage, figure) that creates a local ability cluster beyond the common trait.
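The link between the conditional factorization and the raw correlation can be written out with the law of total covariance. In the sketch below, P_i(θ) is shorthand introduced here for P(X_i = 1 | θ), and X_i is scored 0/1.

```latex
\begin{aligned}
\operatorname{Cov}(X_i, X_j)
  &= \mathbb{E}\big[\operatorname{Cov}(X_i, X_j \mid \theta)\big]
   + \operatorname{Cov}\big(\mathbb{E}[X_i \mid \theta],\ \mathbb{E}[X_j \mid \theta]\big) \\
  &= 0 + \operatorname{Cov}\big(P_i(\theta),\ P_j(\theta)\big)
\end{aligned}
```

Local independence sets the first term to zero. The second term is positive whenever θ varies in the population and both response probabilities increase with θ, so raw correlation between items is exactly what the model predicts.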
Question 4 True / False
Under IRT assumptions, an item's difficulty parameter estimated from one sample of test-takers can be applied to a different population without re-estimation.
T. True
F. False
Answer: True
This is the principle of parameter invariance, the defining advantage of IRT over classical test theory. In CTT, item difficulty is defined as the proportion of the sample that answered correctly, which changes whenever the sample changes. In IRT, item difficulty is the value of θ at which a person has a 50% probability of a correct response (in the 1PL and 2PL models), which is a property of the item itself, independent of who was tested. When the IRT assumptions (unidimensionality, local independence, and a monotone item response function, or IRF) hold, the same item parameters apply across groups once the θ scales are linked, enabling applications like test equating and item banking.
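A small simulation can make the contrast concrete. The sketch below is a simplification, not a real calibration procedure: it assumes a 1PL item, treats each person's θ as known (operational calibration has to estimate it), and assumes both groups are already on the same θ scale. The names `irf_1pl` and `estimate_b` are hypothetical helpers written for this example.

```python
import numpy as np

rng = np.random.default_rng(1)

def irf_1pl(theta, b):
    """1PL item response function: P(correct | theta) for difficulty b."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def estimate_b(theta, x, lo=-6.0, hi=6.0, iters=60):
    """Maximum-likelihood b with theta treated as known.

    For the 1PL model the likelihood equation is
    sum of expected scores = sum of observed scores, solved here by bisection.
    """
    target = x.sum()
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if irf_1pl(theta, mid).sum() > target:
            lo = mid  # expected score too high, so the item must be harder
        else:
            hi = mid
    return 0.5 * (lo + hi)

true_b = 0.5
n = 50_000

# Two samples with very different ability distributions take the same item.
theta_low = rng.normal(-1.0, 1.0, n)
theta_high = rng.normal(1.0, 1.0, n)
x_low = (rng.random(n) < irf_1pl(theta_low, true_b)).astype(float)
x_high = (rng.random(n) < irf_1pl(theta_high, true_b)).astype(float)

# CTT difficulty: proportion correct, which depends on who was tested.
p_low, p_high = x_low.mean(), x_high.mean()

# IRT difficulty: location of the item on the theta scale.
b_low = estimate_b(theta_low, x_low)
b_high = estimate_b(theta_high, x_high)

print(f"CTT proportion correct: low group {p_low:.2f}, high group {p_high:.2f}")
print(f"IRT difficulty b:       low group {b_low:.2f}, high group {b_high:.2f}")
```

The proportion correct should differ sharply between the two groups, while the two estimates of b should agree with each other and with `true_b` up to sampling error.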
Question 5 Short Answer
Explain why unidimensionality is the most fundamental assumption of IRT, and what 'approximate' unidimensionality means in practice.
Model answer: Unidimensionality means all items measure a single underlying latent trait θ: one common factor accounts for all covariation among item responses. It is the most fundamental assumption because the IRT model is built around a single ability parameter; if responses are driven by multiple distinct abilities, the single-θ model is misspecified and parameter estimates lose their meaning. In practice, perfect unidimensionality is never achieved. Most tests have a dominant factor with minor secondary ones (e.g., a math test may also require reading skill). 'Approximate' unidimensionality means the secondary factors are small enough that the dominant factor captures the bulk of the common variance; IRT is empirically robust to this. It breaks down when secondary factors are substantial, which is why a factor analysis (exploratory or confirmatory) is typically run before fitting IRT models.
This matters in practice because violations of unidimensionality produce biased ability estimates and misleading item parameters. For example, if a science test has two equally strong factors (quantitative reasoning and verbal comprehension), lumping them into one θ produces estimates that conflate two real abilities. Students strong in one but not the other will have unstable estimates depending on which items they happen to receive. The practical standard is to check the factor structure, confirm there is a clearly dominant first factor, and note the proportion of variance it explains before proceeding with IRT.
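One rough way to carry out the "dominant first factor" check is to look at the eigenvalues of the inter-item correlation matrix. The sketch below is only an illustration under stated assumptions: it uses simulated data, Pearson correlations on 0/1 responses (tetrachoric correlations are usually preferred for binary items), and made-up loadings and difficulties. It does not imply any standard cutoff for the eigenvalue ratio.

```python
import numpy as np

rng = np.random.default_rng(2)
n_people, n_items = 5_000, 20
difficulties = np.linspace(-1.0, 1.0, n_items)

def simulate_responses(abilities, loadings):
    """abilities: (n_people, n_dims); loadings: (n_items, n_dims)."""
    logits = abilities @ loadings.T - difficulties
    prob = 1.0 / (1.0 + np.exp(-logits))
    return (rng.random(prob.shape) < prob).astype(float)

def eigen_summary(responses):
    """Return (first/second eigenvalue ratio, share of variance on the first)."""
    corr = np.corrcoef(responses, rowvar=False)
    eig = np.sort(np.linalg.eigvalsh(corr))[::-1]  # descending order
    return eig[0] / eig[1], eig[0] / eig.sum()

# Case 1: every item loads on a single trait.
theta_1d = rng.normal(size=(n_people, 1))
load_1d = np.full((n_items, 1), 1.5)
uni = simulate_responses(theta_1d, load_1d)

# Case 2: two equally strong, independent traits, ten items each.
theta_2d = rng.normal(size=(n_people, 2))
load_2d = np.zeros((n_items, 2))
load_2d[: n_items // 2, 0] = 1.5
load_2d[n_items // 2 :, 1] = 1.5
multi = simulate_responses(theta_2d, load_2d)

ratio_uni, share_uni = eigen_summary(uni)
ratio_multi, share_multi = eigen_summary(multi)

print(f"one trait:  eigenvalue ratio {ratio_uni:.2f}, first-factor share {share_uni:.2f}")
print(f"two traits: eigenvalue ratio {ratio_multi:.2f}, first-factor share {share_multi:.2f}")
```

With one trait, the first eigenvalue should stand well clear of the second. With two equally strong traits, the first two eigenvalues should be close to each other, which signals that a single θ would conflate two abilities.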