Questions: Modern Validity Frameworks and Integrated Evidence
5 questions to test your understanding
Question 1 Multiple Choice
A cognitive ability test shows strong criterion-related validity and good internal consistency. However, researchers discover that many test-takers solve the 'reasoning' items using a pattern-matching shortcut rather than the analytical reasoning the test is meant to measure. Which source of validity evidence is most directly threatened?
A. Evidence from test content: the items do not adequately cover the reasoning domain
B. Evidence from internal structure: factor analysis would reveal that items load on a single shortcut factor
C. Evidence from response processes: examinees are not using the cognitive processes the test intends to invoke
D. Evidence from relations to other variables: criterion correlations are inflated by the shortcut strategy
Answer: C
Evidence from response processes directly addresses whether examinees engage with test items the way the test designers intended. If a 'reasoning' test is being solved through pattern recognition rather than analytical reasoning, the score does not measure what it claims to measure, regardless of criterion correlations or internal consistency. Think-aloud protocols and cognitive interviews are among the most common methods for gathering this evidence. A test can look valid by other criteria while being fundamentally invalid at the process level.
Question 2 Multiple Choice
A personnel selection test has been thoroughly validated for predicting performance in entry-level software engineering roles. A new HR director decides to use the same test to identify candidates for promotion to senior engineer positions. According to the modern validity framework, what is the key concern?
A. The test needs to be re-normed for the senior engineer population before use
B. Validity is specific to interpretations and uses; using the test for promotion requires building a new validity argument for that purpose
C. The test is invalid for this purpose because it was never designed for promotion decisions
D. Re-validation is only needed if the test content or scoring has changed
Answer: B
The central shift in the modern framework is from asking 'Is this a valid test?' to asking 'Is this a valid use of this test with these people for this purpose?' A test thoroughly validated for predicting entry-level performance has not been validated for predicting senior-level performance: the criterion, the candidate population, and the decision being made are all different, so different evidence is required. That does not make the test automatically invalid for promotion decisions (option C); it means the use is unsupported until a validity argument is built for it. Validity evidence is built for a specific interpretation and use; borrowing it wholesale for a different context is a logical error, not just a technical shortcoming.
Question 3 True / False
Under the modern validity framework, a test that accurately predicts job performance but systematically underestimates performance for one demographic group provides validity evidence against its use in that application.
True
False
Answer: True
Evidence from consequences — the fifth source in the modern framework — asks whether the actual use of the test produces intended outcomes and avoids harmful unintended ones. Systematic underprediction for a demographic group is a consequence that bears on whether the score interpretation is valid for that group. The modern framework treats this as validity evidence, not merely a social or legal concern. A test that 'works on average' while producing systematically biased decisions for a subgroup is not fully valid for that use.
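The idea of systematic underprediction can be made concrete with a small sketch. The data, group labels, and function names below are hypothetical and invented purely for illustration; they are not from any real validation study. The sketch fits one common regression line of job performance on test score, then averages the residuals within each group. A clearly positive mean residual means the group's actual performance tends to exceed what the common line predicts, which is what underprediction looks like.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_residual_by_group(scores, performance, groups):
    """Fit one common line, then average (actual - predicted) within each group."""
    slope, intercept = fit_line(scores, performance)
    residuals = {}
    for s, p, g in zip(scores, performance, groups):
        residuals.setdefault(g, []).append(p - (intercept + slope * s))
    return {g: mean(r) for g, r in residuals.items()}


# Hypothetical data: group B performs 0.6 points higher than group A
# at every test score, so a single common line cannot fit both groups.
scores = [50, 60, 70, 80, 50, 60, 70, 80]
performance = [3.0, 3.5, 4.0, 4.5, 3.6, 4.1, 4.6, 5.1]
groups = ["A"] * 4 + ["B"] * 4

result = mean_residual_by_group(scores, performance, groups)
for group, value in sorted(result.items()):
    print(group, round(value, 2))
```

With these made-up numbers the common line overpredicts group A and underpredicts group B by about 0.3 points each. A real analysis would also test whether slopes and intercepts differ significantly between groups and would need adequate sample sizes; this sketch only shows the direction of the check.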
Question 4 True / False
Once a test demonstrates good criterion-related validity and strong internal consistency, no further validity evidence is needed to support its use.
True
False
Answer: False
The modern framework treats validity as an integrated argument built from multiple converging sources. Criterion-related evidence (relations to other variables) and internal structure are only two of the five sources, and internal consistency is itself just one narrow piece of internal structure evidence. Evidence from test content, response processes, and consequences can each reveal problems that those two sources miss. A test might predict job performance well (criterion-related evidence) while examinees bypass the intended cognitive processes (a response process problem), or while producing harmful disparate impact (a consequences problem). No single source is sufficient; validity is always a cumulative case.
Question 5 Short Answer
What is the fundamental shift in how validity is understood in the modern APA/AERA/NCME framework compared to the older 'types of validity' approach, and why does this distinction matter for test use?
Model answer: The older approach treated content validity, criterion validity, and construct validity as separate, independent properties a test could possess. The modern framework reconceives validity as a single unified property: the degree to which evidence supports a specific interpretation and use of test scores. The five sources of evidence are not separate 'types' — they are converging lines of evidence in a validity argument. This matters because validity is now tied to a specific use, not the test itself. A test can be valid for one purpose and invalid for another, and the burden of building the validity argument falls on whoever is using the test.
The practical consequence is significant: organizations can no longer simply point to a published validation study and assume their use of the test is justified. They must ask whether the evidence supports their specific interpretation, with their specific population, for their specific purpose. This is a more demanding and contextual standard — which is exactly the point. Tests are powerful tools; the modern framework requires those using them to actively justify that power.