Modern Validity Frameworks and Integrated Evidence

Research · Depth 77 in the knowledge graph
Unlocks 11 downstream topics
validity evidence-integration standards test-use

Core Idea

Contemporary validity frameworks (APA/AERA/NCME Standards) organize evidence into five sources: test content, response processes, internal structure, relations to other variables, and consequences of testing. This unified view treats validity as an integrated evaluation of whether the accumulated evidence supports the intended interpretations and uses of test scores.

Explainer

Your earlier work on construct validity, criterion validity, and content validity gave you three concepts that were historically treated as distinct *types* of validity, as if a test could be "criterion valid" independently of whether it was "content valid." The modern framework, codified in the *Standards for Educational and Psychological Testing* (APA/AERA/NCME), rejects this fragmentation. Validity is now understood as a single, unitary concept: the degree to which evidence and theory support the interpretation and use of test scores for a specific purpose. The five sources of evidence are not separate validity types; they are different lines of evidence that together strengthen or weaken the validity argument for a particular use.

Evidence from test content examines whether the items adequately represent the domain the test claims to measure. This is the conceptual heir to content validity: subject matter experts judge whether the test covers the right content in the right proportions. But content coverage alone cannot establish validity. A history exam might represent the curriculum perfectly and still produce scores that are hard to interpret, because confusingly worded items lead examinees to answer for reasons unrelated to what they know. Evidence from response processes addresses this gap by examining whether examinees actually use the cognitive or behavioral processes the test is intended to elicit. Think-aloud protocols, eye-tracking, and cognitive interviews reveal whether a "math reasoning" item is solved through reasoning or through test-taking tricks. If examinees bypass the intended process, the score does not mean what you think it means.
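The "right content in the right proportions" judgment can be made concrete by comparing a test blueprint with how experts classified the items. The following is a minimal sketch under stated assumptions: the content areas, target proportions, and item tags are all hypothetical, and the helper name `coverage_gaps` is invented for illustration.

```python
from collections import Counter

# Hypothetical blueprint: target share of items per content area.
blueprint = {"causes": 0.4, "events": 0.4, "consequences": 0.2}

# Hypothetical expert tagging of a 10-item history test, one tag per item.
item_tags = ["causes"] * 5 + ["events"] * 3 + ["consequences"] * 2


def coverage_gaps(blueprint, item_tags):
    """Return actual minus target proportion for each content area."""
    counts = Counter(item_tags)
    total = len(item_tags)
    return {
        area: round(counts[area] / total - target, 3)
        for area, target in blueprint.items()
    }


gaps = coverage_gaps(blueprint, item_tags)
for area, gap in gaps.items():
    # Positive gap: area is over-represented; negative: under-represented.
    print(f"{area}: {gap:+.2f}")
```

A small gap here says only that the content is balanced as planned. It says nothing about whether examinees answer the items through the intended processes, which is why the next source of evidence is needed.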

Evidence from internal structure uses factor analysis and related methods (building on your measurement prerequisites) to evaluate whether the relationships among items and subscales match the theoretical model. If a test claims to measure three distinct abilities but all items load on a single factor, the three-score interpretation lacks structural support. Evidence from relations to other variables encompasses the convergent, discriminant, and criterion-related evidence you have studied separately: correlations with theoretically related and unrelated constructs, and with outcomes the test is supposed to predict. These external relationships are often the most direct empirical check on whether the score captures the intended construct, although they are only as trustworthy as the external measures themselves.
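The three-abilities example can be illustrated with a rough dimensionality check. This is a sketch under stated assumptions, not a full factor analysis: the correlation matrices are hypothetical, the helper names are invented, and counting eigenvalues above 1 is a crude heuristic that a real analysis would supplement with a fitted factor model.

```python
import numpy as np


def block_corr(n_blocks, block_size, within, between):
    """Build a hypothetical item correlation matrix with equal-sized clusters."""
    n = n_blocks * block_size
    corr = np.full((n, n), between, dtype=float)
    for b in range(n_blocks):
        s = slice(b * block_size, (b + 1) * block_size)
        corr[s, s] = within  # items in the same cluster correlate more strongly
    np.fill_diagonal(corr, 1.0)
    return corr


def n_large_eigenvalues(corr, threshold=1.0):
    """Count eigenvalues above the threshold (rough dimensionality heuristic)."""
    eigenvalues = np.linalg.eigvalsh(corr)  # corr is symmetric
    return int(np.sum(eigenvalues > threshold))


# Six items in three clusters of two: consistent with three abilities.
three_cluster = block_corr(3, 2, within=0.6, between=0.1)
# Six items that all correlate equally: consistent with a single factor.
one_cluster = block_corr(1, 6, within=0.5, between=0.5)

print(n_large_eigenvalues(three_cluster))
print(n_large_eigenvalues(one_cluster))
```

If a test marketed as measuring three abilities produced a matrix like `one_cluster`, reporting three separate subscores would lack structural support.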

Evidence from consequences is the most controversial source. It asks whether the actual use of the test produces the intended outcomes and avoids harmful unintended ones. From your hypothesis testing background, you know that a statistical result is only meaningful relative to a purpose; the same is true for test validity. A test that predicts job performance well overall but systematically underestimates performance for one demographic group is not simply "valid" in the abstract: the consequences of its use constitute validity evidence against its current application. This fifth source reflects validity theory's shift from asking "is this a valid test?" to asking "is this a valid use of this test with these people for this purpose?", which is a fundamentally more demanding and contextual standard.
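The job-performance example can be checked numerically by fitting one prediction line to everyone and then inspecting the prediction errors by group. The sketch below assumes made-up scores and performance ratings for two hypothetical groups, A and B; it shows one simple way to detect systematic underprediction and is not a complete fairness analysis.

```python
import numpy as np

# Hypothetical data: same test scores in both groups, but group B
# performs one point better on the job at every score level.
scores = np.array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5], dtype=float)
performance = np.array([2, 4, 6, 8, 10, 3, 5, 7, 9, 11], dtype=float)
group = np.array(["A"] * 5 + ["B"] * 5)

# Fit a single pooled regression line, as a common selection rule would.
slope, intercept = np.polyfit(scores, performance, 1)
residuals = performance - (slope * scores + intercept)

# Mean residual per group: a positive value means actual performance
# exceeds the prediction, i.e. the pooled line underestimates that group.
mean_residual = {g: float(residuals[group == g].mean()) for g in ("A", "B")}

for g, r in mean_residual.items():
    print(f"group {g}: mean residual {r:+.2f}")
```

In this constructed case the pooled line underestimates every member of group B by the same amount, so decisions based on it would disadvantage that group even though the test predicts performance well overall.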

Prerequisite Chain

Counting to 10 → Counting to 20 → Understanding Zero → The Number Zero → Counting to Five → One-to-One Correspondence → Combining Small Groups Within 5 → Addition Within 10 → Addition Within 20 → Two-Digit Addition Without Regrouping → Two-Digit Addition with Regrouping → Addition Within 100 → Repeated Addition as Multiplication → Multiplication Facts Within 100 → Division as Equal Sharing → Division as Grouping (Measurement Division) → Division: Grouping (Repeated Subtraction) Model → Division: Fair Sharing Model → Division as Equal Sharing → Division as Grouping → Basic Division Facts → Division Facts Within 100 → Two-Digit by One-Digit Division → Division with Remainders → Remainders and Quotients in Division → Division Word Problems → Introduction to Long Division → Factors and Multiples → Prime and Composite Numbers → Equivalent Fractions → Relating Fractions and Decimals → Decimal Place Value → Reading and Writing Decimals → Comparing and Ordering Decimals → Adding and Subtracting Decimals → Multiplying Decimals → Dividing Decimals → Dividing Fractions → Mixed Number Arithmetic → Order of Operations → Integer Order of Operations → Variable Expressions → Combining Like Terms → One-Step Equations → Two-Step Equations → Solving Multi-Step Equations → Equations with Variables on Both Sides → Angle Pairs: Complementary, Supplementary, and Vertical → Parallel Lines and Transversals → Corresponding Angles → Alternate Interior Angles → Triangle Angle Sum Theorem → Exterior Angle Theorem → Triangle Inequality Theorem → Similar Triangles: AA Similarity → Similar Triangles: SSS and SAS Similarity → Proportions in Similar Triangles → Right Triangle Trigonometry Introduction → Trigonometric Ratios Review → Radian Measure → Converting Between Degrees and Radians → The Unit Circle → Graphing Sine and Cosine → Graphing Tangent and Reciprocal Trigonometric Functions → Derivatives of Trigonometric Functions → Antiderivatives → Indefinite Integrals → Basic Integration Rules → Riemann Sums → Definite Integral Definition → Probability Density Functions and Continuous Distributions → Cumulative Distribution Functions → Continuous Random Variables → Normal Distribution → Classical Test Theory Foundations → Reliability and Validity: Foundational Relationship → Construct Validity and Convergent-Discriminant Evidence → Modern Validity Frameworks and Integrated Evidence

Longest path: 78 steps · 415 total prerequisite topics
