Measurement Validity: Construct and Criterion Evidence

College · Depth 90 in the knowledge graph
Unlocks 4 downstream topics
validity construct-validity criterion-validity measurement-evidence

Core Idea

Construct validity asks: Does the measure assess the intended construct? Evidence comes from content validity, convergent validity (correlates with related measures), discriminant validity (uncorrelated with unrelated measures), and factor structure. Criterion validity asks: Does the measure predict relevant outcomes? Both are integral to score interpretation and use.

How It's Best Learned

Review validation studies for a psychological measure, extracting evidence of construct and criterion validity. Compare a measure with high internal consistency but low validity to understand that reliability ≠ validity. Practice evaluating whether a measure is valid for a new use.
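The comparison suggested above can be tried on a toy dataset. The sketch below is only an illustration: the item responses and the criterion scores are invented, and the helper names `pearson` and `cronbach_alpha` are hypothetical. It computes Cronbach's alpha (a common internal-consistency estimate) for a four-item scale and then correlates the total score with the outcome the scale is supposed to relate to.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(rows):
    """Cronbach's alpha; each row holds one respondent's item scores."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # regroup scores by item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Invented data: 6 respondents answering 4 items on a made-up scale.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
    [2, 1, 2, 1],
]
# Invented criterion scores for the same 6 respondents.
criterion = [7, 3, 8, 4, 6, 5]

totals = [sum(row) for row in responses]
alpha = cronbach_alpha(responses)
validity_r = pearson(totals, criterion)

print(f"Cronbach's alpha: {alpha:.2f}")
print(f"Correlation with criterion: {validity_r:.2f}")
```

With these invented numbers the items hang together very tightly, yet the total score is essentially unrelated to the criterion. That is the pattern to look for when a measure is reliable but lacks validity evidence for a given use.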

Explainer

Validity is often summarized as "does the test measure what it claims to measure?" but this framing obscures something important: validity is not a property of a test in isolation. It is a property of the interpretations and uses made of test scores. A depression measure might have strong validity evidence in clinical adult populations but weak evidence when used with adolescents or in non-Western cultural contexts. From your study of reliability, you know that a measure can be highly consistent without being accurate: a bathroom scale that consistently reads 10 pounds too heavy is reliable, but its readings are systematically biased and therefore invalid as statements of true weight.
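The bathroom scale example can be put into numbers. This is a minimal sketch with invented weights and invented measurement noise; `pearson` is a hypothetical helper, and test-retest correlation is used here as one possible reliability estimate.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Invented true weights (in pounds) for five people.
true_weight = [120.0, 150.0, 180.0, 200.0, 165.0]

# Two weighings on a scale that reads 10 lb heavy, plus small invented noise.
noise_1 = [0.2, -0.1, 0.0, 0.1, -0.2]
noise_2 = [-0.1, 0.2, -0.2, 0.0, 0.1]
reading_1 = [w + 10 + e for w, e in zip(true_weight, noise_1)]
reading_2 = [w + 10 + e for w, e in zip(true_weight, noise_2)]

# Consistency across the two weighings (test-retest reliability).
retest_r = pearson(reading_1, reading_2)

# Average error relative to the true weight (systematic bias).
bias = mean([r - w for r, w in zip(reading_1, true_weight)])

print(f"Test-retest correlation: {retest_r:.4f}")
print(f"Mean error in pounds: {bias:.1f}")
```

The two weighings agree almost perfectly with each other, so the scale looks highly reliable, while every reading is still about 10 pounds off.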

Construct validity is the umbrella concept. It asks: does the pattern of relationships this measure forms with other variables make sense given our theoretical understanding of the construct? Evidence accumulates through multiple lines. Content validity evaluates whether the items cover the theoretical domain adequately. A math anxiety scale that only asks about algebra anxiety has poor content coverage if the construct is meant to encompass all mathematical domains. Convergent validity asks whether the measure correlates with other measures of the same or similar constructs; a new depression scale should correlate strongly with the Beck Depression Inventory (BDI) and the PHQ-9. Discriminant validity (sometimes called divergent validity) asks the opposite: the measure should *not* correlate strongly with measures of constructs that are theoretically distinct from it. A depression scale that correlates .80 with an anxiety scale raises questions about whether the two scales are actually measuring distinct constructs.
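In practice, convergent and discriminant evidence is often summarized as a set of correlations. The sketch below assumes eight respondents with invented scores on three measures: a new depression scale, an established depression measure (a stand-in for BDI or PHQ-9 totals), and a vocabulary test chosen as a theoretically distinct construct. The variable names and the `pearson` helper are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Invented scores for the same 8 respondents on three measures.
new_scale = [5, 12, 8, 20, 15, 3, 18, 10]              # new depression scale
established_depression = [4, 11, 9, 19, 13, 5, 17, 9]  # stand-in for BDI/PHQ-9
vocabulary = [30, 28, 31, 29, 30, 29, 31, 28]          # theoretically distinct

# Convergent evidence: the new scale should track the established measure.
r_convergent = pearson(new_scale, established_depression)

# Discriminant evidence: it should not track a distinct construct.
r_discriminant = pearson(new_scale, vocabulary)

print(f"Convergent r (new scale vs. established): {r_convergent:.2f}")
print(f"Discriminant r (new scale vs. vocabulary): {r_discriminant:.2f}")
```

A high convergent correlation alongside a near-zero discriminant correlation is the pattern that supports the intended interpretation. A real validation study would use far larger samples and would typically report the full multitrait correlation matrix.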

Criterion validity is a separate but related question: does the measure predict relevant real-world outcomes? Concurrent validity examines correlation with a gold-standard criterion measured at the same time. For example, does a new brief cognitive screening tool correlate with a full neuropsychological battery administered simultaneously? Predictive validity examines whether the measure predicts future outcomes. For example, does a pre-employment personality scale predict actual job performance one year later? The distinction matters practically: a measure can have strong construct validity but weak predictive validity if the construct itself is only weakly related to the outcome you care about.
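The two forms of criterion evidence use the same statistic and differ only in when the criterion is measured. The sketch below makes that explicit with a small hypothetical `CriterionStudy` record; all scores and the 12-month lag are invented to mirror the two examples above.

```python
from dataclasses import dataclass
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


@dataclass
class CriterionStudy:
    """Hypothetical record of one criterion-validity study."""
    name: str
    predictor: list      # scores on the measure being evaluated
    criterion: list      # scores on the outcome or gold standard
    lag_months: int      # time between measuring predictor and criterion

    def kind(self):
        # Criterion collected at the same time: concurrent evidence.
        # Criterion collected later: predictive evidence.
        return "concurrent" if self.lag_months == 0 else "predictive"

    def coefficient(self):
        return pearson(self.predictor, self.criterion)


# Invented data for both examples.
studies = [
    CriterionStudy(
        name="brief screen vs. full battery",
        predictor=[24, 27, 19, 29, 22, 26],
        criterion=[88, 95, 70, 101, 84, 90],
        lag_months=0,
    ),
    CriterionStudy(
        name="hiring scale vs. job performance",
        predictor=[52, 61, 47, 70, 58, 65],
        criterion=[3.1, 3.8, 3.0, 4.2, 3.2, 3.9],
        lag_months=12,
    ),
]

for study in studies:
    print(f"{study.name}: {study.kind()} r = {study.coefficient():.2f}")
```

The invented coefficients here are much higher than most real predictive studies would report. The point is only the structure: the same correlation is read as concurrent or predictive evidence depending on the time lag.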

The unifying framework from contemporary psychometrics is that validity evidence is cumulative and argument-based. No single study "validates" a measure; rather, validation is an ongoing process of assembling a coherent validity argument — a chain of claims from test scores to interpretations to uses, with evidence supporting each link. When validity evidence is missing for a specific use case (a new population, a new purpose, a new context), the burden falls on the test user to either generate that evidence or acknowledge the inferential gap. This is why the phrase "this test is valid" is technically imprecise — the proper phrasing is always "the interpretation of these scores as measuring X in this population for this purpose has strong/weak validity evidence."
