Construct validity addresses the degree to which a measured variable actually represents the theoretical construct it is intended to measure. Operationalization is the process of translating abstract psychological constructs into concrete, observable, and measurable variables. Poor operationalization—where the measured variable incompletely captures or distorts the construct—creates a gap between theory and measurement that undermines valid research. Multiple operationalizations of the same construct can strengthen evidence that observed effects reflect the underlying construct rather than specific measurement details.
Select an abstract construct (e.g., intelligence, anxiety, self-esteem) and brainstorm multiple ways to operationalize it, then evaluate which best represents the full construct.
Two misconceptions are common. The first is that construct validity is the same as reliability; in fact, reliability is necessary but not sufficient for construct validity, so a reliable measure may still be invalid. The second is that a well-validated measure is valid in all contexts; in fact, construct validity is context-dependent and can vary across populations and settings.
You've already worked with operational definitions — the process of translating abstract constructs into concrete procedures — and with validity as a general concept in measurement. Construct validity is where those threads converge into the central question of psychological measurement: does your measure actually capture the psychological entity you're theorizing about? It's a deceptively hard question, because psychological constructs like "anxiety," "intelligence," or "empathy" don't exist as physical objects you can hold next to your measure and compare. You can never directly verify that you've measured a construct correctly; you can only accumulate evidence that your measure behaves the way the theory predicts it should.
Think about the construct "self-esteem." A researcher could operationalize it as a self-report scale, a reaction time task comparing responses to positive and negative self-words, physiological stress responses, or behavioral persistence on difficult tasks. Each operationalization captures something, but no single one captures everything the construct implies. The gap between an operationalization and the full construct takes two forms: construct-irrelevant variance (things your measure picks up that aren't part of the construct) and construct underrepresentation (parts of the construct your measure misses). A good operationalization minimizes both, but achieving that requires first having a precise theoretical account of what the construct includes and excludes.
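The two kinds of gap can be made concrete with a small simulation. This is a minimal sketch under invented assumptions, not a model of any real scale: the construct is treated as the sum of three independent facets, and the three measures, their weights, and the "irrelevant" factor are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Hypothetical construct: the sum of three independent, equally weighted facets.
facets = rng.standard_normal((n, 3))
construct = facets.sum(axis=1)

# Hypothetical influence that is NOT part of the construct
# (for example, a response style or reading fluency).
irrelevant = rng.standard_normal(n)


def small_noise():
    """Fresh random measurement error for each simulated measure."""
    return 0.5 * rng.standard_normal(n)


# Covers all three facets and nothing else.
balanced = facets.sum(axis=1) + small_noise()
# Construct underrepresentation: covers only the first facet.
underrepresenting = facets[:, 0] + small_noise()
# Construct-irrelevant variance: covers all facets but also tracks the irrelevant factor.
contaminated = facets.sum(axis=1) + 1.5 * irrelevant + small_noise()


def corr(a, b):
    """Pearson correlation between two score vectors."""
    return float(np.corrcoef(a, b)[0, 1])


r_balanced = corr(balanced, construct)
r_underrepresenting = corr(underrepresenting, construct)
r_contaminated = corr(contaminated, construct)
r_contaminated_irrelevant = corr(contaminated, irrelevant)

print(f"balanced vs construct:          {r_balanced:.2f}")
print(f"underrepresenting vs construct: {r_underrepresenting:.2f}")
print(f"contaminated vs construct:      {r_contaminated:.2f}")
print(f"contaminated vs irrelevant:     {r_contaminated_irrelevant:.2f}")
```

In real research the construct column does not exist, which is exactly the problem described above; the simulation only shows why both kinds of gap weaken the link between scores and the construct.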
Convergent validity is the evidence that your measure correlates substantially with other measures of the same construct: different operationalizations should tell a consistent story. Discriminant validity is the complementary evidence that your measure does *not* correlate strongly with measures of theoretically distinct constructs. If your anxiety measure correlates just as strongly with depression measures as with other anxiety measures, you may be measuring "general negative affect" rather than anxiety specifically. This is why the multitrait-multimethod matrix (Campbell & Fiske, 1959) is a classic validation design: by measuring several traits, each with several methods, it separates method variance (shared because the same method was used) from trait variance (shared because the same construct was measured), providing cleaner evidence of what a measure is actually capturing.
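The logic of a multitrait-multimethod comparison can be sketched with simulated data. Everything here is an illustrative assumption rather than an empirical estimate: the two traits, the two methods, the function name `simulate_mtmm`, and its weights are all hypothetical. Each score is a trait plus a method factor plus random error.

```python
import numpy as np


def simulate_mtmm(n=5000, trait_corr=0.3, method_weight=0.4, noise_weight=0.5, seed=0):
    """Simulate scores for two traits, each measured by two methods.

    Each score = latent trait + method_weight * method factor + noise_weight * error.
    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    # Two latent traits with a modest true correlation.
    cov = np.array([[1.0, trait_corr], [trait_corr, 1.0]])
    traits = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # One independent factor per method (shared by all scores using that method).
    methods = rng.standard_normal((n, 2))

    scores = {}
    for t, trait_name in enumerate(("anxiety", "depression")):
        for m, method_name in enumerate(("self_report", "observer")):
            error = rng.standard_normal(n)
            scores[(trait_name, method_name)] = (
                traits[:, t] + method_weight * methods[:, m] + noise_weight * error
            )
    return scores


def corr(a, b):
    """Pearson correlation between two score vectors."""
    return float(np.corrcoef(a, b)[0, 1])


scores = simulate_mtmm()

# Same trait, different methods: convergent evidence.
convergent = corr(scores[("anxiety", "self_report")], scores[("anxiety", "observer")])
# Different traits, different methods: discriminant baseline.
discriminant = corr(scores[("anxiety", "self_report")], scores[("depression", "observer")])
# Different traits, same method: inflated by shared method variance.
shared_method = corr(scores[("anxiety", "self_report")], scores[("depression", "self_report")])

print(f"convergent (same trait, different method):    {convergent:.2f}")
print(f"shared method (different trait, same method): {shared_method:.2f}")
print(f"discriminant (different trait and method):    {discriminant:.2f}")
```

The pattern to look for is a convergent correlation clearly higher than both of the others. The amount by which the shared-method correlation exceeds the discriminant one gives a rough indication of how much method variance is present.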
The relationship between reliability and construct validity is asymmetric and important. Reliability is a necessary precondition for validity — a measure that gives random, inconsistent scores cannot be measuring anything real — but reliability does not guarantee validity. A measure can be perfectly reliable (consistent) while measuring the wrong construct entirely. A ruler reliably measures length, but it's an invalid measure of intelligence even though it produces highly consistent numbers. This is why the common practice of reporting Cronbach's alpha as "validity evidence" is mistaken: it speaks to internal consistency of items, not to whether the items are converging on the right construct.
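A short simulation illustrates why a high Cronbach's alpha says nothing about validity. It is a minimal sketch with invented data: the items are built to track a single unintended factor that is independent of the target construct, and the variable names and weights are hypothetical. Alpha uses the standard formula, k/(k-1) times one minus the ratio of summed item variances to total-score variance.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)


rng = np.random.default_rng(1)
n, k = 5000, 8

# The construct the scale was meant to measure (hypothetical).
target = rng.standard_normal(n)
# What the items actually track: a factor independent of the target.
unintended = rng.standard_normal(n)

# Every item reflects the unintended factor plus item-specific error.
items = unintended[:, None] + 0.7 * rng.standard_normal((n, k))

alpha = cronbach_alpha(items)
validity_r = float(np.corrcoef(items.sum(axis=1), target)[0, 1])

print(f"Cronbach's alpha:                  {alpha:.2f}")
print(f"correlation with target construct: {validity_r:.2f}")
```

The items agree closely with one another, so alpha is high, yet the total score is essentially unrelated to the target construct. This is the ruler-and-intelligence situation expressed in numbers.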
Finally, construct validity is not a permanent property of a measure: it is context-dependent and must be re-established when a measure is applied to new populations, cultures, or settings. A measure of "academic motivation" validated with Western university students may not be a valid measure of that construct among children in a different cultural context, where the social meaning of school performance differs substantially. This means validation is an ongoing program of research, not a one-time certification. Multiple operationalizations of the same construct that converge across different methods and samples provide stronger validity evidence than any single well-designed study, because they reduce the chance that apparent construct validity is an artifact of one particular method or sample.