Questions: Construct Validity and Measurement Validity
5 questions to test your understanding
Question 1 Multiple Choice
A researcher develops a new 'grit' scale with a Cronbach's alpha of .92 and concludes that it has strong construct validity. What is the critical flaw in this reasoning?
A. High alpha indicates the items are homogeneous, but the shared dimension they measure might be conscientiousness or general self-efficacy rather than grit
B. Alpha above .90 is excessively high and indicates the items are too redundant to be useful
C. Construct validity applies only to experimental manipulations, not self-report scales
D. High alpha directly proves discriminant validity from related traits like perseverance
Answer: A
Cronbach's alpha measures internal consistency, that is, how strongly the items correlate with each other. It says nothing about what they are measuring. A scale measuring 'the tendency to agree with any negative statement' could have an alpha of .95 and still not measure grit. Construct validity requires evidence that the measure captures the intended construct and not something else. That requires convergent validity (does it correlate with other grit indicators?) and discriminant validity (does it correlate only modestly with conscientiousness, self-efficacy, and other distinct constructs?).
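The point that alpha reflects item homogeneity rather than item content can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are synthetic, the single driving factor is labelled "acquiescence" purely for the example, and the sample size, item count, and noise level are arbitrary choices. It uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
import random
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns (one list of scores per item)."""
    k = len(items)
    # Total score per respondent: sum across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Synthetic data (assumption): every item is driven by one response-style
# factor, labelled "acquiescence" here, plus noise. Grit appears nowhere.
rng = random.Random(0)
n_respondents, n_items = 500, 8
acquiescence = [rng.gauss(0, 1) for _ in range(n_respondents)]
items = [
    [style + rng.gauss(0, 0.7) for style in acquiescence]
    for _ in range(n_items)
]

alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}")
```

Under these assumptions alpha comes out above .90 even though no item has anything to do with grit, which is why internal consistency alone cannot establish what a scale measures.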
Question 2 Multiple Choice
A new depression scale correlates r = .78 with two existing depression measures, but also correlates r = .74 with anxiety scales and r = .71 with neuroticism scales. This pattern suggests:
A. Strong construct validity, because the depression correlations are slightly higher than the anxiety and neuroticism correlations
B. Weak convergent validity, since correlations with existing depression measures should exceed .90
C. Weak discriminant validity: the scale likely measures general negative affect rather than depression specifically
D. Strong construct validity, because correlating with related constructs is expected and desirable
Answer: C
In the Campbell-Fiske framework, valid measurement requires that same-trait correlations (convergent validity) substantially exceed different-trait correlations (discriminant validity). Here, the depression correlations (.78) are barely higher than the anxiety (.74) and neuroticism (.71) correlations, so the measure cannot discriminate depression from related negative-affect constructs. This is the classic signature of poor discriminant validity: the scale is probably measuring a broader dimension such as general negative affect, not depression specifically.
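The comparison in this explanation can be written out as simple arithmetic. The sketch below uses the correlations quoted in the question; the function name and the `MIN_GAP` cut-off are hypothetical and chosen only for illustration, since the Campbell-Fiske framework asks for a substantial difference without fixing a numeric threshold.

```python
from statistics import mean

# Correlations quoted in the question above.
convergent = [0.78, 0.78]  # with the two existing depression measures
discriminant = {"anxiety": 0.74, "neuroticism": 0.71}

# Hypothetical cut-off, for illustration only; not a published standard.
MIN_GAP = 0.20


def convergent_discriminant_gap(convergent, discriminant):
    """Mean same-trait correlation minus the largest different-trait correlation."""
    return mean(convergent) - max(discriminant.values())


gap = convergent_discriminant_gap(convergent, discriminant)
weak_discrimination = gap < MIN_GAP

print(f"gap = {gap:.2f}, weak discriminant validity flagged: {weak_discrimination}")
for name, r in discriminant.items():
    # r squared is the proportion of variance shared with the other construct.
    print(f"{name}: shared variance r^2 = {r * r:.2f}")
```

The gap here is only .04, and the new scale shares roughly half of its variance with the anxiety and neuroticism measures, which is the pattern the explanation describes.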
Question 3 True / False
Method variance can inflate correlations between psychological constructs when all measures are collected using the same method — for example, all self-report Likert scales administered in the same session.
T. True
F. False
Answer: True
When multiple constructs are all measured via self-report in a single session, their intercorrelations are inflated by shared variance that belongs to the measurement method rather than the constructs themselves. Sources include acquiescence bias, extreme response tendencies, and momentary mood affecting all ratings simultaneously. This is why the multitrait-multimethod approach — using behavioral observation, physiological measures, or informant ratings alongside self-report — is the gold standard for establishing construct validity.
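The inflation described above can be reproduced with a minimal simulation. Everything in this sketch is an assumption made for illustration: two traits that are truly uncorrelated, one shared "method" factor (standing in for response style or momentary mood) that is added to both self-report scores, and an informant rating with its own independent error.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


rng = random.Random(1)
n = 2000
trait_a = [rng.gauss(0, 1) for _ in range(n)]
trait_b = [rng.gauss(0, 1) for _ in range(n)]  # independent of trait_a
method = [rng.gauss(0, 1) for _ in range(n)]   # shared response style or mood

# Same method: both self-report scores absorb the same method factor.
self_report_a = [t + m for t, m in zip(trait_a, method)]
self_report_b = [t + m for t, m in zip(trait_b, method)]

# Different method: an informant rating of trait B with independent error.
informant_b = [t + rng.gauss(0, 1) for t in trait_b]

r_same_method = pearson(self_report_a, self_report_b)
r_different_method = pearson(self_report_a, informant_b)

print(f"same method:      r = {r_same_method:.2f}")
print(f"different method: r = {r_different_method:.2f}")
```

Although the two traits are unrelated by construction, the same-method correlation is substantial while the cross-method correlation stays near zero. That difference is the shared method variance that a multitrait-multimethod design is meant to expose.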
Question 4 True / False
A measure that has been validated on a college student sample can generally be considered valid for use with clinical populations, because the statistical relationships between constructs should hold across groups.
T. True
F. False
Answer: False
Validity is specific to populations, contexts, and uses — not general or permanent. A construct like 'depression' may have different manifestations, different factor structures, or different relationships to criterion variables in clinical populations compared to undergraduates. Validity generalization — whether evidence from one context transfers to another — is itself an empirical question. Calling a measure 'validated' without specifying for whom and for what purpose is misleading.
Question 5 Short Answer
Why isn't it enough to show that a new anxiety measure has high internal consistency and correlates well with other anxiety measures? What additional evidence is needed, and why?
Model answer: High internal consistency and convergent validity (correlating with other anxiety measures) are necessary but not sufficient. Discriminant validity evidence is also required: the scale must not over-correlate with distinct constructs like depression, neuroticism, or general negative affect. Without discriminant evidence, you cannot rule out that the measure captures a broader shared dimension rather than anxiety specifically. Construct validity requires triangulating the construct from multiple directions — what the measure relates to AND what it does not relate to.
A measure that correlates equally with measures of anxiety, depression, and neuroticism probably isn't measuring any of them specifically. The multitrait-multimethod logic requires both convergent validity (high correlations with other indicators of the same construct) and discriminant validity (lower correlations with indicators of different constructs). Only when both patterns are present can you be confident that the measure is operationalizing the intended construct and not a broader, unintended dimension.