Construct validity examines whether a test truly measures the psychological construct it purports to measure. Multitrait-multimethod matrices provide evidence by showing high correlations between different measures of the same construct (convergent validity) and low correlations with measures of different constructs (discriminant validity).
You already know from your study of validity that a reliable test does not necessarily measure what it claims to measure. Construct validity asks the deeper question: does the test actually capture the psychological reality it is supposed to represent? A test of "anxiety" might produce consistent scores, but if those scores reflect social desirability, attention, or willingness to self-disclose more than anxiety itself, the construct validity is poor. Establishing construct validity requires accumulating a body of evidence, and the multitrait-multimethod (MTMM) matrix is one of the most systematic ways to gather convergent and discriminant evidence in a single design.
The MTMM framework, introduced by Campbell and Fiske (1959), rests on a simple but powerful design: measure multiple psychological traits using multiple different methods. For example, you might measure three traits (anxiety, depression, and hostility) using three methods: self-report questionnaire, peer rating, and behavioral observation. That yields nine measures, one for each trait-method combination, and therefore a 9×9 correlation matrix. Within this matrix, you can identify two critical patterns. Convergent validity is demonstrated when two different methods measuring the same trait correlate highly: a self-report anxiety score should correlate well with peer-rated anxiety, because both are trying to capture the same construct. If they don't, something is wrong: either the construct is method-bound, or the different methods are measuring different things.
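The layout of the matrix can be sketched in a few lines of Python. This is only an illustration of the design described above: the variable names and the `classify_pair` helper are hypothetical, and the block labels (monotrait-heteromethod and so on) follow the terminology commonly used for MTMM matrices. No correlations are computed; the sketch only shows which cells of the 9×9 matrix belong to which block.

```python
from itertools import combinations, product

traits = ["anxiety", "depression", "hostility"]
methods = ["self_report", "peer_rating", "observation"]

# Each measure is one (trait, method) pairing: 3 x 3 = 9 measures.
measures = list(product(traits, methods))


def classify_pair(a, b):
    """Label the MTMM block that the correlation between two measures falls in."""
    same_trait = a[0] == b[0]
    same_method = a[1] == b[1]
    if same_trait and not same_method:
        return "monotrait-heteromethod"  # the convergent validity values
    if same_method and not same_trait:
        return "heterotrait-monomethod"  # different traits, shared method
    return "heterotrait-heteromethod"  # nothing shared


# Count the off-diagonal cells (each unordered pair once) in every block.
counts = {}
for a, b in combinations(measures, 2):
    label = classify_pair(a, b)
    counts[label] = counts.get(label, 0) + 1

print(len(measures))
print(counts)
```

Of the 36 distinct pairs of measures, 9 are same-trait, different-method pairs (where convergent evidence is read), 9 are different-trait, same-method pairs, and 18 share neither trait nor method.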
Discriminant validity is demonstrated when measures of different traits, even using the same method, do not correlate too highly. Anxiety and depression should correlate somewhat (they often co-occur) but not so highly that the scores are interchangeable. If self-report anxiety and self-report depression correlate at .90, you haven't measured two distinct constructs; you've created two labels for the same score. The key logic is this: if method variance (the variance measures share because they use the same measurement approach) exceeds trait variance (the variance measures share because they target the same construct across methods), the scores reflect how the trait was measured more than the trait itself. In matrix terms, a same-trait, different-method correlation should be larger than the different-trait, same-method correlations that involve the same measures.
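The comparison between trait variance and method variance can be sketched as follows. The correlations are hypothetical values invented purely for illustration, not data from any study, and `discriminant_check` is an assumed helper name. It implements one simple reading of the logic above: each same-trait, different-method correlation is compared with the different-trait, same-method correlations that share a measure with it.

```python
# Hypothetical measures: (trait, method) pairs for two traits and two methods.
A_SR, A_PR = ("anxiety", "self_report"), ("anxiety", "peer_rating")
D_SR, D_PR = ("depression", "self_report"), ("depression", "peer_rating")

# Hypothetical correlations, invented for illustration only.
corr = {
    frozenset([A_SR, A_PR]): 0.65,  # same trait, different methods
    frozenset([D_SR, D_PR]): 0.50,  # same trait, different methods
    frozenset([A_SR, D_SR]): 0.60,  # different traits, same method
    frozenset([A_PR, D_PR]): 0.45,  # different traits, same method
    frozenset([A_SR, D_PR]): 0.30,  # different traits, different methods
    frozenset([D_SR, A_PR]): 0.28,  # different traits, different methods
}


def is_convergent(pair):
    a, b = tuple(pair)
    return a[0] == b[0] and a[1] != b[1]


def is_same_method_different_trait(pair):
    a, b = tuple(pair)
    return a[0] != b[0] and a[1] == b[1]


def discriminant_check(corr):
    """Map each trait to True if its convergent correlation exceeds every
    same-method, different-trait correlation that shares a measure with it."""
    results = {}
    for pair, r in corr.items():
        if not is_convergent(pair):
            continue
        rivals = [
            other_r
            for other, other_r in corr.items()
            if is_same_method_different_trait(other) and pair & other
        ]
        trait = next(iter(pair))[0]
        results[trait] = r > max(rivals)
    return results


results = discriminant_check(corr)
print(results)
```

With these made-up numbers, anxiety passes (.65 exceeds both .60 and .45) while depression fails (.50 is below the .60 self-report correlation between anxiety and depression), which is the pattern expected when shared method variance outweighs shared trait variance.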
In practice, MTMM analyses often show that both problems are real: different methods can fail to converge on the same trait, and measures that share a method can correlate for reasons unrelated to the trait. Self-report measures of different personality traits often correlate too highly with each other simply because they share the same method: people who respond defensively or in socially desirable ways tend to answer any self-report item similarly, regardless of what construct it targets. The corrective implication is that strong construct validity evidence requires converging on a construct from multiple methodological angles. When a behavioral observation, a peer rating, and a physiological index all point in the same direction, confidence in the construct increases substantially. This multi-method logic is central to psychological construct validation.