Questions: Construct Definition and Measurement Development
5 questions to test your understanding
Question 1 Multiple Choice
A researcher builds a 20-item scale for 'academic resilience' that achieves excellent test-retest reliability (r = .94). A reviewer argues the scale actually measures general optimism rather than resilience. This criticism is best described as a problem of:
A. Low reliability: a reliable scale would not drift toward measuring optimism
B. Construct-irrelevant variance: the scale captures something outside the intended construct's boundaries
C. Construct under-representation: the scale misses important facets of resilience
D. Operational redundancy: the items overlap too much with each other
Answer: B
Construct-irrelevant variance occurs when a measure captures variance from something outside the intended construct (here, optimism rather than resilience). This is a validity problem, not a reliability problem. High reliability means the scale measures *something* consistently; it says nothing about whether that something is what the researcher intended. A reliable scale measuring the wrong construct is arguably more dangerous than an unreliable one, because it generates false confidence in the findings.
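The gap between consistency and target can be illustrated with a small sketch. All numbers below are hypothetical, chosen by hand for illustration and not drawn from any study: the scale scores are built as latent optimism plus a little noise, so two administrations agree closely with each other while barely tracking latent resilience.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical latent scores for six respondents (illustrative only).
optimism = [1, 2, 3, 4, 5, 6]
resilience = [3, 6, 1, 5, 2, 4]

# Assumption: the scale responds to optimism, plus small occasion-specific noise.
noise_t1 = [0.2, -0.1, 0.1, -0.2, 0.1, -0.1]
noise_t2 = [-0.1, 0.2, -0.2, 0.1, -0.1, 0.1]
scale_t1 = [o + e for o, e in zip(optimism, noise_t1)]
scale_t2 = [o + e for o, e in zip(optimism, noise_t2)]

test_retest_r = pearson(scale_t1, scale_t2)        # consistency across occasions
r_with_resilience = pearson(scale_t1, resilience)  # relation to the intended construct

print(f"test-retest r:     {test_retest_r:.2f}")
print(f"r with resilience: {r_with_resilience:.2f}")
```

In this toy data the test-retest correlation is high and the correlation with resilience is near zero, which is the pattern the reviewer's criticism describes. Real validation would rely on independent criterion measures, since latent scores are never observed directly.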
Question 2 Multiple Choice
When developing a measure of a new psychological construct, which step should come FIRST?
A. Write a large pool of candidate items and factor-analyze them to discover the construct's structure
B. Administer an existing related scale to check whether correlation is high enough to justify a new measure
C. Write a nominal definition that specifies what the construct includes and excludes theoretically
D. Recruit a pilot sample and compute Cronbach's alpha to establish an internal consistency baseline
Answer: C
The nominal definition, a clear theoretical statement of what the construct is and is not, must precede all other steps. Without it, item writing has no principled basis for inclusion or exclusion, and the resulting scale may systematically miss important facets or capture adjacent constructs. Jumping straight to item writing (option A) is the most common mistake: factor analysis can only find structure in what was measured; it cannot recover facets that were never included.
Question 3 True / False
A highly reliable measure is very likely to be a valid measure of the intended construct.
T. True
F. False
Answer: False
Reliability and validity are distinct properties. Reliability means a measure produces consistent results; validity means it measures what it claims to measure. A bathroom scale that always reads 10 pounds too high is perfectly reliable but systematically invalid. In psychology, a scale can reliably measure mood when it was intended to measure depression — consistent results, wrong target. Reliability is necessary but not sufficient for validity.
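The bathroom-scale example can be worked through numerically. The sketch below assumes five hypothetical true weights and a constant 10-pound offset; none of these values come from real data. Two repeated readings correlate perfectly (reliability), yet every reading is wrong by the same amount (systematic invalidity).

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical true weights in pounds (illustrative only).
true_weights = [120.0, 135.0, 150.0, 165.0, 180.0]
OFFSET = 10.0  # assumption: the scale always reads 10 pounds too high

# Two weighings of the same people on the same biased scale.
reading_1 = [w + OFFSET for w in true_weights]
reading_2 = [w + OFFSET for w in true_weights]

# Reliability: do repeated readings agree with each other?
test_retest_r = pearson(reading_1, reading_2)

# Accuracy: how far are the readings from the truth, on average?
mean_error = sum(r - w for r, w in zip(reading_1, true_weights)) / len(true_weights)

print(f"test-retest r: {test_retest_r:.3f}")
print(f"mean error:    {mean_error:.1f} lb")
```

Note that a constant offset leaves correlations untouched, so a purely correlational check would not detect this bias; it only shows up when readings are compared against a trusted standard.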
Question 4 True / False
Construct under-representation occurs when a measure fails to sample systematically from the full domain of the construct, leaving important facets unmeasured.
T. True
F. False
Answer: True
Construct under-representation is one of the two main threats to construct validity (the other being construct-irrelevant variance). A depression scale that only measures mood while ignoring cognitive, somatic, and behavioral symptoms under-represents the construct — it will perform poorly in populations where somatic symptoms are primary, and will miss important clinical distinctions. Good content coverage requires mapping the construct's domain before writing items.
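One way to make domain mapping concrete is a simple coverage check of items against the facets named in the definition. This is a minimal sketch under assumptions: the facet list mirrors the depression example above, and the item names and their facet assignments are hypothetical placeholders, not items from any published scale.

```python
# Facets the nominal definition says belong to the construct
# (taken from the depression example; treated here as an assumption).
DOMAIN_FACETS = {"mood", "cognitive", "somatic", "behavioral"}

# Hypothetical draft items, each tagged with the facet it is meant to tap.
item_facets = {
    "item_1": "mood",
    "item_2": "mood",
    "item_3": "mood",
}

covered = set(item_facets.values())

# Facets in the definition that no item measures: under-representation.
unmeasured = sorted(DOMAIN_FACETS - covered)

# Facets tapped by items but absent from the definition:
# candidates for construct-irrelevant variance.
out_of_domain = sorted(covered - DOMAIN_FACETS)

print("unmeasured facets:   ", unmeasured)
print("out-of-domain facets:", out_of_domain)
```

A check like this only works if the facet list exists before the items do, which is the point of defining the construct first. It flags gaps in coverage; it does not judge whether individual items are well written.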
Question 5 Short Answer
Why must construct definition precede item writing rather than follow it, even when researchers plan to validate the scale empirically afterward?
Model answer: A nominal definition determines which facets belong in the construct's domain and which are excluded. Without this boundary, items written 'intuitively' may systematically over-sample easy-to-measure facets (like mood) while under-sampling others (like somatic or cognitive symptoms). Once a scale has been deployed and accumulated validity evidence, its implicit construct definition becomes extremely difficult to revise — the scale takes on a life of its own. Validation studies then test what the scale measures, not what the construct should include, which can entrench measurement error rather than correct it.
The deeper issue is that empirical validation cannot substitute for theoretical clarity. Validation checks whether a scale's scores relate as expected to measures of related and unrelated constructs (convergent and discriminant evidence), but it cannot tell you whether the items adequately represent the theoretical domain; that judgment requires the nominal definition. Researchers who define their constructs after seeing how their items cluster are fitting the definition to the data, not the data to the definition.