Selection Validity


Core Idea

Selection validity refers to the degree to which a selection procedure actually measures or predicts what it is intended to measure or predict. The Uniform Guidelines recognize three validation strategies: criterion-related validity (demonstrating a statistical relationship between test scores and job performance), content validity (showing that the test representatively samples the job domain), and construct validity (establishing that the test measures the psychological construct it claims to measure). Validity is not a property of the test itself but of the inferences drawn from test scores in a specific context — a test valid for one purpose may be invalid for another.

Explainer

Validity is the most important concept in personnel selection — and one of the most misunderstood. The common phrasing "this test is valid" is imprecise. Validity is not an inherent property of a test; it is a property of the inferences drawn from test scores for a particular purpose in a particular context. A cognitive ability test might be highly valid for predicting performance in complex jobs but less valid for predicting performance in jobs with minimal cognitive demands. The question is always: valid for what?

The three validation strategies (criterion-related, content, and construct) are not competing alternatives but complementary lines of evidence. Criterion-related validity provides the most direct evidence: you demonstrate empirically that test scores predict job performance. This can be done predictively (test candidates, hire regardless of scores, then correlate scores with later performance) or concurrently (test current employees and correlate scores with their current performance). The predictive approach is methodologically stronger. Current employees have already been screened and have survived on the job, so their scores span a restricted range, which attenuates the observed correlation, and incumbents may approach the test with different motivation than real applicants. The cost is that a predictive study requires patience and a willingness to hire without using the test, which is a hard sell for most organizations.
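To make the criterion-related strategy concrete, here is a minimal Python sketch of a validity coefficient computed as a Pearson correlation between test scores and performance ratings. It also shows one commonly used adjustment for direct range restriction on the predictor (Thorndike's Case II formula), which is relevant when a concurrent sample has a narrower score spread than the applicant pool. The function names, the data, and the standard deviations are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correct_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case II correction for direct range restriction on the predictor.

    r_restricted    -- correlation observed in the restricted (e.g. incumbent) sample
    sd_unrestricted -- predictor SD in the unrestricted (e.g. applicant) group
    sd_restricted   -- predictor SD in the restricted sample
    """
    if sd_unrestricted <= 0 or sd_restricted <= 0:
        raise ValueError("standard deviations must be positive")
    u = sd_unrestricted / sd_restricted
    return (r_restricted * u) / math.sqrt(1 + r_restricted ** 2 * (u ** 2 - 1))


# Hypothetical concurrent study: test scores and supervisor ratings for incumbents.
test_scores = [52, 61, 47, 70, 58, 65, 49, 73]
performance = [3.1, 3.6, 2.8, 4.2, 3.0, 3.9, 3.2, 4.0]

observed_validity = pearson_r(test_scores, performance)

# Assumed SDs: applicants vary more on the test than the screened incumbents do.
corrected_validity = correct_range_restriction(
    observed_validity, sd_unrestricted=12.0, sd_restricted=9.0
)

print(f"observed validity:  {observed_validity:.2f}")
print(f"corrected validity: {corrected_validity:.2f}")
```

With only eight cases the estimate would be far too unstable for real use. The sketch is meant to show the arithmetic, not a recommended sample size.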

Content validity takes a different approach entirely. Instead of demonstrating a statistical relationship between test scores and performance, you argue that the test content faithfully represents the job content. This is most compelling for work sample tests and job knowledge tests, where the overlap between test and job is visible and direct. A typing test for a secretarial job in which typing is a core duty is the classic example. Content validity is established through expert judgment (subject matter experts, or SMEs, evaluating the correspondence between test and job), not through correlational data. It is the appropriate strategy when criterion data are unavailable or when the test directly samples job tasks.
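The text above does not name a specific procedure for summarizing SME judgments. One widely cited option is Lawshe's content validity ratio (CVR), sketched below in Python as an illustration rather than as the required method. Each SME rates whether a test item is essential to the job, and the CVR runs from -1 (no one rates it essential) to +1 (everyone does). The item names and panel counts are hypothetical.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR: (n_e - N/2) / (N/2).

    n_essential -- number of SMEs rating the item "essential"
    n_panelists -- total number of SMEs on the panel
    """
    if n_panelists <= 0 or not 0 <= n_essential <= n_panelists:
        raise ValueError("need 0 <= n_essential <= n_panelists and n_panelists > 0")
    half = n_panelists / 2
    return (n_essential - half) / half


# Hypothetical panel of 10 SMEs rating three items on a secretarial work sample.
essential_counts = {
    "typing speed": 10,
    "document formatting": 8,
    "trivia about company history": 3,
}

for item, n_essential in essential_counts.items():
    cvr = content_validity_ratio(n_essential, n_panelists=10)
    print(f"{item}: CVR = {cvr:+.2f}")
```

Items with low or negative CVR values are candidates for removal. The cutoff for keeping an item depends on panel size and is left out of this sketch.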

Construct validity is the broadest framework. It asks whether the test measures the theoretical construct it claims to measure, for example whether a "conscientiousness" scale actually measures conscientiousness and not something else. Construct validity is established through a web of evidence: factor analyses showing the expected internal structure, correlations with other measures of the same construct (convergent validity), low correlations with measures of different constructs (discriminant validity), and theoretically predicted relationships with external variables. In the modern unitary view of validity, construct validity subsumes the other two strategies, which become particular kinds of evidence about what the scores mean.
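The convergent and discriminant pattern can be checked with simple correlations. The Python sketch below assumes three hypothetical score sets from the same eight people: two different conscientiousness scales and one extraversion scale. The expectation is a high correlation between the two conscientiousness scales and a correlation near zero with extraversion. All data and variable names are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for the same eight respondents.
conscientiousness_a = [10, 14, 9, 17, 12, 15, 8, 16]   # scale under evaluation
conscientiousness_b = [11, 13, 10, 18, 12, 16, 9, 15]  # established scale, same construct
extraversion = [12, 11, 10, 12, 13, 10, 12, 12]        # different construct

convergent_r = pearson_r(conscientiousness_a, conscientiousness_b)
discriminant_r = pearson_r(conscientiousness_a, extraversion)

print(f"convergent r (same construct):        {convergent_r:.2f}")
print(f"discriminant r (different construct): {discriminant_r:.2f}")
```

A real construct validation would involve many more respondents and measures, typically arranged in a multitrait-multimethod matrix or a factor analysis. This sketch covers only the two-correlation core of the idea.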

The validity generalization movement, led by Schmidt and Hunter from the 1970s onward, transformed the field by challenging the doctrine of situational specificity. Prior to their work, practitioners believed that a test's validity had to be demonstrated locally — in each new organization, for each new job — because validity might not transfer across settings. Using meta-analysis, Schmidt and Hunter showed that much of the apparent variability in validity coefficients was artifactual, caused by sampling error, range restriction, and measurement error in the criterion. Once these artifacts were corrected, the remaining true variability was small. This meant that cognitive ability tests, for instance, could be confidently used in new settings without full local validation, dramatically expanding the practical reach of validated selection tools.
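The core of the argument can be illustrated with a "bare-bones" meta-analysis in the Hunter-Schmidt style, sketched below in Python. It compares the observed variance of validity coefficients across studies with the variance expected from sampling error alone. This simplified version handles only sampling error; it does not apply the range restriction or criterion unreliability corrections mentioned above. The study sample sizes and correlations are hypothetical.

```python
def bare_bones_meta(studies):
    """Bare-bones meta-analysis of correlations.

    studies -- list of (sample_size, observed_r) tuples
    Returns the sample-size-weighted mean r, the observed variance of r,
    the variance expected from sampling error, and the residual variance.
    """
    if not studies:
        raise ValueError("need at least one study")
    k = len(studies)
    total_n = sum(n for n, _ in studies)
    mean_r = sum(n * r for n, r in studies) / total_n
    # Sample-size-weighted variance of the observed correlations.
    var_observed = sum(n * (r - mean_r) ** 2 for n, r in studies) / total_n
    # Variance expected from sampling error alone, using the average sample size.
    mean_n = total_n / k
    var_sampling = (1 - mean_r ** 2) ** 2 / (mean_n - 1)
    # Residual variance cannot be negative; floor it at zero.
    var_residual = max(0.0, var_observed - var_sampling)
    return {
        "mean_r": mean_r,
        "var_observed": var_observed,
        "var_sampling": var_sampling,
        "var_residual": var_residual,
    }


# Hypothetical local validation studies: (sample size, observed validity).
studies = [(68, 0.12), (45, 0.41), (120, 0.27), (80, 0.33), (55, 0.19)]

result = bare_bones_meta(studies)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

When the residual variance is close to zero, the study-to-study differences are about what sampling error alone would produce, which is the pattern that undercut situational specificity.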
