Norm-referenced interpretation compares a score to a reference group, answering "How does this person compare?" Criterion-referenced interpretation judges performance against absolute standards, answering "Can this person do X?" Each serves different purposes: norm-referenced for selection and ranking, criterion-referenced for diagnosis and competency assessment.
Examine test manuals and compare how different tests report results using norm-referenced vs. criterion-referenced approaches. Discuss which interpretation is appropriate for specific decisions.
You've studied how scores acquire meaning through validity design: a number by itself tells you nothing until you know what it is being compared to or what it is supposed to predict. Norm-referenced and criterion-referenced interpretation are two philosophically distinct answers to the question "What does this score mean?" Choosing the wrong framework for a given purpose produces systematically misleading information.
Norm-referenced interpretation answers: "How does this person compare to others?" The score derives its meaning entirely from a reference group, the normative sample. On the common deviation-IQ scale (mean 100, standard deviation 15), an IQ of 115 means "one standard deviation above the mean for this population." A percentile rank of 82 means "higher than 82% of the comparison group." The absolute level of performance is secondary to relative standing. This framework is essential for selection and ranking decisions (scholarship competitions, competitive admissions, hiring from a large applicant pool) because it directly answers "who performs best, relative to whom."
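The arithmetic behind a norm-referenced score can be sketched in a few lines of Python. This is a minimal illustration, not a published scoring procedure: the two norm samples are made-up numbers, the function names `percentile_rank` and `standard_score` are hypothetical, and percentile rank is defined here as the percentage of the sample scoring strictly below the given score (other conventions exist).

```python
from statistics import NormalDist

# Hypothetical raw scores from two made-up norm groups (illustration only).
GENERAL_SAMPLE = [40, 45, 50, 50, 55, 60, 65, 70, 75, 90]
SELECTIVE_SAMPLE = [60, 65, 70, 72, 75, 78, 80, 85, 90, 95]


def percentile_rank(score, norm_sample):
    """Percentage of the norm sample scoring strictly below `score`."""
    below = sum(1 for s in norm_sample if s < score)
    return 100.0 * below / len(norm_sample)


def standard_score(raw, norm_mean, norm_sd, scale_mean=100.0, scale_sd=15.0):
    """Rescale a raw score to a standard-score metric via its z-score."""
    z = (raw - norm_mean) / norm_sd
    return scale_mean + scale_sd * z


# The same raw score of 70 means different things in different norm groups.
print(percentile_rank(70, GENERAL_SAMPLE))    # 70.0
print(percentile_rank(70, SELECTIVE_SAMPLE))  # 20.0

# One standard deviation above the norm mean maps to 115 on a 100/15 scale.
print(standard_score(60, norm_mean=50, norm_sd=10))  # 115.0

# If scores are normally distributed, +1 SD is roughly the 84th percentile.
print(round(100 * NormalDist().cdf(1.0), 1))  # 84.1
```

The raw score never changes in this sketch; only the reference group does, which is why a norm-referenced score is uninterpretable without knowing who was in the normative sample.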
Criterion-referenced interpretation answers a different question: "Can this person do X?" Performance is judged against an absolute standard of competence, not against others. A passing score of 70% on a driver's test means the person demonstrated sufficient skill, regardless of whether most others passed or failed. A student either meets the third-grade reading standard or does not — where they rank among their peers is irrelevant to that judgment. Criterion-referenced tests are designed around the definition of competence, not around maximizing individual differences.
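By contrast, a criterion-referenced decision can be sketched as a comparison against a fixed cut score. The example below is an assumption-laden illustration: the 70% cut mirrors the driver's test example above, while the function name `meets_standard`, the 50-item test length, and the cohort data are all hypothetical.

```python
def meets_standard(items_correct, items_total, cut_percent=70):
    """True if the examinee reaches the cut score; no other examinee is consulted."""
    # Integer arithmetic avoids floating-point surprises exactly at the cut.
    return items_correct * 100 >= cut_percent * items_total


ITEMS_TOTAL = 50  # hypothetical test length

# Made-up cohorts: number of items answered correctly per examinee.
STRONG_COHORT = {"A": 48, "B": 45, "C": 41, "D": 36}
WEAK_COHORT = {"E": 34, "F": 30, "G": 22}

strong_results = {k: meets_standard(v, ITEMS_TOTAL) for k, v in STRONG_COHORT.items()}
weak_results = {k: meets_standard(v, ITEMS_TOTAL) for k, v in WEAK_COHORT.items()}

print(strong_results)  # every examinee passes
print(weak_results)    # every examinee fails
```

Examinee D ranks last in the strong cohort and still passes, while examinee E ranks first in the weak cohort and still fails. Rank within the group plays no role in the decision.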
The distinction shapes test design in a concrete way. Norm-referenced tests must include items that spread scores across individuals — items that discriminate between people. An item that everyone gets right contributes nothing to ranking and is typically removed from a norm-referenced instrument. Criterion-referenced tests include items that map onto the competency domain, even if nearly all trained individuals get them right, because the question is whether the person has acquired that competency, not whether they score higher than someone else. Understanding this difference explains why the same content can be tested very differently depending on the interpretive purpose — and why the choice of framework must be driven by the decision the score is meant to inform, not by convention or convenience.
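The effect on item selection can be sketched with a simple item analysis. This is a rough illustration under stated assumptions: the response matrix and the `IN_DOMAIN` blueprint flags are invented, the 0.1 to 0.9 difficulty band is an arbitrary example threshold rather than a standard, and the function names are hypothetical. Operational item analysis would also look at discrimination indices, which are omitted here.

```python
# Rows are examinees, columns are items; 1 = correct, 0 = incorrect (made-up data).
RESPONSES = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]

# Hypothetical blueprint: does each item map onto the competency domain?
IN_DOMAIN = [True, True, False, True]


def item_difficulty(responses):
    """Proportion of examinees answering each item correctly (the item p-value)."""
    n = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n for j in range(n_items)]


def keep_for_norm_referenced(p_values, low=0.1, high=0.9):
    """Keep items that spread examinees out; drop those nearly all pass or fail."""
    return [j for j, p in enumerate(p_values) if low <= p <= high]


def keep_for_criterion_referenced(in_domain):
    """Keep items that sample the competency domain, whatever their difficulty."""
    return [j for j, flag in enumerate(in_domain) if flag]


p_values = item_difficulty(RESPONSES)
print(p_values)                                  # [1.0, 0.6, 0.2, 0.6]
print(keep_for_norm_referenced(p_values))        # [1, 2, 3]
print(keep_for_criterion_referenced(IN_DOMAIN))  # [0, 1, 3]
```

Item 0, which every examinee answers correctly, is dropped under the norm-referenced rule because it cannot separate examinees, but it is retained under the criterion-referenced rule because it covers part of the competency being certified.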