Questions: Test Development and Specification Tables
5 questions to test your understanding
Question 1 Multiple Choice
A test developer writes 80 questions for a nursing certification exam and only afterward sorts them into content areas and cognitive levels. Another developer creates a specification table first, then writes items to fill each cell. Which approach better supports content validity, and why?
A. The first approach, because it produces more authentic items drawn from real expertise
B. Both approaches are equivalent if the final item distribution matches the same grid
C. The second approach, because the blueprint operationalizes the domain before items are written, ensuring systematic coverage
D. The second approach only if the specification table was reviewed by test-takers
Answer: C
The blueprint must exist before item writing begins; that ordering is what makes specifications useful. Writing items first and categorizing them afterward is item-driven rather than domain-driven: coverage reflects whatever topics the writer happened to think of, not a systematic sampling plan, so the test ends up with unknown gaps and concentrations. A specification table written in advance forces explicit decisions about what the domain includes and in what proportions, which is the foundation of content validity.
Question 2 Multiple Choice
What does the two-dimensional structure of a specification table represent?
A. The correlation between item difficulty and item discrimination across test forms
B. The mapping of content areas (topics) against cognitive levels (e.g., Bloom's taxonomy)
C. The relationship between test-taker ability and probability of correct response
D. The timeline of item development from draft to final form
Answer: B
A specification table is a grid. One axis lists the content sub-domains (the 'what' of the construct, e.g., pharmacology, infection control); the other lists the cognitive levels (the 'how deeply': recall, application, analysis). Each cell specifies how many items should fall there. Building the test to this grid ensures that it samples both the right topics and the right kinds of thinking. Items in the 'pharmacology × application' cell look very different from items in the 'pharmacology × recall' cell, even if both concern the same drug.
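As a minimal sketch, the grid described above can be written as a mapping from (content area, cognitive level) to a planned item count. The content areas and levels are taken from the explanation; the counts and the helper name `marginal_totals` are hypothetical, chosen only for illustration and not drawn from any real exam blueprint.

```python
from collections import defaultdict

# Hypothetical blueprint: (content area, cognitive level) -> planned item count.
# All counts are illustrative assumptions.
BLUEPRINT = {
    ("pharmacology", "recall"): 6,
    ("pharmacology", "application"): 10,
    ("pharmacology", "analysis"): 4,
    ("infection control", "recall"): 5,
    ("infection control", "application"): 9,
    ("infection control", "analysis"): 6,
}


def marginal_totals(blueprint):
    """Sum the planned item counts along each axis of the grid."""
    by_area = defaultdict(int)
    by_level = defaultdict(int)
    for (area, level), count in blueprint.items():
        by_area[area] += count
        by_level[level] += count
    return dict(by_area), dict(by_level)


by_area, by_level = marginal_totals(BLUEPRINT)
total_items = sum(BLUEPRINT.values())

print("Items per content area:", by_area)
print("Items per cognitive level:", by_level)
print("Total planned items:", total_items)
```

The row and column totals make the sampling plan explicit: how much of the test goes to each topic, and how much to each kind of thinking, is decided before any item exists.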
Question 3 True / False
A test blueprint must be completed and reviewed before any items are written.
T. True
F. False
Answer: True
This is the key procedural point: the specification table is an a priori document that defines what the test should measure. If items are written before the blueprint, the domain is defined by whatever the item writers happen to produce, which cannot guarantee systematic coverage or defensible content validity. Completing the blueprint first also creates the documented record needed for legal defensibility in high-stakes testing.
Question 4 True / False
Two tests are parallel forms if they are built by the same item writers and cover the same general subject area, even if they have different proportions of recall versus application items.
T. True
F. False
Answer: False
Parallel forms must be built to the same specifications — the same number of items per content area and cognitive level, the same difficulty distribution, and ideally similar statistical properties. 'Same general subject area' is far too loose a criterion. A test heavy in recall items and one heavy in analysis items measure different things from the same domain, producing non-comparable scores. True parallel forms require a shared, detailed blueprint enforced during item selection.
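The cell-count part of this requirement can be checked mechanically. The sketch below assumes each item is tagged with its blueprint cell; the blueprint, the two forms, and the function name `cell_mismatches` are all hypothetical examples, not a standard procedure.

```python
from collections import Counter

# Hypothetical shared blueprint: (content area, cognitive level) -> required count.
BLUEPRINT = {
    ("pharmacology", "recall"): 2,
    ("pharmacology", "application"): 3,
    ("infection control", "recall"): 1,
    ("infection control", "application"): 2,
}


def cell_mismatches(form_items, blueprint):
    """Return {cell: (required, actual)} for every cell where a form deviates."""
    actual = Counter(form_items)
    cells = set(blueprint) | set(actual)
    return {
        cell: (blueprint.get(cell, 0), actual.get(cell, 0))
        for cell in cells
        if blueprint.get(cell, 0) != actual.get(cell, 0)
    }


# Each item is represented only by its cell tag; item text is omitted.
form_a = (
    [("pharmacology", "recall")] * 2
    + [("pharmacology", "application")] * 3
    + [("infection control", "recall")] * 1
    + [("infection control", "application")] * 2
)
# Form B covers the same subject areas and has the same length,
# but it over-weights recall items.
form_b = (
    [("pharmacology", "recall")] * 4
    + [("pharmacology", "application")] * 1
    + [("infection control", "recall")] * 2
    + [("infection control", "application")] * 1
)

mismatches_a = cell_mismatches(form_a, BLUEPRINT)
mismatches_b = cell_mismatches(form_b, BLUEPRINT)

print("Form A deviations:", mismatches_a)
print("Form B deviations:", mismatches_b)
```

Form B has the same length and subject areas as Form A yet fails the check in every cell. Passing this check is necessary but not sufficient: matching the difficulty distribution and statistical properties requires item-level data that this sketch does not model.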
Question 5 Short Answer
Why are test specifications essential for constructing parallel forms of a high-stakes examination?
Model answer: Parallel forms must measure the same construct with the same precision so that different test-takers can be compared on a common standard. A specification table ensures both forms have identical numbers of items per content area and cognitive level, identical difficulty distributions, and equivalent construct coverage. Without a shared blueprint, there is no way to verify that two forms measure the same thing — one might assess mostly recall while the other assesses mostly application, producing incomparable scores that cannot support fair decisions.
This is the practical payoff of specifications: enabling defensible, legally sound parallel forms for licensure and certification testing where different examinees receive different items. Students who understand this see the specification table not as bureaucratic overhead but as the mechanism that makes equitable large-scale assessment possible.