Questions: Split-Half Reliability and the Spearman-Brown Prophecy Formula
5 questions to test your understanding
Question 1 Multiple Choice
A 100-item test is split into odd and even halves. The correlation between the two half-scores is r = .76. What should be reported as the split-half reliability of the full 100-item test?
A. The split-half reliability is approximately .86: the half-test correlation of .76 must be stepped up with the Spearman-Brown formula
B. The split-half reliability is .76: the correlation between the two halves is the reliability estimate
C. The split-half reliability is .76 / 2 = .38: the correlation must be halved since each half is half as long
D. No reliability can be estimated from a single administration; a second administration is required
Answer: A
The raw correlation of .76 is the reliability of a 50-item test, half the actual length. Reporting .76 as the full test's reliability would systematically underestimate it. The Spearman-Brown formula corrects for this: 2(.76) / (1 + .76) = 1.52 / 1.76 ≈ .864. The formula works because reliability increases predictably with test length: it projects the reliability of the test you would get by doubling the 50-item half with parallel items. Treating the half-test correlation of .76 as the full-test reliability is the classic mistake.
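The arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function name `split_half_corrected` is made up for illustration and does not come from any particular library.

```python
def split_half_corrected(r_half: float) -> float:
    """Spearman-Brown step-up for a test split into two halves (k = 2)."""
    if not -1.0 < r_half <= 1.0:
        raise ValueError("r_half must lie in (-1, 1]")
    return 2 * r_half / (1 + r_half)


r_half = 0.76                      # correlation between odd and even half-scores
r_full = split_half_corrected(r_half)

print(f"half-test correlation:           {r_half:.3f}")
print(f"estimated full-test reliability: {r_full:.3f}")  # 0.864
```

The corrected value is always at least as large as a positive half-test correlation, which is why reporting the raw .76 understates the full test's reliability.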
Question 2 Multiple Choice
A researcher splits a 60-item test into items 1–30 versus items 31–60 and reports a correlation of .68. A colleague uses odd-even items instead and reports .74. Why does the split type matter?
A. Fatigue and strategy drift late in the test systematically depress scores in the second half, deflating the first-half/second-half correlation, not because the test is unreliable, but because the halves were taken under different conditions
B. The first-half/second-half split includes harder items in the second half, creating a difficulty imbalance that lowers validity
C. The odd-even split artificially inflates reliability by mixing item types across halves
D. Both splits should produce the same result; the difference reflects random sampling error
Answer: A
The first-half/second-half split confounds reliability with test-taking conditions. If participants tire, lose concentration, or change strategy as the test progresses, their second-half scores differ from what they would be under fresh conditions. Because these effects vary from person to person, they add variance to the second-half scores that is unrelated to the construct, which lowers the correlation and makes the test look less reliable than it actually is. Odd-even splitting spreads the effects of fatigue and practice roughly equally across both halves, which removes most of the confound. The gap between .68 and .74 is therefore a genuine measurement artifact, not random sampling error.
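The two splitting schemes can be sketched in Python as follows. The helper names `pearson` and `half_scores` and the small response matrix are hypothetical, made up purely to show the mechanics; a dataset this small says nothing about fatigue effects.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def half_scores(responses, scheme):
    """Return per-person total scores on each half of the test.

    responses: one list of item scores per person, in administration order.
    scheme: "odd_even" or "first_second".
    """
    half_a, half_b = [], []
    for row in responses:
        if scheme == "odd_even":
            a, b = row[0::2], row[1::2]   # items 1, 3, 5, ... vs 2, 4, 6, ...
        elif scheme == "first_second":
            mid = len(row) // 2
            a, b = row[:mid], row[mid:]   # first half vs second half
        else:
            raise ValueError(f"unknown scheme: {scheme!r}")
        half_a.append(sum(a))
        half_b.append(sum(b))
    return half_a, half_b


# Hypothetical 0/1 responses: 5 people x 6 items (illustrative data only).
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
]

for scheme in ("odd_even", "first_second"):
    a, b = half_scores(responses, scheme)
    r_half = pearson(a, b)
    r_full = 2 * r_half / (1 + r_half)    # Spearman-Brown step-up, k = 2
    print(f"{scheme}: r_half = {r_half:.3f}, corrected = {r_full:.3f}")
```

With real data, running both schemes side by side is a quick way to see whether item position is affecting the reliability estimate.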
Question 3 True / False
The raw correlation between the two halves of a split-half reliability analysis underestimates the full test's reliability because it reflects only the consistency of a half-length test.
T. True
F. False
Answer: True
This is the core insight behind the Spearman-Brown correction. Reliability increases with test length: more items mean broader sampling of the construct and more averaging out of random error. For a 100-item test, the half-test correlation estimates how reliable 50 items are, not 100. Because the full test is more reliable than a half-length version built from comparable items, reporting the raw half-test correlation as the full test's reliability is a systematic underestimate. The Spearman-Brown formula predicts how much reliability increases when the test length is doubled.
Question 4 True / False
The Spearman-Brown prophecy formula can only be used to predict the reliability of a test that is exactly twice as long as the test it was calibrated on.
T. True
F. False
Answer: False
The Spearman-Brown formula generalizes to any length multiplier, not just doubling. The full formula predicts the reliability of a test k times as long as the original: r_kk = kr / (1 + (k-1)r). The standard split-half application uses k = 2, but the same logic applies if you want to predict the reliability of a test three times as long (k = 3) or half as long (k = 0.5). This makes Spearman-Brown a general tool for test length planning, not just a split-half correction.
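As a rough illustration of the general formula, the sketch below evaluates r_kk = kr / (1 + (k-1)r) for several length multipliers. The function name `spearman_brown` is chosen here for illustration, and the starting reliability of .76 is just an example value.

```python
def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability of a test k times as long as one with reliability r.

    Assumes the added (or removed) items are parallel to the existing ones.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    if k <= 0:
        raise ValueError("k must be positive")
    return k * r / (1 + (k - 1) * r)


r = 0.76
print(f"k = 2   -> {spearman_brown(r, 2):.3f}")    # 0.864 (split-half case)
print(f"k = 3   -> {spearman_brown(r, 3):.3f}")    # 0.905
print(f"k = 0.5 -> {spearman_brown(r, 0.5):.3f}")  # 0.613
```

Applying k = 2 and then k = 0.5 returns the original value, which is a handy sanity check that lengthening and shortening are inverse operations under this formula.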
Question 5 Short Answer
Why is the Spearman-Brown correction necessary when reporting split-half reliability, and what would a researcher be claiming if they skipped it?
Model answer: The Spearman-Brown correction is necessary because the correlation between the two half-tests estimates the reliability of a test that is half as long as the actual test. Skipping it and reporting the raw correlation would implicitly claim that the full test is no more reliable than half of it, which is a systematic underestimate. The correction predicts the reliability of the full-length test by applying the mathematical relationship between test length and reliability: more items reduce the influence of any single item's measurement error, so the full test is more reliable than either half.
The deeper principle is that reliability is a function of test length, holding all else constant. This is why longer tests are used for high-stakes decisions: a 100-item licensure exam is more reliable than a 10-item quiz built from comparable items measuring the same construct. Spearman-Brown makes this relationship quantitative, allowing researchers to extrapolate from a half-test observation to a full-test prediction. It also works in reverse: if you need a test with reliability .90 and your 40-item version has reliability .80, solving the formula for k gives a length factor of 2.25, or 90 items in total, so about 50 parallel items would need to be added.
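The reverse use of the formula can be sketched by solving r_kk = kr / (1 + (k-1)r) for k, which gives k = r_target(1 - r) / (r(1 - r_target)). The function name `length_factor` below is an assumption for illustration, and the result presumes the added items are parallel to the existing ones.

```python
import math


def length_factor(r_current: float, r_target: float) -> float:
    """Length multiplier k needed to move reliability from r_current to r_target."""
    if not (0.0 < r_current < 1.0 and 0.0 < r_target < 1.0):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return r_target * (1 - r_current) / (r_current * (1 - r_target))


current_items = 40
k = length_factor(0.80, 0.90)

# Round before taking the ceiling so floating-point noise
# (e.g. 90.00000000000001) does not add a spurious extra item.
items_needed = math.ceil(round(current_items * k, 6))

print(f"length factor k:  {k:.2f}")                        # 2.25
print(f"total items:      {items_needed}")                 # 90
print(f"items to add:     {items_needed - current_items}") # 50
```

In practice the new items are rarely perfectly parallel, so the projected item count is better read as a planning estimate than a guarantee.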