Questions: Cronbach's Alpha and Internal Consistency Reliability
5 questions to test your understanding
Question 1 Multiple Choice
A researcher develops a 40-item 'test anxiety' scale by writing eight slight variations of each of five core items (e.g., 'I feel nervous before tests,' 'I feel anxious before exams,' etc.). The scale produces Cronbach's alpha = .96. What is the most accurate evaluation of this scale?
A. It is an excellent scale: an alpha of .96 demonstrates outstanding reliability and thorough measurement
B. The high alpha likely reflects item redundancy: the scale is repeating the same narrow content rather than sampling the anxiety domain broadly, so precision is illusory
C. The alpha is artificially inflated because 40 items violates the assumptions underlying Cronbach's formula
D. The scale would be improved by removing items until alpha drops to the .70–.80 range
Answer: B
An alpha above .90 is often a warning sign, not a trophy. When it arises from paraphrased items, it means the scale has sampled the domain very narrowly: here it is asking five questions eight ways each. The apparent precision is illusory because each paraphrase adds almost no new information. A shorter, more diverse set of items covering the full anxiety domain would likely yield a lower alpha but better construct coverage. Option A is the common misconception of treating higher alpha as uniformly better. Option C is wrong because the alpha formula places no limit on the number of items. Option D treats a target alpha as the goal, when the real fix is to replace redundant items with ones that broaden content coverage.
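A small simulation can make the redundancy point concrete. The sketch below is illustrative only: the data-generating model (a latent trait, plus noise specific to each core item, plus small wording noise for each paraphrase), the noise levels, the sample size, and the helper names `cronbach_alpha` and `squared_corr` are all assumptions, not part of the original question.

```python
import random
from statistics import mean, variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def squared_corr(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

rng = random.Random(0)                      # fixed seed for repeatability
n_people, n_cores, n_paraphrases = 2000, 5, 8

traits, long_form, short_form = [], [], []
for _ in range(n_people):
    trait = rng.gauss(0, 1)                 # hypothetical latent anxiety level
    cores = [trait + rng.gauss(0, 1) for _ in range(n_cores)]
    # Each core item is asked 8 times with only slight wording noise.
    items = [c + rng.gauss(0, 0.3) for c in cores for _ in range(n_paraphrases)]
    traits.append(trait)
    long_form.append(items)                       # all 40 items
    short_form.append(items[::n_paraphrases])     # one paraphrase per core item

alpha_long = cronbach_alpha(long_form)
alpha_short = cronbach_alpha(short_form)
r2_long = squared_corr([sum(r) for r in long_form], traits)
r2_short = squared_corr([sum(r) for r in short_form], traits)

print(f"40 items: alpha={alpha_long:.2f}, r^2 with trait={r2_long:.2f}")
print(f" 5 items: alpha={alpha_short:.2f}, r^2 with trait={r2_short:.2f}")
```

Under these assumptions the 40-item form reports a much higher alpha than the 5-item form, yet the two total scores track the latent trait about equally well. The extra items mostly re-measure the noise in each core item rather than adding information about anxiety.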
Question 2 Multiple Choice
Which of the following best describes the relationship between scale length and Cronbach's alpha?
A. Scale length has no effect on alpha; only the average inter-item correlation matters
B. Adding more items always decreases alpha by introducing more measurement error
C. Adding items that are at least moderately correlated with existing items increases alpha, even if the average inter-item correlation is unchanged
D. Alpha is maximized by using exactly 10 items; more or fewer both reduce it
Answer: C
For a fixed average inter-item correlation r, standardized alpha equals kr / (1 + (k − 1)r), which rises toward 1 as the number of items k grows. Longer scales therefore produce higher alpha even when item quality is held constant, because item-specific error averages out across more items and makes up a smaller share of total-score variance. (The k/(k − 1) multiplier in the raw formula is not the source of this effect: it shrinks toward 1 as k increases.) The practical implication is that an alpha of .75 from a well-designed 5-item scale may represent better measurement than an alpha of .90 from a bloated 50-item scale with redundant items. Alpha reflects both item quality and scale length simultaneously.
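The dependence on scale length can be checked numerically. The sketch below uses the standardized-alpha (Spearman-Brown) form, assuming equally weighted items that share one average inter-item correlation; the function name `standardized_alpha` and the value r = .30 are illustrative choices.

```python
def standardized_alpha(k: int, mean_r: float) -> float:
    """Standardized alpha for k items with average inter-item correlation mean_r."""
    if k < 2:
        raise ValueError("alpha needs at least two items")
    return k * mean_r / (1 + (k - 1) * mean_r)

# Same item quality (r = .30), different scale lengths.
for k in (5, 10, 20, 40):
    print(k, round(standardized_alpha(k, 0.30), 3))
# 5 0.682
# 10 0.811
# 20 0.896
# 40 0.945
```

With item quality held at r = .30, going from 5 to 40 items moves alpha from below the conventional .70 threshold to above .90, which is why alpha should be read alongside the number of items.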
Question 3 True / False
A Cronbach's alpha of .85 on a 10-item scale is consistent with the scale being either unidimensional or multidimensional — alpha alone cannot tell the difference.
True
False
Answer: True
Alpha measures whether items covary (internal consistency), not whether they measure a single underlying dimension. A scale mixing two moderately correlated factors can produce a respectable alpha. Conversely, a genuinely unidimensional scale with heterogeneous item difficulties can produce a low alpha. Assessing dimensionality requires factor analysis. This is why alpha should be treated as a necessary but not sufficient indicator of scale quality: it must be paired with structural validity evidence.
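To see why alpha cannot separate the two cases, it helps to compute it directly from a correlation matrix. The sketch below is a constructed example, not data from any real scale: the correlations (.60 within a factor, .20 across factors) and the helper names `alpha_from_cov` and `two_factor_corr` are assumptions chosen for illustration.

```python
def alpha_from_cov(cov):
    """Cronbach's alpha from a k x k covariance (or correlation) matrix."""
    k = len(cov)
    trace = sum(cov[i][i] for i in range(k))      # sum of item variances
    total = sum(sum(row) for row in cov)          # variance of the total score
    return k / (k - 1) * (1 - trace / total)

def two_factor_corr(per_factor, within, between):
    """Correlation matrix for two item clusters of equal size."""
    k = 2 * per_factor
    matrix = []
    for i in range(k):
        row = []
        for j in range(k):
            if i == j:
                row.append(1.0)
            elif i // per_factor == j // per_factor:
                row.append(within)                # same cluster
            else:
                row.append(between)               # different clusters
        matrix.append(row)
    return matrix

# Ten items in two clusters of five.
two_factor = two_factor_corr(5, within=0.60, between=0.20)

# A one-factor scale with the same average inter-item correlation:
# 20 within-cluster pairs and 25 cross-cluster pairs, 45 pairs in total.
mean_r = (20 * 0.60 + 25 * 0.20) / 45
one_factor = two_factor_corr(5, within=mean_r, between=mean_r)

alpha_two = alpha_from_cov(two_factor)
alpha_one = alpha_from_cov(one_factor)
print(round(alpha_two, 3), round(alpha_one, 3))
# 0.859 0.859
```

Because alpha depends only on the sum of the off-diagonal entries, not on how they are arranged, the clustered and uniform matrices give the same value. Only a method that looks at the pattern of correlations, such as factor analysis, can tell them apart.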
Question 4 True / False
Cronbach's alpha is the best single indicator of whether a psychological scale is measuring what it claims to measure.
True
False
Answer: False
Alpha measures internal consistency — whether items co-vary — but says nothing about whether they measure the right construct (validity). A scale whose items all measure social desirability rather than the intended construct might have excellent alpha and no validity whatsoever. Alpha also cannot detect multidimensionality, redundancy, or poor item wording. Construct validity requires correlations with external criteria, factor structure evidence, and theoretical alignment. Alpha is one reliability indicator, not a validity indicator.
Question 5 Short Answer
Why is Cronbach's alpha insufficient on its own to validate a psychological scale? What does it fail to tell you, and what additional evidence is needed?
Model answer: Alpha only confirms that items tend to rise and fall together — internal consistency. It says nothing about whether items measure the intended construct (construct validity), whether they tap a single dimension or multiple factors (dimensionality), or whether the inter-item covariance is meaningful or merely reflects redundancy. A scale measuring 'openness to experience' might achieve alpha = .88 by including items that all measure verbal ability, with no actual connection to openness. Factor analysis is needed to assess unidimensionality; correlations with external criteria and theory-based predictions are needed to assess construct validity.
The sequence for scale validation should be: (1) use alpha as a lower-bound reliability estimate; (2) use factor analysis to assess dimensionality; (3) test construct validity through convergent correlations (with similar constructs), discriminant correlations (with dissimilar constructs), and predictive validity (does it predict what it should?). Alpha at the end of step 1 is just the entry ticket — it does not substitute for steps 2 and 3.