Questions: Factor Analysis and Dimensionality Reduction
5 questions to test your understanding
Question 1 Multiple Choice
A researcher uses EFA on a 20-item survey to discover a 3-factor structure, then runs CFA on the same dataset and obtains excellent fit (CFI = .97, RMSEA = .04). Can the researcher claim the 3-factor structure is confirmed?
A. Yes: the excellent CFA fit indices independently confirm that the factor structure is correct
B. No: using the same data for both EFA and CFA is circular. The CFA model was built to fit these correlations, so good fit is expected by construction and is not evidence of generalizability
C. Yes: CFI > .95 is a universal benchmark proving the factor structure reflects reality
D. No, but only because CFA requires a minimum of 500 participants to produce valid fit statistics
Answer: B
This is the EFA-CFA circularity trap. EFA derives the factor structure by finding the best-fitting solution in a specific dataset. CFA then 'tests' whether that structure fits, but on the same data the structure came from. Good fit is unsurprising because the model was tuned to those correlations, sampling noise included. Genuine confirmation requires an independent sample in which the pre-specified model is tested on data it has never seen. EFA is hypothesis-generating and CFA is hypothesis-testing, so they require separate samples.
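One common way to get two independent samples from a single data collection is a random split before any analysis is run. The sketch below only produces the row indices for each half. The function name `split_sample`, the 50/50 split, and the sample size of 600 are illustrative assumptions, not part of the quiz material.

```python
import numpy as np

def split_sample(n_rows, efa_fraction=0.5, seed=0):
    """Return two disjoint index arrays: one for EFA, one for CFA."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_rows)           # random order of row indices
    cut = int(round(n_rows * efa_fraction))      # size of the EFA half
    return np.sort(shuffled[:cut]), np.sort(shuffled[cut:])

# Hypothetical survey with 600 respondents.
efa_idx, cfa_idx = split_sample(600)
print(len(efa_idx), len(cfa_idx))
```

The EFA would be run only on the rows in `efa_idx`, and the resulting model would then be fixed and tested on the rows in `cfa_idx`. The split has to happen before looking at the correlations, otherwise the second half is no longer unseen data.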
Question 2 Multiple Choice
A researcher applies Kaiser's rule (eigenvalue > 1) and retains 7 factors from a 25-item scale. A colleague suggests this may be too many. What should the researcher do?
A. Accept the 7-factor solution: Kaiser's rule is the definitive standard for factor retention
B. Cross-check with a scree plot and parallel analysis, since Kaiser's rule often retains too many factors and the retention decision is a theory-informed judgment, not a mechanical rule
C. Halve the number of factors to 3 or 4, since social constructs rarely have more than 4 dimensions
D. Use only the first factor, since the first eigenvalue is always the most meaningful
Answer: B
Kaiser's rule (eigenvalue > 1) is widely used but widely criticized because it tends to over-retain factors: with larger item sets, several sample eigenvalues will exceed 1 through sampling error alone. Parallel analysis, which compares the observed eigenvalues to those from random data with the same number of cases and items, provides a more defensible baseline. The scree plot can reveal a natural elbow. Most importantly, the retention decision should be guided by theory: how many dimensions make conceptual sense? Factor retention is a judgment call informed by multiple criteria, not a deterministic formula.
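The comparison that parallel analysis makes can be sketched in a few lines of NumPy. This is a minimal version under stated assumptions: it uses eigenvalues of the full correlation matrix (the principal-components variant), normal random data, and the 95th percentile as the cutoff. The function name `parallel_analysis` and the simulated two-factor data are hypothetical and only serve as an illustration.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Horn-style parallel analysis on correlation-matrix eigenvalues."""
    rng = np.random.default_rng(seed)
    n_cases, n_items = data.shape

    def sorted_eigs(x):
        # Eigenvalues of the item correlation matrix, largest first.
        return np.linalg.eigvalsh(np.corrcoef(x, rowvar=False))[::-1]

    observed = sorted_eigs(data)
    # Eigenvalues from random data with the same number of cases and items.
    random_eigs = np.array([
        sorted_eigs(rng.standard_normal((n_cases, n_items)))
        for _ in range(n_sims)
    ])
    threshold = np.percentile(random_eigs, percentile, axis=0)

    # Retain leading factors until an observed eigenvalue fails to beat chance.
    exceeds = observed > threshold
    n_retain = n_items if exceeds.all() else int(np.argmin(exceeds))
    return n_retain, observed, threshold

# Simulated data (assumption): 500 cases, 8 items, 2 uncorrelated factors,
# every item loading .70 on exactly one factor.
rng = np.random.default_rng(1)
factors = rng.standard_normal((500, 2))
loadings = np.zeros((8, 2))
loadings[:4, 0] = 0.7
loadings[4:, 1] = 0.7
unique = rng.standard_normal((500, 8)) * np.sqrt(1 - 0.7 ** 2)
items = factors @ loadings.T + unique

n_retain, observed, threshold = parallel_analysis(items)
kaiser_count = int((observed > 1).sum())
print(n_retain, kaiser_count)
```

The number returned here is one input to the retention decision, alongside the scree plot and theory. It is not a replacement for that judgment.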
Question 3 True / False
Oblique rotation is often preferable to orthogonal rotation in social science factor analysis because real psychological constructs are rarely completely uncorrelated with each other.
T. True
F. False
Answer: True
Orthogonal rotation (e.g., Varimax) constrains factors to be uncorrelated, which produces clean, simple loading patterns but imposes an unrealistic assumption when the constructs are theoretically related. Extraversion and positive affect, or anxiety and neuroticism, are not uncorrelated in reality. Oblique rotation (e.g., Oblimin, Promax) allows factors to correlate, producing a more realistic solution. The trade-off is a slightly more complex interpretation: you need both a pattern matrix (each item's unique relation to each factor) and a structure matrix (each item's correlation with each factor), along with the factor correlations. The gain in realism usually outweighs that cost.
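The link between the two matrices is that the structure matrix equals the pattern matrix multiplied by the factor correlation matrix. The short sketch below illustrates this with made-up numbers. The loadings and the factor correlation of .40 are hypothetical values chosen for the example, not results from any real scale.

```python
import numpy as np

# Hypothetical pattern matrix: 4 items, 2 correlated factors.
pattern = np.array([
    [0.70, 0.00],
    [0.65, 0.05],
    [0.00, 0.75],
    [0.10, 0.60],
])

# Hypothetical factor correlation matrix (factors correlate .40).
phi = np.array([
    [1.0, 0.4],
    [0.4, 1.0],
])

# Structure matrix: item-factor correlations implied by pattern and phi.
structure = pattern @ phi
print(np.round(structure, 2))

# With uncorrelated factors (phi = identity) the two matrices coincide,
# which is why orthogonal rotation needs only one loading matrix.
structure_orthogonal = pattern @ np.eye(2)
```

Under these assumed values, the first item has a pattern loading of zero on the second factor yet still correlates with it, purely because the two factors are correlated. That difference is why both matrices are reported after an oblique rotation.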
Question 4 True / False
If an item loads .70 on a factor in exploratory factor analysis, this is strong evidence that the item is a valid measure of the psychological construct the factor represents.
T. True
F. False
Answer: False
This conflates statistical coherence with construct validity. A high loading means the item shares substantial variance with the other items on that factor: they co-vary. But what the factor 'is' depends on theoretical interpretation of the common thread among high-loading items, not on the loading values alone. High loadings are evidence about internal structure, which sits close to reliability. On their own they do not show that the factor is the intended construct. That requires showing the factor relates to external criteria in theoretically expected ways, which loading size cannot establish.
Question 5 Short Answer
Why must exploratory factor analysis and confirmatory factor analysis be conducted on separate samples to provide genuine evidence about a scale's factor structure?
Model answer: EFA discovers a factor structure by finding the solution that best fits the correlation patterns in a given dataset. It is data-driven and capitalizes on that sample's specific covariances. CFA tests whether a pre-specified structure fits new data. If the CFA model is derived from EFA on the same sample, the model was built to fit those correlations, so good CFA fit is largely built in and close to tautological. An independent sample allows CFA to test whether the EFA-derived structure generalizes beyond the discovery data, which is the actual scientific question.
This is the discovery-vs-confirmation distinction: EFA generates hypotheses, CFA tests them. Running both on the same data conflates these roles. The same issue arises in any model-building context: a model almost always fits the data it was built on better than it fits new data. Only independent replication distinguishes a model that captures real structure from one that merely describes sampling noise.
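The general point about fitting and testing on the same data can be shown outside factor analysis. The sketch below is an analogy, not an EFA or CFA: it fits an ordinary least-squares regression to pure noise, so there is no real structure to find, and then compares the error on the fitting sample with the error on fresh data. The sample sizes and the number of predictors are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_predictors = 60, 1000, 30

# Predictors and outcome are independent noise: the true relation is zero.
X_train = rng.standard_normal((n_train, n_predictors))
y_train = rng.standard_normal(n_train)
X_test = rng.standard_normal((n_test, n_predictors))
y_test = rng.standard_normal(n_test)

# Fit on the training sample only.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Mean squared error on the data used for fitting vs. on unseen data.
train_mse = np.mean((y_train - X_train @ coef) ** 2)
test_mse = np.mean((y_test - X_test @ coef) ** 2)
print(train_mse, test_mse)
```

The fitted model looks better on its own sample than on new data even though it captured nothing real. An EFA-derived structure re-tested on its discovery sample is exposed to the same optimism.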