Questions: Principal Component Analysis

5 questions to test your understanding

Question 1 Multiple Choice

You run PCA on a 100-feature dataset. The first 3 principal components explain 82% of total variance. A colleague says 'PCA found the 3 most important features.' What is wrong with this statement?

A. Nothing is wrong; PCA selects the 3 features with the highest variance
B. PCA found 3 new axes that are linear combinations of all 100 original features, not a subset of 3 features
C. PCA selects features by correlation, not by variance, so 82% refers to correlation explained
D. The colleague is right, except the number should be higher: PCA typically retains at least 10 features
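
To check your answer to Question 1 after committing to it, the following is a minimal sketch, not a reference implementation. It builds a synthetic 100-feature dataset (the sample count, the 3 latent factors, and the noise level are all assumptions made for illustration), computes the principal axes via SVD of the centered data, and counts how many original features carry a non-negligible weight in each of the first 3 axes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 100 features driven by 3 latent factors plus noise.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 100))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 100))

# Center, then take the SVD; the rows of Vt are the principal axes.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Fraction of total variance explained by each component.
explained = S**2 / np.sum(S**2)

# Each of the first 3 axes is a length-100 weight vector over the original features.
first_three = Vt[:3]
nonzero_loadings = np.count_nonzero(np.abs(first_three) > 1e-6, axis=1)

print("variance explained by first 3 components:", explained[:3].sum())
print("features with non-negligible weight per axis:", nonzero_loadings)
```

Comparing the loading counts with the number 3 shows whether the components behave like a subset of features or like weighted mixtures of many features.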
Question 2 Multiple Choice

A dataset's true structure lies on a two-dimensional Swiss roll (a curved, spiral surface) embedded in three-dimensional space. You apply PCA to reduce to 2 dimensions. What will most likely happen?

A. PCA will perfectly recover the 2D structure, since the data truly lives in 2 dimensions
B. PCA will fail to capture the intrinsic structure because it can only find flat (linear) subspaces, and no flat plane efficiently aligns with a curved manifold
C. PCA will fail because it cannot handle 3D data; it only works on high-dimensional datasets
D. PCA will succeed if you first normalize the features, since normalization linearizes the structure
Question 3 True / False

The first principal component is the eigenvector of the covariance matrix corresponding to the largest eigenvalue, and it points in the direction of maximum variance in the data.

T. True
F. False
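
The statement in Question 3 can be tested numerically once you have chosen an answer. The sketch below uses a made-up 2D dataset (the mixing matrix and sample size are assumptions) and compares the sample variance of the data projected onto the top eigenvector of the covariance matrix with the variance along a sweep of other unit directions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical correlated 2D data.
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])
Xc = X - X.mean(axis=0)

# Sample covariance matrix (features in columns).
cov = np.cov(Xc, rowvar=False)

# eigh returns eigenvalues in ascending order for symmetric matrices.
eigvals, eigvecs = np.linalg.eigh(cov)
pc1 = eigvecs[:, -1]  # eigenvector paired with the largest eigenvalue

# Variance of the data projected onto pc1.
var_along_pc1 = np.var(Xc @ pc1, ddof=1)

# Variance along unit directions swept over half a circle.
angles = np.linspace(0.0, np.pi, 181)
dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)
var_along_dirs = np.var(Xc @ dirs.T, axis=0, ddof=1)

print("largest eigenvalue:      ", eigvals[-1])
print("variance along pc1:      ", var_along_pc1)
print("max variance over sweep: ", var_along_dirs.max())
```

The three printed numbers let you compare the largest eigenvalue, the variance along its eigenvector, and the best variance any swept direction achieves.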
Question 4 True / False

PCA removes noise from a dataset by keeping mainly the principal components with large eigenvalues and discarding the rest.

T. True
F. False
Question 5 Short Answer

Why must data be centered (mean-subtracted) before applying PCA, and what artifact arises if this step is skipped?

Think about your answer, then reveal below.
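
One way to check your reasoning for Question 5 is a small experiment. The sketch below is an illustration under assumed values: the data spread mostly along the x-axis, but the cloud sits far from the origin along the y-axis. It compares the leading singular direction of the raw data matrix with that of the mean-subtracted matrix; `top_direction` is a hypothetical helper defined here, not a library function.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: large spread along x, small spread along y,
# but with the mean shifted far from the origin along y.
X = rng.normal(size=(1000, 2)) * np.array([2.0, 0.2]) + np.array([0.0, 50.0])

def top_direction(A):
    """Return the leading right singular vector of A (hypothetical helper)."""
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[0]

uncentered = top_direction(X)                   # centering step skipped
centered = top_direction(X - X.mean(axis=0))    # standard PCA preprocessing

print("leading direction without centering:", uncentered)
print("leading direction with centering:   ", centered)
print("data mean:                          ", X.mean(axis=0))
```

Comparing the two directions with the data mean and with the axis of greatest spread shows what the first component ends up describing in each case.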