Questions: Feature Scaling and Normalization

5 questions to test your understanding

Question 1 Multiple Choice

A data scientist computes mean and standard deviation from the full dataset (train + test combined), then splits into train/test sets and applies that scaler to both. What is the problem with this approach?

A. The scaler will produce different ranges for training and test data, causing model instability

B. Information from the test set leaks into the training process, producing overly optimistic performance estimates that won't hold on truly unseen data

C. Standardization cannot be applied after a train/test split; the data must be split first, then scaled separately with different scalers

D. This approach is fine as long as the test set is large enough to represent the population
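
To make the scenario in Question 1 concrete, here is a minimal sketch that computes the scaling statistics both ways: from the combined data, as described in the question, and from the training split only. The toy values and the helper names `fit_scaler` and `transform` are hypothetical and chosen only for illustration; the sketch uses the population standard deviation from Python's `statistics` module.

```python
from statistics import mean, pstdev

# Hypothetical toy feature values; the last two act as the held-out test set.
values = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0]
train, test = values[:4], values[4:]


def fit_scaler(data):
    """Return the (mean, population std) pair that defines a standard scaler."""
    return mean(data), pstdev(data)


def transform(data, scaler):
    """Standardize data using previously computed statistics."""
    mu, sigma = scaler
    return [(x - mu) / sigma for x in data]


# Ordering described in the question: statistics come from train + test.
full_scaler = fit_scaler(values)

# Alternative ordering: statistics come from the training split only.
train_scaler = fit_scaler(train)

print("fit on full data :", full_scaler)
print("fit on train only:", train_scaler)
print("train scaled (full fit) :", transform(train, full_scaler))
print("train scaled (train fit):", transform(train, train_scaler))
```

Comparing the two printed scalers shows how much the test values influence the statistics that the training data is scaled with.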

Question 2 Multiple Choice

Which type of machine learning model is LEAST sensitive to whether features are scaled?

A. K-nearest neighbors (KNN)

B. Support vector machine with RBF kernel

C. Logistic regression

D. Random forest

Question 3 True / False

The scaler must be fit inside each cross-validation fold, using only that fold's training portion. Fitting the scaler on the full training set before splitting into folds leaks information from each validation fold.

T. True

F. False
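
As an illustration of what "inside each fold" means in Question 3, the following sketch recomputes the scaling statistics for every fold from that fold's training portion and then applies them to the held-out portion. The data, the fold count `k`, and the contiguous fold layout are assumptions made for this example; a real workflow would typically rely on a library pipeline rather than a hand-written loop.

```python
from statistics import mean, pstdev

# Hypothetical training feature values, split into k contiguous folds.
values = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
k = 3
fold_size = len(values) // k

fold_stats = []  # (mean, std) fitted in each fold, kept for inspection

for i in range(k):
    start, stop = i * fold_size, (i + 1) * fold_size
    val_part = values[start:stop]                # held-out validation fold
    train_part = values[:start] + values[stop:]  # remaining folds

    # Fit the scaler on this fold's training portion only.
    mu, sigma = mean(train_part), pstdev(train_part)
    fold_stats.append((mu, sigma))

    # Apply the fitted statistics to the validation fold without refitting.
    val_scaled = [(x - mu) / sigma for x in val_part]
    print(f"fold {i}: mean={mu:.2f}, std={sigma:.2f}, val_scaled={val_scaled}")
```

Each fold prints its own mean and standard deviation, because the validation values are excluded from the statistics every time.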

Question 4 True / False

Min-max normalization is preferred over standardization when the dataset contains significant outliers, because it compresses outliers into the [0, 1] range and prevents them from distorting the scaling.

T. True

F. False

Question 5 Short Answer

Why does failing to scale features harm distance-based algorithms like k-nearest neighbors, even when all features are genuinely important predictors?
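
One way to explore Question 5 is to compute Euclidean distances on unscaled data and inspect how much each feature contributes. The sketch below uses made-up records with two features on very different scales (income in dollars, age in years); the values and the helper `contributions` are hypothetical.

```python
import math

# Hypothetical records: (annual income in dollars, age in years).
a = (50_000.0, 25.0)
b = (50_500.0, 65.0)  # similar income, very different age
c = (52_000.0, 26.0)  # somewhat different income, similar age


def contributions(p, q):
    """Per-feature squared differences that add up to the squared distance."""
    return [(x - y) ** 2 for x, y in zip(p, q)]


dist_ab = math.dist(a, b)
dist_ac = math.dist(a, c)

print("distance a-b:", dist_ab)
print("distance a-c:", dist_ac)
print("squared contributions a-b (income, age):", contributions(a, b))
print("squared contributions a-c (income, age):", contributions(a, c))
```

Looking at which record ends up nearer to `a`, and at the per-feature contributions, gives a starting point for the written answer.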
