Questions — Data Augmentation Techniques

Question 1: Multiple Choice

A researcher is training a model to classify handwritten digits (0–9). Which of the following augmentations would be INAPPROPRIATE because it could introduce incorrect training signal?

A. Randomly adding Gaussian noise to pixel values

B. Randomly scaling the image by 5–10%

C. Horizontally flipping the image (left-right mirror)

D. Slightly adjusting the brightness of the image
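
To make the four options concrete, here is a minimal sketch of each transformation on a toy 28x28 grayscale image. It uses plain nested lists to stay dependency-free; a real pipeline would normally use an image library. The function names, the noise level `sigma`, the brightness offset `delta`, and the nearest-neighbour resize are all illustrative assumptions, not part of the question. The sketch only shows what each operation does to the pixel grid and does not indicate which option is the answer.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale image, pixel values in [0.0, 1.0]


def clamp(value: float) -> float:
    """Keep a pixel value inside the valid [0.0, 1.0] range."""
    return max(0.0, min(1.0, value))


def add_gaussian_noise(img: Image, sigma: float, rng: random.Random) -> Image:
    """Option A: perturb every pixel with zero-mean Gaussian noise."""
    return [[clamp(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def scale_nearest(img: Image, factor: float) -> Image:
    """Option B: resize by `factor` using nearest-neighbour sampling."""
    height, width = len(img), len(img[0])
    new_h = max(1, round(height * factor))
    new_w = max(1, round(width * factor))
    return [
        [img[min(height - 1, int(r / factor))][min(width - 1, int(c / factor))]
         for c in range(new_w)]
        for r in range(new_h)
    ]


def flip_horizontal(img: Image) -> Image:
    """Option C: mirror the image left to right."""
    return [row[::-1] for row in img]


def adjust_brightness(img: Image, delta: float) -> Image:
    """Option D: shift every pixel by a constant offset."""
    return [[clamp(p + delta) for p in row] for row in img]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Toy 28x28 gradient standing in for a digit image (hypothetical data).
image = [[(r + c) / 54 for c in range(28)] for r in range(28)]

noisy = add_gaussian_noise(image, sigma=0.05, rng=rng)
scaled = scale_nearest(image, factor=rng.uniform(1.05, 1.10))
flipped = flip_horizontal(image)
brighter = adjust_brightness(image, delta=0.1)
```

Applying each function to an actual digit image and viewing the result is a quick way to check whether the original label still describes what you see.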

Question 2: Multiple Choice

A team applies aggressive data augmentation to a small medical imaging dataset — including random rotations up to 180°, horizontal/vertical flips, and random color channel inversions — and finds that model accuracy on the validation set *decreases* compared to training without augmentation. What is the most likely explanation?

A. Data augmentation always reduces accuracy in small datasets; it only helps with large datasets

B. Some augmentations destroyed label semantics (e.g., flipping a chest X-ray changes its clinical interpretation), introducing incorrect training signal

C. The model was too small to learn from augmented data and needed more parameters

D. Augmentation increases training time, causing the model to underfit due to insufficient epochs
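
For reference, the transformations named in the scenario can be sketched on a tiny RGB image as follows. This is only an illustration under stated assumptions: it shows the 180° extreme of the rotation range rather than a random angle, it represents pixels as `(R, G, B)` tuples of floats in [0.0, 1.0], and all function names and the toy data are hypothetical. It makes no claim about which transformations suit medical images.

```python
import random
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # (R, G, B), each in [0.0, 1.0]
RGBImage = List[List[Pixel]]


def rotate_180(img: RGBImage) -> RGBImage:
    """Rotate by 180 degrees: reverse the row order, then reverse each row."""
    return [row[::-1] for row in img[::-1]]


def flip_vertical(img: RGBImage) -> RGBImage:
    """Mirror the image top to bottom."""
    return [list(row) for row in img[::-1]]


def invert_channel(img: RGBImage, channel: int) -> RGBImage:
    """Replace every value v in one colour channel with 1 - v."""
    return [
        [tuple(1.0 - v if i == channel else v for i, v in enumerate(px))
         for px in row]
        for row in img
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Toy 2x2 RGB image (hypothetical data, not a real scan).
image = [
    [(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)],
    [(0.7, 0.8, 0.9), (1.0, 0.0, 0.5)],
]

rotated = rotate_180(image)
flipped = flip_vertical(image)
inverted = invert_channel(image, channel=rng.randrange(3))
```

Note that a 180° rotation is the same as a horizontal flip followed by a vertical flip, so these augmentations overlap in what they do to the image.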

Question 3: True / False

Data augmentation reduces overfitting by increasing the effective variety of the training set, making it harder for the model to memorize specific examples.

T. True

F. False

Question 4: True / False

Any image transformation that a human would still correctly label is a valid augmentation for model training — the primary requirement is that the label is preserved.

T. True

F. False

Question 5: Short Answer

Explain why domain knowledge is essential when choosing data augmentation strategies, using a specific example where an augmentation would be appropriate in one domain but harmful in another.

