Questions: Dropout Regularization

5 questions to test your understanding

Question 1 Multiple Choice

Why does dropout specifically prevent co-adaptation between neurons?

A. It randomly selects the most important neurons and discards the rest permanently

B. It forces each neuron to be useful independently of any particular subset of other neurons, since it cannot rely on specific partners always being present

C. It penalizes pairs of neurons with highly correlated activations directly in the loss function

D. It reduces the network to a single canonical subnetwork that is trained to convergence

Question 2 Multiple Choice

A model trained with 50% dropout (p = 0.5) is deployed for inference. What is the correct procedure for using the model's weights at test time?

A. Apply dropout with p = 0.5 and average predictions over many forward passes

B. Activate all neurons but multiply each weight by the keep probability 1 - p = 0.5 (or, equivalently, use inverted dropout and scale the surviving activations by 1 / (1 - p) = 2 during training)

C. Remove all dropout layers entirely and retrain for fine-tuning

D. Activate all neurons without any weight adjustment, since training already converged

Question 3 True / False

Dropout reduces overfitting by permanently removing redundant neurons from the network, resulting in a smaller, more regularized model after training.

True

False

Question 4 True / False

Dropout can be interpreted as simultaneously training an ensemble of 2ⁿ different thinned subnetworks (where n is the number of units that can be dropped), all sharing the same underlying weights.

True

False

Question 5 Short Answer

Explain why dropout is less effective (or even harmful) in small networks compared to large, overparameterized networks.