Questions: Dropout Regularization

5 questions to test your understanding

Question 1 Multiple Choice

Why does dropout specifically prevent co-adaptation between neurons?

A. It randomly selects the most important neurons and discards the rest permanently
B. It forces each neuron to be useful independently of any particular subset of other neurons, since it cannot rely on specific partners always being present
C. It penalizes pairs of neurons with highly correlated activations directly in the loss function
D. It reduces the network to a single canonical subnetwork that is trained to convergence
Question 2 Multiple Choice

A model trained with 50% dropout (p = 0.5) is deployed for inference. What is the correct procedure for using the model's weights at test time?

A. Apply dropout with p = 0.5 and average predictions over many forward passes
B. Activate all neurons but multiply each weight by 0.5 (or, equivalently, use inverted dropout and scale the surviving activations by 2 during training)
C. Remove all dropout layers entirely and retrain for fine-tuning
D. Activate all neurons without any weight adjustment, since training already converged
Question 3 True / False

Dropout reduces overfitting by permanently removing redundant neurons from the network, resulting in a smaller, more regularized model after training.

T. True
F. False
Question 4 True / False

Dropout can be interpreted as simultaneously training an ensemble of 2ⁿ different thinned subnetworks (where n is the number of units that can be dropped), all sharing the same underlying weights.

T. True
F. False
Question 5 Short Answer

Explain why dropout is less effective (or even harmful) in small networks compared to large, overparameterized networks.

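For reference once you have attempted the questions, here is a minimal sketch of the mask-and-scale mechanics they refer to, written in plain Python. It assumes the "inverted dropout" convention mentioned in Question 2; the function name inverted_dropout and its parameters are hypothetical and chosen only for illustration, not taken from any particular library.

```python
import random


def inverted_dropout(activations, p, training, rng=None):
    """Apply inverted dropout to a list of activations (illustrative sketch).

    p is the probability of dropping a unit and must satisfy 0 <= p < 1.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training:
        # Test time: every unit is active and nothing is rescaled.
        return list(activations)
    if rng is None:
        rng = random.Random()
    keep_scale = 1.0 / (1.0 - p)
    # A fresh random mask is sampled on every call. Kept units are scaled
    # by 1 / (1 - p) so the expected activation matches the test-time value.
    return [a * keep_scale if rng.random() >= p else 0.0 for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0] * 10000
    train_out = inverted_dropout(x, p=0.5, training=True, rng=rng)
    test_out = inverted_dropout(x, p=0.5, training=False)
    # The two means should be close: scaling during training compensates
    # for the units that were zeroed out.
    print(sum(train_out) / len(train_out))
    print(sum(test_out) / len(test_out))
```

Calling the function twice in training mode with the same input will generally zero out different units each time, which is a useful thing to keep in mind when revisiting Questions 3 and 4.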