Questions — Overparameterization Theory

Question 1 Short Answer

A neural network has 1 million parameters and is trained on 10,000 examples. Classical learning theory predicts severe overfitting. Under what conditions might the network still generalize well?

Question 2 Multiple Choice

Why does overparameterization make optimization EASIER, not harder?

A. Overparameterization has no effect on optimization difficulty

B. More parameters mean fewer local minima, reducing the chance of getting stuck

C. Overparameterization increases the volume of good solutions, making them easier for SGD to find through random search

D. Larger networks have flatter loss surfaces in overparameterized regimes, enabling faster gradient descent convergence

Question 3 Multiple Choice

What does the interpolation regime refer to in the context of overparameterization?

A. The regime where the network linearly interpolates between training examples

B. The regime where the network has enough capacity to fit (interpolate) all training examples while maintaining good test performance

C. The regime where batch size is held constant during training

D. The regime where the network weights converge to exact values

Question 4 Multiple Choice

Overparameterization theory suggests that implicit regularization helps limit overfitting in overparameterized networks. Which of the following is NOT a form of implicit regularization?

A. Early stopping: halting training before convergence to prevent overfitting

B. SGD noise: stochastic gradients add noise that regularizes the solution

C. Small weight initialization: initializing weights near zero biases toward low-norm solutions

D. Higher learning rate: larger learning rates lead to faster convergence and less overfitting
