Questions: Double Descent Phenomenon

4 questions to test your understanding

Question 1 (Short Answer)

The classical bias-variance tradeoff predicts that test error increases once model complexity exceeds a critical point. How does double descent reconcile this prediction with the good test performance of heavily overparameterized models?

Question 2 (Multiple Choice)

Why can overparameterized models achieve both zero training error AND good test performance (generalization), seemingly violating the principle that memorization leads to poor generalization?

A. Memorization and generalization are not actually contradictory; memorizing data with structure preserves that structure

B. Large models are 'implicit regularization machines': gradient descent naturally finds solutions that generalize even when fitting noise, due to the geometry of high-dimensional spaces and early stopping

C. Overparameterized models cannot memorize perfectly; they are forced to learn only general patterns

D. Noise in the data is automatically filtered during training, preventing memorization of noise

Question 3 (Multiple Choice)

At what model complexity does double descent occur?

A. When model capacity exceeds data size by a factor of 10 or more

B. When the interpolation threshold is reached (model capacity ≈ data size) and beyond

C. Only for neural networks; classical machine learning models do not exhibit double descent

D. When regularization is entirely removed from training

Question 4 (True / False)

How does the ratio of parameters to training samples relate to double descent in practice?

T. True

F. False
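
For readers who want to explore the parameter-to-sample ratio that the questions above refer to, the following is a minimal sketch rather than a definitive experiment. It assumes a toy setup that is not part of the quiz: Gaussian features, a linear ground truth spread over `d_total` features, and a model that sees only the first `p` of them and is fit by least squares (minimum-norm when `p` exceeds the number of training samples). The function name `sweep_test_error` and all of its parameters are hypothetical.

```python
import numpy as np


def sweep_test_error(n_train=40, d_total=200,
                     feature_counts=(5, 10, 20, 40, 80, 200),
                     noise=0.5, n_test=500, n_trials=20, seed=0):
    """Average test MSE of least-squares fits that use only the first p features.

    Toy assumptions (not from the quiz): Gaussian features, linear ground
    truth over d_total features, Gaussian label noise on the training set.
    """
    rng = np.random.default_rng(seed)
    errors = {p: 0.0 for p in feature_counts}
    for _ in range(n_trials):
        # Ground-truth weights with squared norm close to 1.
        w_true = rng.normal(size=d_total) / np.sqrt(d_total)
        X_train = rng.normal(size=(n_train, d_total))
        X_test = rng.normal(size=(n_test, d_total))
        y_train = X_train @ w_true + noise * rng.normal(size=n_train)
        y_test = X_test @ w_true  # noiseless targets for evaluation
        for p in feature_counts:
            # lstsq returns the ordinary least-squares solution when p < n_train
            # and the minimum-norm interpolating solution when p > n_train.
            w_hat, *_ = np.linalg.lstsq(X_train[:, :p], y_train, rcond=None)
            pred = X_test[:, :p] @ w_hat
            errors[p] += np.mean((pred - y_test) ** 2) / n_trials
    return errors


if __name__ == "__main__":
    n_train = 40
    for p, err in sweep_test_error(n_train=n_train).items():
        print(f"p/n = {p / n_train:5.2f}   test MSE = {err:.3f}")
```

Under these assumptions, the test error is expected to spike when the number of features is close to the number of training samples (p/n near 1) and to fall again as p/n grows beyond that point. Exact values depend on the seed, the noise level, and the other toy parameters.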