Questions — Cross-Validation and Out-of-Sample Model Evaluation

Question 1 Multiple Choice

A researcher compares two models: Model A has 3 predictors and in-sample R² = 0.73. Model B has 25 predictors and R² = 0.91. Model B's 10-fold cross-validation error is 60% higher than Model A's. For a forecasting application, which model should they choose?

A. Model B: higher R² always indicates a better-fitting, more accurate model

B. Model A: lower CV error means it generalizes better to new data

C. Model B: more predictors capture more real variation in the data

D. It depends on which individual coefficients are statistically significant

Question 2 Multiple Choice

What specific problem does cross-validation detect that in-sample R² cannot?

A. Whether the model's coefficients are statistically significant at the 5% level

B. Whether the model has overfitted, fitting noise specific to the sample rather than the true underlying pattern

C. Whether omitted variable bias is affecting the coefficient estimates

D. Whether the error terms satisfy the Gauss-Markov homoskedasticity assumption

Question 3 True / False

Adding more predictor variables to a regression usually improves out-of-sample predictive performance because additional variables cannot reduce the model's explanatory power.

T. True

F. False

Question 4 True / False

A model selected by minimizing cross-validation error will typically outperform a model selected by maximizing in-sample R² when making predictions on new data.

T. True

F. False

Question 5 Short Answer

Explain why in-sample R² is a misleading measure of a model's predictive quality, and what cross-validation reveals instead.
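
One way to check your reasoning empirically is a small simulation. The sketch below is not part of the original question set: it assumes synthetic data in which only 3 of 25 candidate predictors affect the outcome, and it uses hypothetical helper names (fit_ols, predict, kfold_mse) written with NumPy only. It computes in-sample R² and 10-fold cross-validation error for a 3-predictor model and a 25-predictor model so the two measures can be compared side by side.

```python
import numpy as np


def r_squared(y, y_hat):
    """In-sample R²: 1 minus residual sum of squares over total sum of squares."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_ols(X, y):
    """Least-squares coefficients for a linear model with an intercept."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta


def predict(X, beta):
    """Predictions from coefficients returned by fit_ols."""
    return np.column_stack([np.ones(len(X)), X]) @ beta


def kfold_mse(X, y, k=10, seed=0):
    """Mean squared prediction error averaged over k held-out folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    errors = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)          # every row not in this fold
        beta = fit_ols(X[train], y[train])       # fit on training rows only
        errors.append(np.mean((y[fold] - predict(X[fold], beta)) ** 2))
    return float(np.mean(errors))


# Synthetic data (an assumption for illustration): 60 observations,
# 25 candidate predictors, only the first 3 enter the true model.
rng = np.random.default_rng(42)
n = 60
X_all = rng.normal(size=(n, 25))
y = X_all[:, :3] @ np.array([1.0, -0.5, 0.8]) + rng.normal(scale=1.0, size=n)

X_small, X_large = X_all[:, :3], X_all

r2_small = r_squared(y, predict(X_small, fit_ols(X_small, y)))
r2_large = r_squared(y, predict(X_large, fit_ols(X_large, y)))
cv_small = kfold_mse(X_small, y)
cv_large = kfold_mse(X_large, y)

print(f"3 predictors:  in-sample R2 = {r2_small:.3f}, 10-fold CV MSE = {cv_small:.3f}")
print(f"25 predictors: in-sample R2 = {r2_large:.3f}, 10-fold CV MSE = {cv_large:.3f}")
```

Because the 3-predictor model is nested inside the 25-predictor model, least squares cannot give the larger model a lower in-sample R². The cross-validation error carries no such guarantee, and in a setup like this one the larger model will typically show the higher held-out error. Try changing n or the noise scale and see how the gap between the two measures moves.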

