5 questions to test your understanding
The fundamental theorem says finite VC dimension is equivalent to PAC learnability. A colleague points out that support vector machines learn well in infinite-dimensional feature spaces (via kernels). Does this contradict the theorem?
Which of the following is NOT equivalent to finite VC dimension according to the fundamental theorem?
True or false: The fundamental theorem of statistical learning applies to multi-class classification and regression problems, not just binary classification.
True or false: If a hypothesis class has finite VC dimension, the ERM algorithm (choosing the hypothesis with the lowest training error) is guaranteed to be a successful PAC learner.
Explain why the equivalence between uniform convergence and learnability is the conceptual core of the fundamental theorem.