What is the essential ingredient that distinguishes supervised learning from unsupervised learning?
A. A neural network architecture
B. Labeled training examples pairing inputs with correct outputs
C. A very large dataset
D. An optimization algorithm like gradient descent
Answer: B (labeled training examples)
Supervised learning is defined by the use of labeled data: each training example provides both the input and the correct output (label). Unsupervised learning, by contrast, finds structure in data without any labels. Neural networks and gradient descent are implementation choices that can appear in both paradigms, and dataset size does not determine which paradigm is in use.
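The contrast can be sketched in a few lines of plain Python. This is a toy illustration, not a reference implementation: the data, the "low"/"high" labels, and the helper names `fit_class_means`, `predict`, and `two_means` are all made up for the example. The supervised half uses the labels to compute one mean per class; the unsupervised half sees only the inputs and groups them with a simple one-dimensional 2-means loop.

```python
# Toy 1-D inputs (hypothetical data, chosen to form two obvious groups).
inputs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
# Labels exist only in the supervised setting (hypothetical class names).
labels = ["low", "low", "low", "high", "high", "high"]


def fit_class_means(xs, ys):
    """Supervised: use the labels to compute the mean input of each class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(class_means, x):
    """Return the label whose class mean is closest to x."""
    return min(class_means, key=lambda y: abs(x - class_means[y]))


def two_means(xs, iterations=10):
    """Unsupervised: group unlabeled inputs around two centers (1-D k-means, k=2)."""
    centers = [min(xs), max(xs)]  # simple deterministic initialization
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Keep the old center if a group happens to be empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


class_means = fit_class_means(inputs, labels)  # needs labels
centers = two_means(inputs)                    # needs inputs only
```

Both halves end up with centers near 1 and 5, but only the supervised model can attach a name such as "low" or "high" to a new input, because only it was given the correct outputs.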
Question 2 True / False
A model that achieves near-zero error on its training data is expected to perform well on new, unseen data.
T. True
F. False
Answer: False
This is the core overfitting problem. A model can memorize the training set, including its noise, while failing to capture the true underlying pattern. Near-perfect training performance says little on its own and often signals high variance (overfitting) rather than a good model. Good generalization requires balancing bias and variance, and it is measured on a held-out validation or test set, not on the training data.
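A small hand-built example makes the gap between training and held-out error concrete. Everything here is an assumption for illustration: the true pattern is taken to be y = 2x, the training labels carry hand-picked noise, and the "memorizer" is a 1-nearest-neighbor lookup standing in for an overfit model. A least-squares line is fitted to the same data for comparison.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length lists."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Assumed true pattern: y = 2x. Training labels include hand-picked noise.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.5, 1.5, 4.6, 5.4, 8.5]
# Held-out points the models never saw, labeled with the true pattern.
test_x = [0.4, 1.4, 2.4, 3.4]
test_y = [2.0 * x for x in test_x]


def memorizer_predict(x):
    """1-nearest-neighbor: return the stored label of the closest training input."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(train_x, train_y)


def line_predict(x):
    return slope * x + intercept


memorizer_train_mse = mse(train_y, [memorizer_predict(x) for x in train_x])
memorizer_test_mse = mse(test_y, [memorizer_predict(x) for x in test_x])
line_train_mse = mse(train_y, [line_predict(x) for x in train_x])
line_test_mse = mse(test_y, [line_predict(x) for x in test_x])
```

The memorizer reproduces every training label, noise included, so its training error is zero, yet its held-out error is larger than that of the line, which tolerates some training error in exchange for tracking the underlying trend.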
Question 3 Short Answer
What is the role of a loss function in supervised learning?
Think about your answer, then reveal below.
Model answer: A loss function quantifies the difference between the model's predictions and the true labels on training examples, providing a scalar signal that the learning algorithm minimizes to improve the model.
Without a loss function, there is no way to measure how wrong the model is or in which direction to update its parameters. The choice of loss function shapes what the model optimizes: mean squared error, common in regression, squares each error and so penalizes large errors disproportionately, while cross-entropy is suited to classification, where the model outputs class probabilities. The loss function connects the abstract goal of 'learning' to a concrete mathematical objective.
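The two losses named above can be written out directly. This is a minimal sketch in plain Python, assuming binary 0/1 labels for the cross-entropy case; the function names and the clipping constant `eps` are choices made for this example, not part of any particular library.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average squared difference; squaring makes large errors dominate."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Average negative log-likelihood of 0/1 labels under predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so log(0) never occurs
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


# Regression: two errors of size 1 versus a single error of size 4.
mse_small = mean_squared_error([0.0, 0.0], [1.0, 1.0])
mse_large = mean_squared_error([0.0, 0.0], [0.0, 4.0])

# Classification: confident and right versus confident and wrong.
bce_confident_right = binary_cross_entropy([1, 0], [0.9, 0.1])
bce_confident_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])
```

In both cases the result is a single number that a learning algorithm can try to reduce. The single large regression error yields a higher mean squared error than the two small ones, and cross-entropy is much higher when the model assigns high probability to the wrong class.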