A dataset contains two classes arranged as concentric circles — class A is the inner circle, class B is the outer ring. A logistic regression model is trained on this data. What will happen?
A. Logistic regression will find the correct circular boundary because it optimizes over all possible boundaries
B. Logistic regression will perfectly separate the classes once given enough training data
C. Logistic regression will fail to perfectly classify the data no matter how much training data is provided, because its boundary is constrained to be a straight line
D. Logistic regression will overfit and produce a jagged circular boundary around each class
Answer: C
Logistic regression on the raw features is a linear classifier: its decision boundary is always a hyperplane (a straight line in 2D). Concentric circles are not linearly separable; no single straight line can divide the inner circle from the outer ring. Adding more training data does not help, because the problem is the model's inductive bias, not insufficient data. Separating the classes requires a nonlinear model (kernel SVM, neural network, or even k-NN) or nonlinear features such as the squared distance from the center. This illustrates why understanding decision boundary shapes reveals a model's fundamental limitations.
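The point about inductive bias can be checked with a small experiment. The sketch below is only an illustration under stated assumptions: the toy data generator `make_circles`, the radii, the noise level, and the plain gradient-descent trainer `fit_logistic` are all hypothetical choices, not part of the original question. It fits logistic regression once on the raw coordinates and once with the squared radius added as a third feature.

```python
import math
import random

def make_circles(n_per_class, rng, inner_r=1.0, outer_r=3.0, noise=0.1):
    """Hypothetical toy data: class 0 = inner circle, class 1 = outer ring."""
    points, labels = [], []
    for label, radius in ((0, inner_r), (1, outer_r)):
        for _ in range(n_per_class):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            r = radius + rng.gauss(0.0, noise)
            points.append((r * math.cos(angle), r * math.sin(angle)))
            labels.append(label)
    return points, labels

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, epochs=1000, lr=0.1):
    """Batch gradient descent on the mean log loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def accuracy(X, y, w, b):
    """Fraction of points on the correct side of the hyperplane w.x + b = 0."""
    preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0 for xi in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)

rng = random.Random(0)
X_raw, y = make_circles(100, rng)

# Raw (x1, x2): the decision boundary is a straight line in the plane.
w, b = fit_logistic(X_raw, y)
acc_raw = accuracy(X_raw, y, w, b)

# Add the squared radius as a third feature (an assumed feature choice):
# the boundary is still linear in the new space, but circular in the plane.
X_aug = [(x1, x2, x1 * x1 + x2 * x2) for x1, x2 in X_raw]
w, b = fit_logistic(X_aug, y)
acc_aug = accuracy(X_aug, y, w, b)

print(f"raw features:     {acc_raw:.2f}")
print(f"with x1^2 + x2^2: {acc_aug:.2f}")
```

With raw coordinates the score is capped by what the best single straight line can achieve, regardless of how many points are generated. Once the squared radius is available, the same linear model can place its threshold between the two radii.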
Question 2 Multiple Choice
A k-nearest-neighbors classifier trained on a small, noisy dataset produces a highly irregular decision boundary with many small islands around individual points. A second model on the same data produces a single smooth curved boundary. Which model is more likely to generalize better to new data, and why?
A. The irregular k-NN boundary, because it captures all the structure in the training data
B. The smooth boundary, because complex boundaries tend to overfit noise rather than capture true class structure
C. They will generalize equally well, because both models saw the same training data
D. The irregular k-NN boundary, because more complex boundaries always reflect more information
Answer: B
A decision boundary that contorts to accommodate every training point is overfitting: it fits noise as if it were signal. On new data, those noise-driven islands and jagged edges will misclassify points that belong to the surrounding majority-class region. The smooth boundary reflects a stronger inductive bias toward simpler structure, which tends to generalize better unless the true class boundary is genuinely complex. Visualizing decision boundaries makes this bias-variance tradeoff tangible.
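The overfitting argument can be made concrete by comparing training and held-out accuracy for a small and a larger k. The sketch below is a minimal illustration, assuming a hypothetical data generator (`make_noisy_blobs`, two Gaussian blobs with 20% of labels flipped) and a hand-written k-NN; none of these names or settings come from the question itself.

```python
import random

def make_noisy_blobs(n_per_class, rng, flip_prob=0.2):
    """Hypothetical data: two Gaussian blobs with a fraction of labels flipped."""
    data = []
    for label, center in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            noisy_label = 1 - label if rng.random() < flip_prob else label
            data.append((point, noisy_label))
    return data

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def knn_predict(train, query, k):
    """Majority vote among the k nearest training points (k is assumed odd)."""
    nearest = sorted(train, key=lambda item: sq_dist(item[0], query))[:k]
    votes_for_one = sum(label for _, label in nearest)
    return 1 if 2 * votes_for_one > k else 0

def accuracy(train, data, k):
    return sum(knn_predict(train, p, k) == y for p, y in data) / len(data)

rng = random.Random(0)
train = make_noisy_blobs(100, rng)   # small, noisy training set
test = make_noisy_blobs(500, rng)    # held-out data from the same distribution

# results[k] = (training accuracy, held-out accuracy)
results = {}
for k in (1, 15):
    results[k] = (accuracy(train, train, k), accuracy(train, test, k))
    print(f"k={k:2d}  train={results[k][0]:.2f}  test={results[k][1]:.2f}")
```

With k = 1 every training point is its own nearest neighbor, so the training score is perfect by construction, flipped labels included. That perfect score says nothing about new data; the held-out column is the one that reflects generalization.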
Question 3 True / False
A linear classifier will misclassify some points in a non-linearly separable dataset no matter how long it is trained.
T. True
F. False
Answer: True
This is a fundamental property of linear classifiers, not a training failure. If the classes are not linearly separable (they overlap or are interleaved in ways no hyperplane can separate), the linear model cannot represent the correct partition. Training longer refines the placement of the line, but the line is still a line; it cannot curve to match a circular or spiral boundary. Adding more data also doesn't help: more examples of a problem the model cannot represent just confirm the limitation. This is the model's inductive bias at work.
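For the concentric-circle case, the impossibility can be sketched in a few lines. The notation here is an assumption introduced for illustration: the linear score is f(x) = wᵀx + b, the outer ring is idealized as the full circle of radius R, and the outer class is taken to lie on the positive side (otherwise flip the sign of w and b).

```latex
\begin{align*}
&\text{Assume } f(x) = w^\top x + b > 0 \text{ for every point } x \text{ on the outer ring } \|x\| = R. \\
&\text{Any inner point } z \text{ with } \|z\| < R \text{ lies on a chord of the ring: } \\
&\qquad z = \lambda u + (1 - \lambda) v, \qquad \|u\| = \|v\| = R, \quad 0 \le \lambda \le 1. \\
&\text{Because } f \text{ is affine, } \\
&\qquad f(z) = \lambda \left( w^\top u + b \right) + (1 - \lambda) \left( w^\top v + b \right)
        = \lambda f(u) + (1 - \lambda) f(v) > 0.
\end{align*}
```

So any hyperplane that puts the whole outer ring on one side necessarily puts every inner point on that same side. No amount of training changes this, because the argument never refers to how w and b were found.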
Question 4 True / False
A more complex decision boundary usually leads to better classification performance because it can capture more patterns in the data.
T. True
F. False
Answer: False
More complex boundaries can fit training data better, but they tend to overfit — capturing noise and idiosyncrasies of the training set that do not generalize to new data. A model with a very complex boundary may achieve near-perfect training accuracy while performing poorly on a held-out test set. The optimal boundary is the simplest one that correctly represents the true structure of the problem, not the most complex one that fits every training point. This is the bias-variance tradeoff: high-complexity models reduce bias but increase variance.
Question 5 Short Answer
What does the shape of a classifier's decision boundary reveal about the model, and why is this geometrically useful for understanding classification?
Model answer: The decision boundary's shape reveals the model's inductive bias — its built-in assumptions about the structure of the classification problem. A linear boundary assumes the classes are linearly separable; a staircase boundary (decision tree) assumes class regions align with feature axes; a smooth curved boundary (kernel SVM, neural network) assumes classes can be separated by smooth nonlinear surfaces. Visualizing the boundary in 2D makes the tradeoff between underfitting (too simple a boundary, can't capture class structure) and overfitting (too complex a boundary, memorizes noise) concrete and interpretable. It allows you to diagnose whether a model's failures are due to fundamental representational limits or excess flexibility.
The boundary is a geometric signature of the model. Examining where the boundary falls relative to the data reveals both what the model learned and what it cannot learn. A straight line that bisects a spiral dataset shows you immediately that the model is underfitting. A boundary that carves tiny islands around individual points shows overfitting. This geometric intuition extends to higher dimensions where direct visualization is impossible but the same underlying principles apply.
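One low-tech way to look at a boundary in 2D is to evaluate the classifier on a regular grid and print one character per cell. The sketch below is an illustration only: `render_regions`, the hypothetical rule `linear_rule`, and the small noisy sample used for the 1-nearest-neighbor classifier `one_nn` are all assumptions made for the example.

```python
import random

def render_regions(predict, x_range=(-3.0, 3.0), y_range=(-3.0, 3.0), cols=40, rows=20):
    """Evaluate `predict` on a regular grid; return one string per grid row.

    '#' marks cells predicted as class 1, '.' marks class 0.
    The first string corresponds to the largest y value.
    """
    lines = []
    for r in range(rows):
        y = y_range[1] - (y_range[1] - y_range[0]) * r / (rows - 1)
        row = ""
        for c in range(cols):
            x = x_range[0] + (x_range[1] - x_range[0]) * c / (cols - 1)
            row += "#" if predict((x, y)) == 1 else "."
        lines.append(row)
    return lines

def linear_rule(p):
    """Hypothetical linear classifier: class 1 where x + y > 0."""
    return 1 if p[0] + p[1] > 0 else 0

# Small noisy sample: labels follow linear_rule, with about 20% flipped.
rng = random.Random(1)
sample = []
for _ in range(30):
    point = (rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0))
    label = linear_rule(point)
    if rng.random() < 0.2:
        label = 1 - label
    sample.append((point, label))

def one_nn(p):
    """1-nearest-neighbor prediction from the noisy sample."""
    nearest = min(sample, key=lambda item: (item[0][0] - p[0]) ** 2 + (item[0][1] - p[1]) ** 2)
    return nearest[1]

linear_map = render_regions(linear_rule)
one_nn_map = render_regions(one_nn)

print("\n".join(linear_map))
print()
print("\n".join(one_nn_map))
```

The first map shows a single straight diagonal edge between the two regions. In the second, each flipped label claims the cells closest to it, which is the kind of island the overfitting discussion refers to.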