Questions — Naive Bayes Classifier

Question 1 Multiple Choice

A naive Bayes spam classifier computes unnormalized posterior scores for a particular email: 0.000003 for spam and 0.000001 for not spam. It therefore marks the email as spam. Normalized, these scores imply a spam probability of 0.75, well below the true value of 0.97, yet the classification is correct. Why?

A. Laplace smoothing corrected the probability estimates before classification

B. Classification only requires the correct class to have the highest score; even poorly calibrated probabilities can preserve the correct ordering

C. Working in log space normalizes the probabilities before the decision is made

D. The naive Bayes assumption ensures probability estimates are accurate enough for practical classification
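As a minimal sketch of the setup in Question 1, the snippet below treats the two numbers as unnormalized class scores and compares the decision made from the raw, normalized, and log-transformed values. The dictionary layout and the helper name `predict` are illustrative assumptions, not part of the question.

```python
import math

# Scores from the question, treated here as unnormalized class scores.
scores = {"spam": 0.000003, "not spam": 0.000001}

# Normalizing rescales the scores so that they sum to 1.
total = sum(scores.values())
normalized = {label: s / total for label, s in scores.items()}

# The logarithm is monotonic: it changes the values but not their order.
log_scores = {label: math.log(s) for label, s in scores.items()}


def predict(table):
    """Return the label with the largest value in the table."""
    return max(table, key=table.get)


print(normalized)
print(predict(scores), predict(normalized), predict(log_scores))
```

Running it lets you check whether the predicted label changes across the three representations.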

Question 2 Multiple Choice

A text classifier uses naive Bayes with a vocabulary of 50,000 words and 10 class labels. How many likelihood parameters must be estimated for the class-conditional distributions P(feature | class)?

A. 50,000: one probability per word regardless of class

B. 500,000: one probability per word-class combination

C. 50,000^10: the full joint distribution across all words for each class

D. 10: one class-conditional distribution treated as a single parameter

Question 3 True / False

Naive Bayes requires that its conditional independence assumption holds approximately in the data for it to achieve good classification accuracy.

T. True

F. False

Question 4 True / False

Without Laplace smoothing, a single word that appears in training data for class A but never for class B will cause naive Bayes to assign zero probability to class B for any document containing that word, regardless of all other evidence.

T. True

F. False
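The scenario in Question 4 can be explored with a toy sketch. The word counts, the three-word vocabulary, the equal priors, and the helper names `likelihood` and `score` are all hypothetical choices made for illustration. Setting `alpha` to 0 disables smoothing; setting it to 1 gives add-one (Laplace) smoothing.

```python
# Hypothetical word counts per class (toy numbers, not from the question).
counts = {
    "A": {"free": 3, "meeting": 1, "offer": 4},
    "B": {"free": 0, "meeting": 6, "offer": 2},
}
vocab = ["free", "meeting", "offer"]


def likelihood(word, label, alpha):
    """P(word | label) with add-alpha smoothing; alpha=0 means no smoothing."""
    total = sum(counts[label].values())
    return (counts[label][word] + alpha) / (total + alpha * len(vocab))


def score(doc, label, alpha, prior=0.5):
    """Unnormalized naive Bayes score: prior times the product of word likelihoods."""
    result = prior
    for word in doc:
        result *= likelihood(word, label, alpha)
    return result


# Mostly class-B vocabulary, plus one word never seen with class B.
doc = ["meeting", "meeting", "meeting", "free"]
for alpha in (0.0, 1.0):
    print(alpha, {label: score(doc, label, alpha) for label in counts})
```

Comparing the two printed rows shows how the single unseen word affects the class B score with and without smoothing.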

Question 5 Short Answer

Explain why naive Bayes is described as a 'good classifier but bad estimator.' What does this mean, and why does the independence assumption's violation not necessarily impair classification performance?

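One way to explore Question 5 is a toy sketch in which a single informative feature is duplicated several times, so the copies are perfectly correlated and the independence assumption is violated as badly as possible. The likelihoods 0.8 and 0.4, the equal priors, and the function name `naive_bayes_posterior` are assumptions chosen for illustration.

```python
def naive_bayes_posterior(p_spam, p_ham, copies, prior_spam=0.5):
    """Posterior for spam when the same feature is counted `copies` times."""
    spam = prior_spam * p_spam ** copies
    ham = (1 - prior_spam) * p_ham ** copies
    return spam / (spam + ham)


# Assumed likelihoods of observing the feature in each class.
p_spam, p_ham = 0.8, 0.4

# With one copy there is nothing to double-count, so this is the
# correct posterior for the observed feature.
true_posterior = naive_bayes_posterior(p_spam, p_ham, copies=1)

for copies in (1, 2, 5, 10):
    estimate = naive_bayes_posterior(p_spam, p_ham, copies)
    label = "spam" if estimate > 0.5 else "not spam"
    print(copies, round(estimate, 4), label)
```

Each row pairs the estimated posterior with the resulting decision, so the drift of the estimate away from `true_posterior` can be compared against what happens to the predicted label.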
