A doctor uses a Bayesian model to estimate a patient's blood-pressure reduction from a new drug. The posterior is right-skewed. Underestimating the reduction is much more costly than overestimating it. Which Bayesian point estimator should she prefer?
A. The posterior mode (MAP), because it gives the single most probable value
B. The posterior mean, because minimizing squared error is always the correct clinical objective
C. A quantile above the median (say, the 75th percentile), to reduce the risk of underestimating the effect
D. The posterior median, because it is always robust to skew
When error costs are asymmetric, none of the standard estimators (mean, median, MAP) is automatically correct. The 'right' Bayesian estimator minimizes expected loss under the posterior. If underestimating is much costlier than overestimating, the optimal choice shifts toward higher values: a posterior quantile at a level above 0.5. Under a piecewise-linear loss where each unit of underestimation costs c_u and each unit of overestimation costs c_o, the minimizer is the posterior quantile at level c_u/(c_u + c_o), so a 3:1 cost ratio gives exactly the 75th percentile. The key insight is that the optimal Bayesian point estimate is always defined relative to an explicit loss function, and different cost structures produce different optimal estimators.
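The effect of asymmetric costs can be checked numerically. The sketch below is illustrative only: the gamma-shaped posterior, the 3:1 cost ratio, and the helper names `expected_loss` and `empirical_quantile` are all assumptions made for the example, not part of the question.

```python
import random

def expected_loss(estimate, samples, cost_under, cost_over):
    """Average asymmetric linear loss of `estimate` over posterior samples."""
    total = 0.0
    for theta in samples:
        if estimate < theta:
            # The estimate is below the true reduction: underestimation.
            total += cost_under * (theta - estimate)
        else:
            # The estimate is at or above the true reduction: overestimation.
            total += cost_over * (estimate - theta)
    return total / len(samples)

def empirical_quantile(samples, q):
    """Simple order-statistic quantile of a list of samples."""
    ordered = sorted(samples)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]

rng = random.Random(0)
# Hypothetical right-skewed posterior for the reduction (shape 2, scale 5).
samples = [rng.gammavariate(2.0, 5.0) for _ in range(20000)]

# Assumed costs: underestimating is three times as costly as overestimating.
cost_under, cost_over = 3.0, 1.0
optimal_q = cost_under / (cost_under + cost_over)  # 0.75

mean = sum(samples) / len(samples)
median = empirical_quantile(samples, 0.5)
q75 = empirical_quantile(samples, optimal_q)

loss_mean = expected_loss(mean, samples, cost_under, cost_over)
loss_median = expected_loss(median, samples, cost_under, cost_over)
loss_q75 = expected_loss(q75, samples, cost_under, cost_over)

print("mean:", round(mean, 2), "loss:", round(loss_mean, 3))
print("median:", round(median, 2), "loss:", round(loss_median, 3))
print("75th percentile:", round(q75, 2), "loss:", round(loss_q75, 3))
```

With these assumed costs the 75th percentile should show the lowest average loss of the three candidates; a different cost ratio would move the optimal quantile level accordingly.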
Question 2 Multiple Choice
For a binomial proportion p with a Beta(2, 2) prior and 3 successes in 10 trials, the posterior mean and the MAP estimate differ. What does this reveal?
A. The MAP is always closer to the data proportion than the posterior mean is
B. They optimize different loss functions: the posterior mean minimizes expected squared error; the MAP maximizes the posterior density (optimal under 0-1 loss)
C. The posterior mean always equals the MAP for Beta-Binomial conjugate models
D. The difference is a numerical artifact with no interpretive significance
The posterior is Beta(2+3, 2+7) = Beta(5, 9). Posterior mean = (2+3)/(2+2+10) = 5/14 ≈ 0.357; MAP = (2+3−1)/(2+2+10−2) = 4/12 ≈ 0.333. They differ because they optimize different objectives: the posterior mean minimizes the expected squared error E[(p̂−p)² | data], while the MAP maximizes the posterior density p(p | data), which for a continuous parameter is the limiting case of minimizing 0-1 loss. Both are valid Bayesian estimators for different cost structures, so the difference is meaningful, not accidental.
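The arithmetic above can be reproduced exactly with rational numbers. This is a minimal sketch; the function names are hypothetical, and the mode formula assumes both posterior parameters exceed 1.

```python
from fractions import Fraction

def beta_binomial_posterior(alpha, beta, successes, trials):
    """Conjugate update: Beta(alpha, beta) prior plus binomial data."""
    return alpha + successes, beta + (trials - successes)

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return Fraction(a, a + b)

def beta_mode(a, b):
    """Mode of a Beta(a, b) distribution; valid only when a > 1 and b > 1."""
    if a <= 1 or b <= 1:
        raise ValueError("interior mode requires a > 1 and b > 1")
    return Fraction(a - 1, a + b - 2)

# Beta(2, 2) prior, 3 successes in 10 trials.
a_post, b_post = beta_binomial_posterior(2, 2, 3, 10)
posterior_mean = beta_mean(a_post, b_post)
posterior_map = beta_mode(a_post, b_post)

print("posterior: Beta(%d, %d)" % (a_post, b_post))
print("mean:", posterior_mean, "=", float(posterior_mean))
print("MAP: ", posterior_map, "=", float(posterior_map))
```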
Question 3 True / False
The MAP (Maximum A Posteriori) estimate is the universally recommended Bayesian point estimator because it selects the single most probable parameter value.
T. True
F. False
Answer: False
MAP is optimal only under 0-1 loss, where any deviation from the exact true value incurs the same penalty. For squared error loss, the posterior mean is optimal; for absolute error loss, the posterior median is optimal. MAP is often used for computational convenience in high-dimensional settings, but calling it universally recommended confuses one specific loss function for all of them. The choice of estimator must be driven by the actual cost structure of the problem.
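The link between loss function and estimator can be illustrated by brute force. The sketch below discretizes a Beta(5, 9) posterior (chosen here only as an example) on a grid and, for each loss, picks the grid point with the smallest expected loss. The grid size and all names are assumptions for illustration; on a discrete grid, 0-1 loss simply selects the most probable grid point.

```python
def beta_density_unnormalized(theta, a, b):
    """Beta(a, b) density up to a normalizing constant."""
    return theta ** (a - 1) * (1 - theta) ** (b - 1)

n_points = 500
grid = [(i + 0.5) / n_points for i in range(n_points)]  # cell midpoints in (0, 1)
weights = [beta_density_unnormalized(t, 5, 9) for t in grid]
total = sum(weights)
probs = [w / total for w in weights]  # discretized posterior probabilities

def bayes_estimate(loss):
    """Grid point minimizing the posterior expected value of `loss`."""
    def risk(estimate):
        return sum(p * loss(estimate, t) for p, t in zip(probs, grid))
    return min(grid, key=risk)

squared = lambda est, t: (est - t) ** 2
absolute = lambda est, t: abs(est - t)
zero_one = lambda est, t: 0.0 if est == t else 1.0

mean_like = bayes_estimate(squared)     # should land near the posterior mean
median_like = bayes_estimate(absolute)  # should land near the posterior median
mode_like = bayes_estimate(zero_one)    # should land near the posterior mode

print("squared error  ->", mean_like)
print("absolute error ->", median_like)
print("0-1 loss       ->", mode_like)
```

For this right-skewed posterior the three minimizers should come out in the order mode, median, mean, each within one grid step of the analytic value.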
Question 4 True / False
As the number of observations grows large, the Bayesian posterior mean converges toward the frequentist maximum likelihood estimate, and the influence of the prior diminishes.
T. True
F. False
Answer: True
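For a Beta(α, β) prior with n observations and k successes, the posterior mean is (α+k)/(α+β+n). As n → ∞ with k/n → p̂ (the data proportion), the posterior mean → p̂, the MLE. The prior contributes α+β pseudo-observations that become negligible as real data accumulates. This asymptotic agreement is a general property under standard regularity conditions (the Bernstein-von Mises theorem): with sufficient data, Bayesian and frequentist estimates agree, provided the prior assigns positive density around the true parameter value. A prior that puts zero probability on the true value cannot be overcome by any amount of data.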
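The shrinking influence of the prior is easy to tabulate. In this sketch the observed proportion is held at 0.3 while n grows, and two priors are compared; the specific priors, sample sizes, and names are assumptions chosen for illustration.

```python
def posterior_mean(alpha, beta, successes, trials):
    """Posterior mean of p under a Beta(alpha, beta) prior."""
    return (alpha + successes) / (alpha + beta + trials)

# Hypothetical priors: one mild, one strongly favoring large p.
priors = {"Beta(2, 2)": (2, 2), "Beta(20, 5)": (20, 5)}
sample_sizes = (10, 100, 1000, 10000)

gaps = {name: [] for name in priors}
for n in sample_sizes:
    k = 3 * n // 10   # keep the observed proportion fixed at 0.3
    mle = k / n
    for name, (alpha, beta) in priors.items():
        gap = abs(posterior_mean(alpha, beta, k, n) - mle)
        gaps[name].append(gap)

# Distance between posterior mean and MLE at each sample size.
for name, values in gaps.items():
    print(name, [round(v, 4) for v in values])
```

The stronger prior starts much further from the MLE, but the gap for both priors falls roughly in proportion to 1/n.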
Question 5 Short Answer
Why does Bayesian point estimation require specifying a loss function, while frequentist maximum likelihood estimation does not? What does this reveal about each approach?
Model answer: MLE fixes its criterion in advance: it maximizes the likelihood, which coincides with the MAP estimate under a flat prior and so corresponds to the limiting case of 0-1 loss. No separate specification is needed because the criterion is baked in. Bayesian estimation starts from the full posterior distribution and must then choose how to collapse it to a single number. Different collapses optimize different objectives, so the choice must be made explicit. Making the loss function explicit is a strength: it forces the analyst to ask 'what kinds of errors matter here?' before reporting a number, rather than silently assuming all errors are equally costly.
Any point estimator encodes a preference about errors. Bayesian estimation makes this explicit through the loss function. This is especially valuable in decision-making contexts — clinical, legal, engineering — where the costs of overestimating and underestimating are genuinely different and should influence the choice of estimate.
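One concrete way to see the criterion hidden inside MLE is that, for a binomial proportion, the MAP estimate under a uniform Beta(1, 1) prior equals the MLE k/n. The sketch below checks this for 3 successes in 10 trials; the function name is hypothetical and the mode formula assumes an interior mode.

```python
from fractions import Fraction

def map_estimate(alpha, beta, successes, trials):
    """Posterior mode of p under a Beta(alpha, beta) prior and binomial data."""
    a = alpha + successes
    b = beta + trials - successes
    if a <= 1 or b <= 1:
        raise ValueError("interior mode requires both posterior parameters > 1")
    return Fraction(a - 1, a + b - 2)

successes, trials = 3, 10
mle = Fraction(successes, trials)

flat_prior_map = map_estimate(1, 1, successes, trials)   # uniform prior
informative_map = map_estimate(2, 2, successes, trials)  # Beta(2, 2) prior

print("MLE:", mle)
print("MAP, flat prior:", flat_prior_map)
print("MAP, Beta(2, 2) prior:", informative_map)
```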