A statistician computes a 95% frequentist confidence interval [0.3, 0.7] for a coin's bias θ. A colleague says: 'There's a 95% chance the true bias is between 0.3 and 0.7.' Is this correct?
A. Yes, a 95% CI always means a 95% probability that the parameter is in the interval
B. No, the interval either contains the true θ or it doesn't; the 95% describes the procedure's long-run coverage across hypothetical repetitions, not a probability about this specific interval
C. Yes, as long as the sample size was large, the CI approximates a Bayesian credible interval and the interpretation holds
D. No, the colleague should have said '95% probability of observing data consistent with the interval'
Once you observe a specific interval [0.3, 0.7], the true θ is a fixed constant: it either is or isn't in that interval. The 95% is a property of the *procedure*: if you repeated the experiment many times and computed a CI each time, about 95% of those intervals would contain the true θ. A Bayesian credible interval *can* support the probability statement 'P(θ∈[0.3,0.7]|data) = 0.95', but only because it incorporates a prior and treats θ as a random variable. The frequentist interval cannot make probability statements about a fixed (non-random) θ.
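The long-run reading of the 95% can be checked with a small simulation. The sketch below is illustrative only: the true bias (0.5), the number of flips (100), and the use of the normal-approximation (Wald) interval are all assumptions, not details from the question. Because the Wald interval is itself an approximation, the observed coverage should land close to 95%, not exactly on it.

```python
import math
import random

def wald_interval(heads, n, z=1.96):
    """Approximate 95% confidence interval for a coin's bias (normal approximation)."""
    p_hat = heads / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def coverage(true_theta, n_flips, n_experiments, seed=0):
    """Fraction of repeated experiments whose interval contains the fixed true_theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # One hypothetical repetition: flip the coin n_flips times.
        heads = sum(rng.random() < true_theta for _ in range(n_flips))
        lo, hi = wald_interval(heads, n_flips)
        # theta never changes; only the interval varies between repetitions.
        hits += lo <= true_theta <= hi
    return hits / n_experiments

rate = coverage(true_theta=0.5, n_flips=100, n_experiments=5000)
print(f"Share of intervals containing the true theta: {rate:.3f}")
```

Each individual interval either contains 0.5 or it does not; the roughly 95% figure only appears when counting across repetitions.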
Question 2 Multiple Choice
A researcher has a strong prior belief that a drug effect size θ is near 0.2, encoded as a tight prior. After seeing data that strongly suggests θ ≈ 0.8, what will the posterior look like?
A. The posterior will be centered at 0.2, because a strongly held prior anchors the estimate regardless of data
B. The posterior will be centered near the likelihood's peak (≈0.8), since sufficient data overwhelms even an informative prior
C. The posterior will be bimodal, split between 0.2 and 0.8 to incorporate both signals equally
D. The posterior will equal the prior, because the likelihood cannot update a strongly informative prior
Posterior ∝ likelihood × prior. With sufficiently strong data (a sharply peaked likelihood), the likelihood overwhelms even an informative prior. The posterior shifts toward the data: not exactly to 0.8, because the prior still pulls, but substantially away from 0.2. This reflects a general property of Bayesian updating: with enough data, posteriors from different priors converge toward the same answer. Option A is the common misconception. A strong prior provides more resistance, but it is not immovable; its influence shrinks as the data get stronger.
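To make the pull between prior and data concrete, here is a minimal sketch using the conjugate normal-normal update. The question does not specify a model, so the normal prior, the normal likelihood with known standard error, and every number below (prior sd 0.05, standard errors 0.01 and 0.20) are assumptions chosen for illustration.

```python
def normal_posterior(prior_mean, prior_sd, data_mean, data_se):
    """Conjugate update: normal prior combined with a normal likelihood of known standard error."""
    prior_precision = 1 / prior_sd ** 2
    data_precision = 1 / data_se ** 2
    post_precision = prior_precision + data_precision
    # The posterior mean is a precision-weighted average of prior mean and data mean.
    post_mean = (prior_mean * prior_precision + data_mean * data_precision) / post_precision
    return post_mean, post_precision ** -0.5

# Tight prior near 0.2; data pointing at 0.8 with a very small standard error.
strong_mean, strong_sd = normal_posterior(0.2, 0.05, 0.8, 0.01)

# Same prior, but much noisier data: the prior now dominates.
weak_mean, weak_sd = normal_posterior(0.2, 0.05, 0.8, 0.20)

print(f"Strong data: posterior mean {strong_mean:.3f}")
print(f"Weak data:   posterior mean {weak_mean:.3f}")
```

With precise data the posterior mean sits close to 0.8 but slightly below it; with noisy data it stays near the prior. In neither case is the result bimodal or identical to the prior.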
Question 3 True / False
A 95% Bayesian credible interval [a, b] means that, given the observed data and the prior, there is a 95% probability that the true parameter θ lies between a and b.
T. True
F. False
Answer: True
This is exactly what a Bayesian credible interval means, and it's the interpretation most people intuitively want from an interval estimate. The interval is derived from the posterior distribution P(θ|data), and the 95% is a direct probability statement about θ's location given the data and the prior. This contrasts with the frequentist confidence interval, where the 95% describes a long-run property of the estimation procedure, not a probability about θ for any particular interval.
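As a rough illustration of how such an interval comes straight from the posterior, the sketch below draws samples from a Beta posterior and reads off the 2.5% and 97.5% quantiles. The uniform Beta(1, 1) prior and the data (50 heads in 100 flips) are hypothetical, and Monte Carlo sampling is just one convenient way to approximate the quantiles.

```python
import random

def credible_interval(a, b, level=0.95, n_draws=100_000, seed=0):
    """Equal-tailed credible interval for a Beta(a, b) posterior, approximated by sampling."""
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(a, b) for _ in range(n_draws))
    tail = (1 - level) / 2
    lo = draws[int(tail * n_draws)]
    hi = draws[int((1 - tail) * n_draws) - 1]
    return lo, hi

# Hypothetical example: uniform Beta(1, 1) prior, then 50 heads and 50 tails.
post_a, post_b = 1 + 50, 1 + 50
lo, hi = credible_interval(post_a, post_b)
print(f"95% credible interval for theta: [{lo:.3f}, {hi:.3f}]")
```

By construction, about 95% of the posterior draws fall between `lo` and `hi`, which is what licenses the statement "given the data and the prior, θ is in this interval with probability 0.95".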
Question 4 True / False
If two researchers start with different priors but observe the same data, they will typically arrive at the same posterior distribution.
T. True
F. False
Answer: False
Different priors lead to different posteriors for the same data — posterior ∝ likelihood × prior, so if the prior differs, the product differs. With large amounts of data, the likelihood dominates and posteriors from 'reasonable' priors converge. But convergence with large samples is not the same as equality: with finite data, especially sparse data, different priors can produce substantively different posteriors. The choice of prior genuinely matters, particularly in small-sample or low-signal settings.
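A quick numerical check of both halves of this explanation, using a Beta prior with binomial data. The two priors and the flip counts are made up for illustration; the only thing that matters is that both researchers see identical data.

```python
def posterior_mean(prior_a, prior_b, heads, tails):
    """Mean of the Beta posterior after a Beta(prior_a, prior_b) prior and binomial data."""
    return (prior_a + heads) / (prior_a + prior_b + heads + tails)

skeptic = (2, 8)    # hypothetical prior with mean 0.2
optimist = (8, 2)   # hypothetical prior with mean 0.8

def gap(heads, tails):
    """Absolute difference between the two researchers' posterior means on the same data."""
    return abs(posterior_mean(*optimist, heads, tails) - posterior_mean(*skeptic, heads, tails))

small_gap = gap(6, 4)          # 10 flips, 60% heads
large_gap = gap(6000, 4000)    # 10,000 flips, 60% heads

print(f"Gap after 10 flips:     {small_gap:.4f}")
print(f"Gap after 10,000 flips: {large_gap:.4f}")
```

The gap shrinks as data accumulates but never reaches exactly zero, which is why the statement in the question is false even though the posteriors converge.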
Question 5 Short Answer
Explain what it means for the posterior to be a probability distribution over θ, and why this is philosophically different from a frequentist point estimate.
Model answer: In Bayesian inference, θ is treated as a random variable whose probability distribution represents our uncertainty about its true value. The posterior P(θ|data) assigns probability (or probability density) to every possible value of θ, encoding not just a best guess but the full shape of our uncertainty. A frequentist treats θ as a fixed (non-random) unknown constant, so 'probability over θ' is a category error in that framework.
The practical consequence is that the posterior supports direct probability statements: 'θ is 80% likely to be above 0.5,' 'the most probable value is 0.3,' 'the 95% credible region is [a,b].' All of these are valid posterior summaries. A frequentist cannot make such statements about θ because θ isn't random. This difference becomes practically important when combining evidence (the posterior of one study becomes the prior for the next), propagating uncertainty through decisions, or communicating results to stakeholders who naturally think in terms of probability over parameters.
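The sketch below illustrates two of these points with a Beta posterior: reading summaries directly off the distribution, and reusing one study's posterior as the next study's prior. The uniform starting prior and both sets of study counts are hypothetical, and the midpoint-rule integration is simply one way to get a tail probability using only the standard library.

```python
import math

def update(prior, heads, tails):
    """Beta-binomial update: returns the posterior (a, b) parameters."""
    a, b = prior
    return a + heads, b + tails

def beta_pdf(theta, a, b):
    """Density of Beta(a, b) at theta, for 0 < theta < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta))

def prob_above(threshold, a, b, n_grid=10_000):
    """P(theta > threshold) under Beta(a, b), via midpoint-rule integration."""
    width = (1 - threshold) / n_grid
    return sum(beta_pdf(threshold + (i + 0.5) * width, a, b) for i in range(n_grid)) * width

def beta_mode(a, b):
    """Most probable value of Beta(a, b); valid when a > 1 and b > 1."""
    return (a - 1) / (a + b - 2)

prior = (1, 1)                                             # uniform prior (an assumption)
after_study_1 = update(prior, heads=7, tails=3)            # posterior of study 1
after_study_2 = update(after_study_1, heads=12, tails=8)   # reused as prior for study 2
pooled = update(prior, heads=19, tails=11)                 # all data at once

p_above_half = prob_above(0.5, *after_study_2)
print(f"Sequential equals pooled: {after_study_2 == pooled}")
print(f"P(theta > 0.5 | data) = {p_above_half:.3f}")
print(f"Most probable theta   = {beta_mode(*after_study_2):.3f}")
```

Updating study by study gives the same posterior as analysing all the data together, and every printed quantity is a direct probability statement about θ.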