A Bayesian researcher reports: 'There is an 89% probability that the policy effect size is between 0.15 and 0.55 standard deviations.' A frequentist colleague responds: 'You cannot make direct probability statements about parameters — that's not how statistical intervals work.' Who is correct?
A. The frequentist colleague: neither framework permits direct probability statements about parameters
B. The Bayesian researcher: Bayesian credible intervals express the posterior probability that the parameter lies in the specified range, which is a valid and meaningful statement
C. Both are correct: Bayesian credible intervals and frequentist confidence intervals are mathematically equivalent with different labels
D. The frequentist colleague: only p-values provide meaningful probability statements about effect sizes
This is one of the most practically important distinctions in statistics. A frequentist 95% confidence interval means 'if we ran this study infinitely many times, 95% of computed intervals would contain the true parameter' — it is a statement about procedures, not about this specific interval. A Bayesian 89% credible interval means 'given the data and priors, there is an 89% posterior probability the parameter lies in this range' — a direct probability statement about the parameter. The Bayesian researcher is using the framework correctly; the frequentist critique would be valid if applied to a confidence interval but not to a credible interval.
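As a minimal sketch of what the Bayesian statement means computationally, assume the posterior for the effect size is approximately normal. The mean (0.35) and standard deviation (0.125) below are hypothetical values chosen for illustration, not figures from the researcher's analysis. The probability attached to the interval is the posterior mass between its endpoints.

```python
from statistics import NormalDist

# Hypothetical posterior for the policy effect (illustrative values only).
posterior = NormalDist(mu=0.35, sigma=0.125)


def interval_probability(dist, low, high):
    """Posterior probability that the parameter lies in [low, high]."""
    return dist.cdf(high) - dist.cdf(low)


def central_credible_interval(dist, mass):
    """Equal-tailed interval containing `mass` of the posterior."""
    tail = (1.0 - mass) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


prob = interval_probability(posterior, 0.15, 0.55)
low, high = central_credible_interval(posterior, 0.89)
print(f"P(0.15 < effect < 0.55 | data) = {prob:.3f}")
print(f"89% credible interval: ({low:.2f}, {high:.2f})")
```

Under these assumed values the interval from 0.15 to 0.55 holds roughly 89% of the posterior mass, which is the quantity the researcher is reporting.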
Question 2 Multiple Choice
A researcher argues that using an informative prior based on three previous studies (all finding effects near 0.4) makes Bayesian analysis 'unscientifically subjective,' unlike frequentist methods. What is the strongest response?
A. She is correct: all priors introduce subjectivity that frequentist methods avoid by design
B. Bayesian priors are only legitimate when all prior studies used identical methodology
C. Frequentist methods involve equivalent substantive assumptions (model specification, covariate selection, functional form) but state them implicitly rather than explicitly; informative priors based on existing evidence are a strength, not a flaw
D. Bayesian analysis should only use uninformative priors to remain objective
The claim that frequentist methods are uniquely objective is itself a misconception. Every statistical analysis embeds substantive assumptions: which predictors to include, what functional form to assume, what outcomes to measure. Bayesian analysis adds the prior to that list, but states it as an explicit distribution that can be examined and debated. Frequentist analysis makes comparable judgment calls through model specification, where they are often less transparent. When prior research exists, incorporating it through an informative prior is epistemically responsible; the alternative is pretending you know nothing when you actually know something.
Question 3 True / False
A 95% Bayesian credible interval and a 95% frequentist confidence interval both express the probability that the true parameter value lies within the specified range.
T. True
F. False
Answer: False
This is the most common confusion between the two frameworks. A frequentist confidence interval does NOT state that there is a 95% probability the true parameter lies in the interval: the true parameter is fixed (not random), and the interval either contains it or does not. The '95%' refers to the long-run coverage rate of the procedure across hypothetical repeated experiments. A Bayesian credible interval, by contrast, treats the parameter as having a probability distribution (the posterior) and directly states that 95% of the posterior probability mass lies in the given range, which is a genuine probability statement about where the parameter likely is.
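The long-run coverage reading of a confidence interval can be checked by simulation. The sketch below assumes normally distributed data with a known standard deviation, so a simple z-interval applies. The true mean, sample size, and number of repetitions are arbitrary illustrative choices.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage_rate(true_mean, sigma, n, reps, level=0.95, seed=0):
    """Fraction of repeated-experiment z-intervals that contain true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sigma / math.sqrt(n)  # known-sigma interval half-width
    hits = 0
    for _ in range(reps):
        sample_mean = fmean([rng.gauss(true_mean, sigma) for _ in range(n)])
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / reps


rate = coverage_rate(true_mean=0.4, sigma=1.0, n=30, reps=4000)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The share should land close to 0.95. Any single interval in the loop either contains 0.4 or does not; the 95% describes the procedure across all repetitions.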
Question 4 True / False
When sample sizes are small, Bayesian posterior estimates will be more strongly shaped by the prior distribution, which is epistemically appropriate because small data should produce smaller belief updates.
T. True
F. False
Answer: True
This is a feature of Bayesian inference, not a limitation. The posterior combines the prior with the likelihood of the data; in simple conjugate models, the posterior mean is a precision-weighted average of the prior mean and the sample mean. When data are abundant, the likelihood dominates and the posterior concentrates around the data-supported value largely regardless of the prior. When data are sparse, the prior has more influence, which correctly reflects that you should not update your beliefs dramatically on the basis of weak evidence. This property is particularly useful in social science, where small samples from natural experiments or comparative case studies are common.
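For a normal prior and normally distributed data with known variance, the balance between prior and data has a closed form. The sketch below uses hypothetical numbers (a prior centered at 0.4, an observed sample mean of 0.1, and a data standard deviation of 1.0) to show how the weight on the prior shrinks as the sample size grows.

```python
def normal_posterior(prior_mean, prior_sd, sample_mean, sigma, n):
    """Conjugate normal-normal update with known data standard deviation."""
    prior_precision = 1.0 / prior_sd**2
    data_precision = n / sigma**2
    prior_weight = prior_precision / (prior_precision + data_precision)
    post_mean = prior_weight * prior_mean + (1.0 - prior_weight) * sample_mean
    post_sd = (prior_precision + data_precision) ** -0.5
    return post_mean, post_sd, prior_weight


# Hypothetical prior N(0.4, 0.1^2); observed sample mean 0.1; sigma = 1.0.
for n in (10, 100, 10_000):
    mean, sd, weight = normal_posterior(0.4, 0.1, 0.1, 1.0, n)
    print(f"n={n:>6}: posterior mean={mean:.3f}, sd={sd:.3f}, "
          f"weight on prior={weight:.3f}")
```

With 10 observations the posterior mean stays near the prior's 0.4; with 10,000 it sits almost exactly on the sample mean.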
Question 5 Short Answer
Why are Bayesian hierarchical models particularly well-suited to social science phenomena like students nested within classrooms nested within districts, and what is the 'partial pooling' advantage they provide?
Model answer: Hierarchical Bayesian models represent nested structure by letting lower-level parameters (e.g., individual classroom effects) be drawn from a higher-level distribution (e.g., the distribution of classroom effects within a district), whose parameters are themselves estimated from the data. Partial pooling means each group's estimate is, in effect, a weighted average of its own data and the higher-level average, borrowing strength from the full dataset without forcing all groups to be identical. This avoids the two bad alternatives: ignoring group structure entirely (complete pooling) or treating each group as completely independent (no pooling), which produces noisy estimates for small groups.
The practical advantage is that a classroom with only 10 students gets a more stable estimate by partially pooling toward the district average, rather than being estimated solely from 10 data points. As a classroom's sample size grows, its estimate moves toward its own data and away from the district average. This adaptive regularization addresses one of the core challenges in multilevel social data: extreme heterogeneity in group sizes, with some groups having abundant data and others very little.
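The shrinkage behind partial pooling can be sketched with a simplified calculation. This is not a full hierarchical fit: it assumes the district mean, the within-classroom standard deviation, and the between-classroom standard deviation are already known, whereas a real model would estimate them jointly. All numbers are hypothetical.

```python
def partially_pooled_mean(group_mean, n, district_mean,
                          sigma_within, tau_between):
    """Shrink a classroom mean toward the district mean.

    The weight on the classroom's own data grows with its sample size n.
    """
    data_precision = n / sigma_within**2
    group_precision = 1.0 / tau_between**2
    weight = data_precision / (data_precision + group_precision)
    estimate = weight * group_mean + (1.0 - weight) * district_mean
    return estimate, weight


# Two hypothetical classrooms with the same raw mean but different sizes.
small, w_small = partially_pooled_mean(80.0, 10, 70.0, 10.0, 5.0)
large, w_large = partially_pooled_mean(80.0, 200, 70.0, 10.0, 5.0)
print(f"10 students:  estimate={small:.2f}, weight on own data={w_small:.2f}")
print(f"200 students: estimate={large:.2f}, weight on own data={w_large:.2f}")
```

Both classrooms have a raw mean of 80, but the 10-student classroom is pulled noticeably toward the district mean of 70 while the 200-student classroom barely moves.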