Questions: Bootstrap Methods for Statistical Inference
5 questions to test your understanding
Question 1 Multiple Choice
You have a sample of 400 observations and want a 95% confidence interval for a complex nonlinear estimator that has no closed-form variance formula. You run the nonparametric bootstrap with B = 4,999 replications. What do the bootstrap replications use as their source of data?
A. Simulated draws from a normal distribution fitted to the sample mean and variance
B. Repeated draws of 400 observations with replacement from the original 400-observation sample
C. Repeated draws of 400 observations without replacement, creating non-overlapping subsamples
D. The full population, approximated using the sample's empirical distribution function
Answer: B
The nonparametric bootstrap resamples from the original data with replacement. Each bootstrap sample has the same size (n = 400) as the original, but some observations appear multiple times and others not at all. The key insight: if the original sample approximates the population, then resampling from the sample approximates taking new samples from the population. Drawing 400 observations without replacement from a 400-observation sample (option C) would simply return the original sample in a different order, so an order-invariant estimator would take the same value in every replication. Simulating from a fitted normal distribution (option A) is the parametric bootstrap, which requires assuming a distributional form.
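The resampling step can be sketched in a few lines. This is a minimal illustration using only Python's standard library, not a reference implementation: the function names `bootstrap_resample` and `percentile_ci`, the simulated exponential data, the log-of-mean estimator, and the reduced B = 999 are all assumptions chosen for the example.

```python
import math
import random
import statistics

def bootstrap_resample(data, rng):
    """Draw len(data) observations from data with replacement."""
    return rng.choices(data, k=len(data))

def percentile_ci(data, estimator, n_boot=999, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for estimator(data)."""
    rng = random.Random(seed)
    reps = sorted(estimator(bootstrap_resample(data, rng)) for _ in range(n_boot))
    # Ranks of the lower and upper percentiles among the sorted replications.
    lo_idx = max(0, round((alpha / 2) * (n_boot + 1)) - 1)
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * (n_boot + 1)) - 1)
    return reps[lo_idx], reps[hi_idx]

def log_mean(xs):
    """Hypothetical nonlinear estimator: the log of the sample mean."""
    return math.log(statistics.fmean(xs))

rng = random.Random(42)
sample = [rng.expovariate(1.0) for _ in range(400)]  # simulated stand-in data

one_resample = bootstrap_resample(sample, rng)
# Same size as the original, but fewer distinct values because of repeats.
print(len(one_resample), len(set(one_resample)))

theta_hat = log_mean(sample)
lo, hi = percentile_ci(sample, log_mean)
print(theta_hat, lo, hi)
```

The percentile interval shown here is only one of several bootstrap interval constructions; it is used because it is the shortest to write down.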
Question 2 Multiple Choice
Why does the standard (nonparametric) bootstrap fail for time-series data without modification?
A. Time-series have too few observations for resampling to be reliable
B. Resampling individual observations independently breaks the serial correlation structure that time-series estimators depend on
C. Bootstrap confidence intervals are asymmetric, which conflicts with time-series symmetry
D. The bootstrap requires stationarity, and all time-series are non-stationary by definition
Answer: B
The standard bootstrap draws observations independently and uniformly at random. In time-series data, observations are serially correlated: the value at time t depends on values at t-1, t-2, etc. Resampling individual observations independently destroys this dependence structure, creating bootstrap samples that behave nothing like the actual data-generating process. The block bootstrap addresses this by resampling contiguous blocks of observations, preserving the within-block correlation while still generating variation across blocks.
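The contrast between the two resampling schemes can be seen on a simulated series. The sketch below assumes a moving (overlapping) block bootstrap with a fixed block length of 25 and an AR(1) series with coefficient 0.8; the helper names `moving_block_bootstrap` and `lag1_autocorr` are hypothetical, and the block length is picked for illustration rather than tuned.

```python
import random

def moving_block_bootstrap(series, block_len, rng):
    """Resample a series by concatenating randomly chosen contiguous blocks."""
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    n_starts = n - block_len + 1         # number of overlapping blocks available
    out = []
    while len(out) < n:
        start = rng.randrange(n_starts)  # pick a block start uniformly at random
        out.extend(series[start:start + block_len])
    return out[:n]                       # trim the last block to the original length

def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation."""
    mean = sum(xs) / len(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((a - mean) ** 2 for a in xs)
    return num / den

rng = random.Random(1)
# Simulated AR(1) series: x_t = 0.8 * x_{t-1} + noise (an assumption for illustration).
series = [0.0]
for _ in range(999):
    series.append(0.8 * series[-1] + rng.gauss(0, 1))

iid_resample = rng.choices(series, k=len(series))         # standard bootstrap
block_resample = moving_block_bootstrap(series, 25, rng)  # block bootstrap

print("original:", round(lag1_autocorr(series), 2))
print("iid resample:", round(lag1_autocorr(iid_resample), 2))
print("block resample:", round(lag1_autocorr(block_resample), 2))
```

The independent resample should show a lag-1 autocorrelation near zero, while the block resample stays close to the original, apart from the breaks where blocks are joined.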
Question 3 True / False
Bootstrap standard errors are valid for complex estimators with no closed-form variance formula, including ratios and nonlinear transformations of parameters.
T. True
F. False
Answer: True
This is a major practical advantage of the bootstrap. Classical variance formulas (like OLS standard errors) rely on specific algebraic structure in the estimator. For complex estimators (ratios of parameters, quantile regression coefficients, nonlinear GMM estimators, sample medians), deriving a closed-form standard error is often difficult or impossible. The bootstrap sidesteps this entirely: compute the estimator on each bootstrap resample, then take the standard deviation across replications. No algebraic derivation is needed. The computational procedure works for any well-behaved estimator, though it is known to fail in some irregular cases such as the sample maximum.
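The recipe "compute the estimator on each resample, then take the standard deviation" translates almost directly into code. The following is a sketch under stated assumptions: the data are simulated (x, y) pairs, the estimator is a ratio of means, and the names `bootstrap_se` and `ratio_of_means` are hypothetical.

```python
import random
import statistics

def bootstrap_se(rows, estimator, n_boot=1000, seed=0):
    """Standard deviation of the estimator across bootstrap resamples of rows."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_boot):
        resample = rng.choices(rows, k=len(rows))  # resample whole rows, with replacement
        reps.append(estimator(resample))
    return statistics.stdev(reps)

def ratio_of_means(rows):
    """Nonlinear estimator: mean(y) / mean(x) for rows of (x, y) pairs."""
    mean_x = statistics.fmean([x for x, _ in rows])
    mean_y = statistics.fmean([y for _, y in rows])
    return mean_y / mean_x

rng = random.Random(11)
# Simulated data (an assumption): y is roughly 2 * x plus noise.
rows = []
for _ in range(200):
    x = rng.uniform(1, 3)
    rows.append((x, 2 * x + rng.gauss(0, 0.5)))

estimate = ratio_of_means(rows)
se = bootstrap_se(rows, ratio_of_means)
print(estimate, se)
```

Because `bootstrap_se` takes the estimator as an argument, swapping in a median, a quantile, or another function of the rows requires no change to the resampling code. Rows are resampled as pairs so that the relationship between x and y is preserved in each resample.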
Question 4 True / False
By generating thousands of bootstrap resamples, the bootstrap creates additional information beyond what is contained in the original sample, improving the precision of the estimator.
T. True
F. False
Answer: False
The bootstrap does not manufacture new information — it only reorganizes and exploits information already in the original sample. Running more bootstrap replications (B = 999 vs. B = 9,999) improves the precision of the bootstrap standard error estimate itself, but does not change the underlying sampling distribution or reduce the estimator's true variability. If the original sample is small or unrepresentative, no number of bootstrap replications can fix that. The fundamental limit is always the quality and size of the original sample.
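A small simulation can make the distinction concrete. In this sketch (simulated normal data with n = 50, and a hypothetical helper `bootstrap_se_of_mean`), the whole bootstrap is repeated 20 times at two values of B. The average standard error estimate should stay near the analytic value s / sqrt(n) in both cases; only the run-to-run wobble of that estimate shrinks as B grows.

```python
import random
import statistics

def bootstrap_se_of_mean(data, n_boot, rng):
    """Bootstrap standard error of the sample mean using n_boot replications."""
    reps = [statistics.fmean(rng.choices(data, k=len(data))) for _ in range(n_boot)]
    return statistics.stdev(reps)

rng = random.Random(7)
data = [rng.gauss(0, 1) for _ in range(50)]  # one small simulated sample
analytic_se = statistics.stdev(data) / len(data) ** 0.5

center = {}  # average bootstrap SE across repeated runs, per B
spread = {}  # run-to-run standard deviation of the bootstrap SE, per B
for n_boot in (50, 2000):
    runs = [bootstrap_se_of_mean(data, n_boot, rng) for _ in range(20)]
    center[n_boot] = statistics.fmean(runs)
    spread[n_boot] = statistics.stdev(runs)
    print(n_boot, round(center[n_boot], 3), round(spread[n_boot], 4))

print("analytic:", round(analytic_se, 3))
```

Raising B reduces the simulation noise in the standard error estimate, but the standard error itself is pinned down by the 50 original observations.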
Question 5 Short Answer
Explain the fundamental insight behind the nonparametric bootstrap: what problem does it solve, and what key assumption must hold for it to be valid?
Model answer: The bootstrap solves the problem of approximating the sampling distribution of an estimator when we only have one sample instead of many. Normally, a sampling distribution requires imagining what an estimator would look like across repeated samples from the population — but in practice we have only one dataset. The bootstrap's insight is that if the sample is representative of the population, then the distribution of the estimator across resamples drawn from the sample approximates the distribution across samples drawn from the population. The key assumption is representativeness: the original sample must be an approximately unbiased snapshot of the population. The bootstrap cannot compensate for a biased or unrepresentative sample.
This insight — 'treat the sample as if it were the population, then simulate repeated sampling from it' — is what distinguishes bootstrap inference from classical derivation-based inference. Classical methods require knowing the form of the sampling distribution (e.g., assuming normality) or invoking asymptotic approximations. The bootstrap substitutes computation for mathematical derivation, making inference possible for estimators where the analytic route is blocked. The representativeness assumption is the bootstrap's Achilles heel: garbage in, garbage out.
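The "sample as stand-in for the population" idea can be checked in a setting where the population is known. The sketch below assumes an exponential population with rate 1, n = 400, and the sample median as the estimator; these choices are illustrative only. It compares the spread of the median across many fresh samples (normally unavailable in practice) with the spread across resamples of a single observed sample.

```python
import random
import statistics

rng = random.Random(3)
n = 400

def draw_sample():
    """One sample of size n from a known population (exponential, rate 1; an assumption)."""
    return [rng.expovariate(1.0) for _ in range(n)]

# What we usually cannot do: draw many fresh samples from the population.
true_reps = [statistics.median(draw_sample()) for _ in range(2000)]
true_se = statistics.stdev(true_reps)

# What the bootstrap does: resample a single observed sample instead.
observed = draw_sample()
boot_reps = [statistics.median(rng.choices(observed, k=n)) for _ in range(2000)]
boot_se = statistics.stdev(boot_reps)

print("repeated sampling:", round(true_se, 3))
print("bootstrap:", round(boot_se, 3))
```

The two numbers should be of similar size but not identical, since the bootstrap value depends on the one sample that happened to be observed. Replacing `draw_sample` for the observed data with a biased sampling scheme would break the agreement, which is the representativeness caveat in action.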