Questions: Relationships Between Modes of Convergence
5 questions to test your understanding
Question 1 Multiple Choice
A sequence of random variables Xₙ converges in distribution to a standard normal N(0,1). Which of the following is guaranteed?
A. Xₙ converges in probability to some random variable X
B. Xₙ converges almost surely to some random variable X
C. Each Xₙ is approximately normally distributed for large n
D. Convergence in distribution to N(0,1) does not guarantee any of the above
Answer: D
Convergence in distribution is the weakest of the standard modes: it requires only that the CDFs of the Xₙ converge to the limiting CDF at its continuity points (here at every point, since Φ is continuous). It does not imply convergence in probability or almost sure convergence. For example, if Z ~ N(0,1) and Xₙ = (−1)ⁿ Z, every Xₙ is exactly N(0,1), yet the sequence converges neither in probability nor almost surely. Option C is the subtlest trap: only the distribution functions are guaranteed to be close to Φ. The Xₙ need not be normal in any stronger sense; they can be discrete, have no density, or have heavy tails. Convergence in distribution is a statement about distributions, not about individual random variables being 'close' to anything.
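As a rough illustration of the trap in option C, the sketch below takes the standardized Binomial(n, 1/2) as a stand-in sequence (an example chosen here, not one from the question) and computes the largest gap between its CDF and Φ. The helper names `normal_cdf` and `kolmogorov_distance_binomial` are hypothetical. The gap shrinks as n grows, even though each Xₙ stays a discrete variable with only n + 1 possible values.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kolmogorov_distance_binomial(n):
    """Largest gap |F_n(x) - Phi(x)| for the standardized Binomial(n, 1/2).

    F_n is a step function, so the largest gap occurs at one of its atoms,
    either just before the jump or just after it.
    """
    mean = n / 2.0
    sd = math.sqrt(n) / 2.0
    cdf_before = 0.0
    worst = 0.0
    for k in range(n + 1):
        atom = (k - mean) / sd                # standardized location of atom k
        pmf = math.comb(n, k) / (1 << n)      # P(Binomial(n, 1/2) = k)
        cdf_after = cdf_before + pmf
        phi = normal_cdf(atom)
        worst = max(worst, abs(cdf_before - phi), abs(cdf_after - phi))
        cdf_before = cdf_after
    return worst


for n in (25, 100, 400):
    # Columns: n, number of atoms of X_n, largest CDF gap to Phi
    print(n, n + 1, round(kolmogorov_distance_binomial(n), 4))
```

The CDFs get close to Φ, but no Xₙ in this sequence has a density, which is one concrete sense in which "approximately normal" can fail.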
Question 2 Multiple Choice
The typewriter sequence on [0,1] converges to 0 in probability. What does this sequence demonstrate about the relationship between convergence modes?
A. Convergence in probability implies almost sure convergence
B. Almost sure convergence implies convergence in probability
C. Convergence in probability does not imply almost sure convergence
D. The typewriter sequence also converges almost surely, so no implication fails
Answer: C
The typewriter sequence converges to 0 in probability, because P(Xₙ = 1) equals the length of the n-th window, which shrinks to 0. It does NOT converge almost surely: for every point ω in [0,1], Xₙ(ω) = 1 infinitely often, since each sweep of the window across [0,1] covers ω again, so Xₙ(ω) never settles at 0. This is the canonical counterexample showing that convergence in probability does NOT imply almost sure convergence: the implication in the hierarchy runs only in the other direction.
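The behavior described above can be checked with a small sketch. It assumes the usual dyadic indexing of the typewriter sequence, which the question does not spell out: write n = 2^k + j with 0 ≤ j < 2^k and let Xₙ be the indicator of [j/2^k, (j+1)/2^k]. The function names and the sample point ω = 1/3 are illustrative choices.

```python
from fractions import Fraction


def typewriter_interval(n):
    """Return (left, right) for the n-th window, n >= 1, with n = 2**k + j."""
    k = n.bit_length() - 1        # block index: 2**k <= n < 2**(k + 1)
    j = n - (1 << k)              # position of the window inside block k
    width = Fraction(1, 1 << k)   # window length, equal to P(X_n = 1)
    return j * width, (j + 1) * width


def x_n(n, omega):
    """Value of the n-th typewriter variable at the sample point omega."""
    left, right = typewriter_interval(n)
    return 1 if left <= omega <= right else 0


omega = Fraction(1, 3)
for k in range(6):
    block = range(1 << k, 1 << (k + 1))
    hits = [n for n in block if x_n(n, omega) == 1]
    left, right = typewriter_interval(1 << k)
    # Columns: block k, P(X_n = 1) within the block, indices n where X_n(omega) = 1
    print(k, right - left, hits)
```

Every block contains at least one index where Xₙ(ω) = 1, so the ones never stop, while the window length in block k is 2^(−k) and tends to 0.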
Question 3 True / False
Almost sure convergence implies convergence in probability.
T. True
F. False
Answer: True
This is one of the strict implications in the hierarchy: it holds, and its converse does not. If Xₙ → X almost surely, then for every ε > 0, P(|Xₙ − X| > ε) → 0, which is precisely convergence in probability. The proof applies continuity of the probability measure to the decreasing events {|Xₘ − X| > ε for some m ≥ n}. The converse fails: the typewriter sequence converges in probability but not almost surely.
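One way to spell out the continuity argument is sketched below. The events Aₙ(ε) are notation introduced here for the derivation, not taken from the question.

```latex
\begin{align*}
A_n(\varepsilon) &= \bigcup_{m \ge n} \{\, |X_m - X| > \varepsilon \,\},
  \qquad A_1(\varepsilon) \supseteq A_2(\varepsilon) \supseteq \cdots \\
\bigcap_{n \ge 1} A_n(\varepsilon)
  &= \{\, |X_m - X| > \varepsilon \text{ for infinitely many } m \,\}
  \subseteq \{\, X_n \not\to X \,\} \\
\lim_{n \to \infty} P\bigl(A_n(\varepsilon)\bigr)
  &= P\Bigl(\bigcap_{n \ge 1} A_n(\varepsilon)\Bigr)
  \le P(X_n \not\to X) = 0 \\
P\bigl(|X_n - X| > \varepsilon\bigr)
  &\le P\bigl(A_n(\varepsilon)\bigr) \longrightarrow 0
\end{align*}
```

The third line is where continuity of the probability measure along a decreasing sequence of events is used. The last line follows because the event at time n is contained in Aₙ(ε).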
Question 4 True / False
If Xₙ converges in distribution to a standard normal, then for large n, each Xₙ is approximately a standard normal random variable.
T. True
F. False
Answer: False
Convergence in distribution says only that the CDFs of the Xₙ converge to Φ. It does not make any Xₙ a normal random variable, and it says nothing about Xₙ being close, outcome by outcome, to a particular standard normal variable. The Xₙ and the limit variable X do not even need to be defined on the same probability space. For example, let Xₙ equal a standard normal draw with probability 1 − 1/n and a standard Cauchy draw with probability 1/n. Then Xₙ converges in distribution to N(0,1), yet for every n it has no finite mean, so no individual Xₙ is normal.
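The pathwise point can be illustrated with a sign-flip example, sketched here under the assumption that Z ~ N(0,1) and Xₙ = −Z for every n (an example added for illustration; `sign_flip_pairs` is a hypothetical helper). Each Xₙ has exactly the N(0,1) distribution, so the sequence trivially converges in distribution to Z, but |Xₙ − Z| = 2|Z| never shrinks.

```python
import random


def sign_flip_pairs(n_samples, seed=0):
    """Draw Z ~ N(0,1) and return (z, x) with x = -z, sample by sample."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    x = [-value for value in z]
    return z, x


z, x = sign_flip_pairs(10_000)

# Same distribution: the empirical CDFs at a test point nearly agree.
frac_z = sum(v <= 1.0 for v in z) / len(z)
frac_x = sum(v <= 1.0 for v in x) / len(x)

# Not close pathwise: the average gap |X - Z| = 2|Z| stays well away from 0.
mean_gap = sum(abs(a - b) for a, b in zip(x, z)) / len(z)

print(round(frac_z, 3), round(frac_x, 3), round(mean_gap, 3))
```

The two empirical CDF values should be close to each other, while the average gap stays large, which is the sense in which agreement in distribution carries no pathwise information.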
Question 5 Short Answer
Explain why the distinction between the strong law of large numbers (almost sure convergence) and the weak law (convergence in probability) is substantive, not merely technical, even though both say the sample mean 'converges to' the true mean.
Model answer: The strong law says that, with probability 1, the sample average converges to μ along the realized sequence: for almost every sample path, the average eventually enters and stays within any ε of μ. The weak law says only that, for each fixed ε, the probability that the sample average at time n is farther than ε from μ shrinks to zero as n grows. On its own, convergence in probability leaves room for a given path to make large deviations at infinitely many n, as long as the probability of a deviation at each individual n tends to zero (the typewriter sequence behaves exactly this way). The strong law rules out such recurring deviations on almost every path; the weak law controls the probability at each n separately.
The hierarchy matters for applications: the strong law licenses interpreting the long-run frequency along a single realized sequence as a probability, while the weak law guarantees only that, at each large n, the sample average is close to μ with high probability. Results that assert almost sure convergence (e.g., Glivenko-Cantelli) cannot be proved by appeals to the weak law alone.
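The two kinds of statement can be compared numerically with a rough simulation sketch. It assumes fair coin flips (μ = 1/2) and uses a finite horizon as a stand-in for "all later times", so it only approximates the events involved. The function name, ε, path count and horizon are arbitrary illustrative choices.

```python
import random


def deviation_frequencies(n, horizon, eps=0.1, n_paths=200, seed=0):
    """Estimate two probabilities for fair coin flips.

    at_n:      P(|mean_n - 1/2| > eps), the quantity the weak law controls.
    from_n_on: P(|mean_m - 1/2| > eps for some n <= m <= horizon), a
               finite-horizon proxy for the quantity the strong law controls.
    """
    rng = random.Random(seed)
    at_n = 0
    from_n_on = 0
    for _ in range(n_paths):
        heads = 0
        far_at_n = False
        far_later = False
        for m in range(1, horizon + 1):
            heads += rng.random() < 0.5       # a bool adds as 0 or 1
            if m >= n and abs(heads / m - 0.5) > eps:
                far_later = True
                if m == n:
                    far_at_n = True
        at_n += far_at_n
        from_n_on += far_later
    return at_n / n_paths, from_n_on / n_paths


for n in (10, 50, 200):
    # Columns: n, deviation frequency at time n, deviation frequency from n onward
    print(n, deviation_frequencies(n, horizon=1000))
```

The second frequency is never smaller than the first, because a deviation at time n is one of the deviations counted from n onward. The strong law is the claim that this larger quantity also tends to 0 as n grows (with the horizon unbounded).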