A sequence {Xₙ} converges to 0 in probability. Which of the following is necessarily true for large n?
A. Xₙ = 0 almost surely: the random variable eventually equals its limit
B. The variance of Xₙ approaches 0
C. For any ε > 0, P(|Xₙ| > ε) → 0 as n → ∞
D. Xₙ converges to 0 almost surely: every sample path eventually stays near 0
Option C is the literal definition of convergence in probability, so it is necessarily true. Option A is wrong: Xₙ can still take non-zero values; what matters is that the probability of large deviations vanishes, not that Xₙ equals zero. Option D describes almost sure convergence, a strictly stronger notion that convergence in probability does not imply. Option B is not necessarily true: consider Xₙ = n with probability 1/n and 0 otherwise. This converges to 0 in probability (P(|Xₙ| > ε) = 1/n → 0 for n > ε), yet E[Xₙ] = 1 and E[Xₙ²] = n²·(1/n) = n, so Var(Xₙ) = n − 1, which diverges.
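The counterexample for option B can be checked with exact arithmetic. The following is a minimal sketch; the helper names `tail_probability` and `variance` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def tail_probability(n, eps):
    """P(|X_n| > eps) when X_n = n with probability 1/n and 0 otherwise."""
    # X_n exceeds eps only on the event {X_n = n}, and only when n > eps.
    return Fraction(1, n) if n > eps else Fraction(0)

def variance(n):
    """Var(X_n) = E[X_n^2] - (E[X_n])^2, computed exactly."""
    p = Fraction(1, n)
    mean = n * p               # E[X_n] = 1 for every n
    second_moment = n * n * p  # E[X_n^2] = n
    return second_moment - mean ** 2

# The tail probability shrinks while the variance grows.
for n in (10, 100, 1000):
    print(n, tail_probability(n, Fraction(1, 2)), variance(n))
```

The tail probability falls like 1/n while the variance grows like n − 1, so convergence in probability places no constraint on the variance.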
Question 2 Multiple Choice
Which scenario correctly describes how convergence in probability can fail to imply almost sure convergence?
A. Almost sure convergence requires a finite probability space; convergence in probability applies to any space
B. A sequence may converge in probability to X, yet individual sample paths may not converge to X; the 'typewriter sequence' is a canonical counterexample
C. Almost sure convergence requires the sequence to be monotone; convergence in probability has no such restriction
D. Convergence in probability is actually stronger than almost sure convergence, because it must hold for all ε simultaneously
The correct answer is B. The typewriter sequence is constructed on [0,1] with uniform probability: X₁ = 1_{[0,1]}, X₂ = 1_{[0,1/2]}, X₃ = 1_{[1/2,1]}, X₄ = 1_{[0,1/4]}, X₅ = 1_{[1/4,1/2]}, and so on, sweeping across [0,1] with intervals of length 1, 1/2, 1/4, ... in turn. P(Xₙ = 1) is the length of the n-th interval, which tends to 0, so Xₙ → 0 in probability. But for every outcome ω ∈ [0,1], Xₙ(ω) = 1 infinitely often, because each sweep covers every point at least once. It is also 0 infinitely often, so Xₙ(ω) does not converge for any ω. Individual paths misbehave; only the probability of misbehavior vanishes.
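The construction can be made concrete with a short sketch. The indexing below (block k holds the indices 2^k ≤ n < 2^(k+1)) is one way to reproduce the intervals listed above; the function names are hypothetical.

```python
from fractions import Fraction

def typewriter_interval(n):
    """Return (left, right) of the interval whose indicator is X_n, for n >= 1."""
    k = n.bit_length() - 1       # block index: 2**k <= n < 2**(k + 1)
    j = n - 2 ** k               # position within block k, 0 <= j < 2**k
    width = Fraction(1, 2 ** k)  # equals P(X_n = 1) under the uniform measure
    return j * width, (j + 1) * width

def x(n, omega):
    """Value of X_n at the outcome omega in [0, 1]."""
    left, right = typewriter_interval(n)
    return 1 if left <= omega <= right else 0

# For a fixed outcome, count how often X_n = 1 inside each block.
omega = Fraction(1, 3)
for k in range(5):
    block = range(2 ** k, 2 ** (k + 1))
    hits = sum(x(n, omega) for n in block)
    print(k, Fraction(1, 2 ** k), hits)
```

Each block has interval width 2^(−k), which shrinks to 0, yet every block contains at least one index n with Xₙ(ω) = 1. That is exactly why the path at ω never settles at 0.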
Question 3 True / False
If Xₙ converges to X in probability, then for sufficiently large n, most realizations of Xₙ will fall within ε of X with probability 1.
T. True
F. False
Answer: False
Convergence in probability means P(|Xₙ − X| > ε) → 0, not that this probability ever equals 0. The probability of a large deviation merely becomes arbitrarily small. Individual realizations can still land far from X; the claim is that such events become increasingly rare, not that they become impossible. Almost sure convergence is a stronger statement: for every ε > 0, the set of sample paths that deviate from X by more than ε infinitely often has probability zero.
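Written side by side, the two definitions differ in whether the probability is taken at a single index n or over the whole tail of the sequence. The following is a standard formulation, given here for comparison:

```latex
\begin{align*}
X_n \xrightarrow{\,P\,} X
  &\iff \forall \varepsilon > 0:\ \lim_{n\to\infty} P\bigl(|X_n - X| > \varepsilon\bigr) = 0, \\
X_n \xrightarrow{\text{a.s.}} X
  &\iff \forall \varepsilon > 0:\ P\bigl(|X_n - X| > \varepsilon \text{ infinitely often}\bigr) = 0 \\
  &\iff \forall \varepsilon > 0:\ \lim_{n\to\infty} P\Bigl(\sup_{m \ge n} |X_m - X| > \varepsilon\Bigr) = 0.
\end{align*}
```

The last line makes the comparison direct: almost sure convergence controls the worst deviation over all m ≥ n, while convergence in probability controls only the deviation at n itself.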
Question 4 True / False
The Weak Law of Large Numbers establishes that the sample mean converges to the true mean in probability (not almost surely).
T. True
F. False
Answer: True
This is correct. The WLLN states that for iid random variables with finite mean μ, the sample mean X̄ₙ → μ in probability: P(|X̄ₙ − μ| > ε) → 0 for all ε > 0. The Strong Law of Large Numbers gives the stronger almost sure convergence result. In Kolmogorov's formulation it holds under the same conditions (iid with finite mean); finite variance or higher moments are assumed only in some simpler textbook proofs. The distinction matters: the WLLN says that for each large n the sample mean is close to μ with high probability; the SLLN says that the whole sequence of sample means converges to μ along almost every sample path.
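The WLLN statement can be illustrated exactly in one special case. The sketch below assumes fair coin flips (iid Bernoulli(1/2), so μ = 1/2), which is a choice made here for illustration and not part of the question. The function name `deviation_probability` is hypothetical.

```python
from fractions import Fraction
from math import comb

def deviation_probability(n, eps):
    """Exact P(|mean of n fair coin flips - 1/2| > eps), with eps a Fraction."""
    half = Fraction(1, 2)
    favourable = 0
    for k in range(n + 1):  # k = number of heads
        if abs(Fraction(k, n) - half) > eps:
            favourable += comb(n, k)  # number of outcomes with k heads
    return favourable / 2 ** n  # all 2**n outcomes are equally likely

# The deviation probability is positive for each n but shrinks as n grows.
for n in (10, 100, 1000):
    print(n, deviation_probability(n, Fraction(1, 10)))
```

This computes the deviation probability at each fixed n, which is all the WLLN speaks about. It says nothing about whether a single running sequence of sample means eventually stays within ε of μ, which is the SLLN's claim.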
Question 5 Short Answer
What is the key difference between convergence in probability and almost sure convergence, and why is the weaker notion still mathematically useful?
Model answer: Almost sure convergence requires that for almost every individual sample path (every outcome except a set of probability zero), Xₙ(ω) → X(ω). The sequence behaves well path-by-path. Convergence in probability only requires that, for each ε > 0, the probability that Xₙ deviates from X by more than ε goes to zero as n → ∞. It says nothing about any single path, so every path may still deviate infinitely often. Convergence in probability is weaker: almost sure convergence implies it, but not vice versa. It remains useful because (1) many important theorems (WLLN, consistency of estimators) naturally produce it, (2) it is often easier to prove, and (3) for many statistical applications, such as estimators converging to true parameters, this weaker guarantee is sufficient to justify procedures.
The hierarchy of convergence modes matters in probability theory: almost sure convergence implies convergence in probability, which in turn implies convergence in distribution. Knowing which mode you have tells you what properties carry through limiting arguments and which do not. For example, convergence in probability is preserved by continuous functions (the continuous mapping theorem): if Xₙ → X in probability and g is continuous, then g(Xₙ) → g(X) in probability. This makes it highly practical for asymptotic statistics.