Two researchers study the same population. Researcher A simulates all 10,000 individuals forward for 1,000 generations. Researcher B uses the coalescent to trace the ancestry of 50 sampled gene copies backward in time. Which claim about Researcher B's approach is most accurate?
A. Researcher B's approach is less accurate because it ignores most individuals in the population
B. Researcher B's approach is more computationally efficient because it models only the ancestry of the actual sample, discarding irrelevant lineages
C. Researcher B's approach requires knowing the full genealogy of all 10,000 individuals before it can begin
D. Researcher B's approach is only valid if the population has constant size over time
The computational power of coalescent theory comes from ignoring the vast majority of individuals in the population, namely those who left no descendants in the current sample. Forward simulations must track all individuals, yet most of their lineages die out before the present and contribute nothing to the sample's genealogy. The coalescent skips that wasted computation and models only the ancestry of the sampled gene copies. This is not a loss of accuracy: it is a more efficient parameterization of the same probability model.
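To make the difference in workload concrete, here is a minimal sketch of a backward-time simulation for Researcher B's sample of 50 gene copies. It assumes a diploid population of constant effective size Ne = 10,000 (the question does not state ploidy or Ne, so both are assumptions), and the function name `simulate_coalescent_times` is made up for this illustration.

```python
import random

def simulate_coalescent_times(n_samples, ne, rng):
    """Return the times (generations before present) of the coalescent events.

    Assumes a diploid population of constant effective size ne, so that
    k lineages coalesce at rate k(k-1)/(4*ne) per generation.
    """
    times = []
    t = 0.0
    k = n_samples
    while k > 1:
        rate = k * (k - 1) / (4.0 * ne)
        t += rng.expovariate(rate)  # waiting time until the next coalescence
        times.append(t)
        k -= 1                      # two lineages merge into one
    return times

rng = random.Random(1)
event_times = simulate_coalescent_times(n_samples=50, ne=10_000, rng=rng)

coalescent_events = len(event_times)  # always n_samples - 1
forward_updates = 10_000 * 1_000      # individuals x generations tracked by Researcher A
print(coalescent_events, forward_updates)
```

The backward simulation draws one waiting time per coalescent event, so a sample of 50 needs exactly 49 random draws however large the population is. A forward simulation of the same scenario has to update on the order of 10,000 × 1,000 individual-generations.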
Question 2 Multiple Choice
A population experienced a severe bottleneck (dramatic reduction in size) several thousand generations ago. What signature in the gene tree of a present-day sample would indicate this bottleneck?
A. Long, evenly spaced branches throughout the tree indicating slow, steady coalescence at all times
B. A burst of coalescent events concentrated in the period of small population size, with most lineages merging during that narrow window
C. No effect: coalescence rate depends only on sample size, not effective population size
D. Lineages that fail to coalesce at all, creating an unresolved polytomy at the root
With k lineages in a diploid population, the coalescence rate is k(k−1)/(4Ne) per generation, so lineages coalesce rapidly when Ne is small. A bottleneck creates a narrow window of very small Ne through which many lineages must pass, causing a burst of coalescent events clustered in time. This is visible in the gene tree as a dense cluster of nodes at the depth corresponding to the bottleneck. A population expansion has the opposite signature: lineages persist independently for many generations, producing a star-like tree with many long, parallel branches that coalesce only near the root.
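The burst of coalescence during a bottleneck can be sketched with a piecewise-constant Ne. Every number below is a hypothetical value chosen for illustration, not taken from the question: Ne = 100,000 outside the bottleneck, Ne = 50 inside it, and a bottleneck lasting from 3,000 to 3,100 generations ago. The function names are also made up for this example.

```python
import random

def ne_at(t, ne_normal, ne_bottleneck, start, end):
    """Piecewise-constant effective size at t generations before the present."""
    return ne_bottleneck if start <= t < end else ne_normal

def simulate_with_bottleneck(n_samples, ne_normal, ne_bottleneck, start, end, rng):
    """Coalescent event times under piecewise-constant Ne (diploid, rate k(k-1)/(4*Ne))."""
    boundaries = [start, end]
    times, t, k = [], 0.0, n_samples
    while k > 1:
        ne = ne_at(t, ne_normal, ne_bottleneck, start, end)
        rate = k * (k - 1) / (4.0 * ne)
        wait = rng.expovariate(rate)
        # Next time at which Ne changes, strictly after t (infinity if none).
        next_boundary = min((b for b in boundaries if b > t), default=float("inf"))
        if t + wait >= next_boundary:
            # No coalescence before Ne changes: move to the boundary and redraw.
            # This is valid because exponential waiting times are memoryless.
            t = next_boundary
            continue
        t += wait
        times.append(t)
        k -= 1
    return times

rng = random.Random(7)
times = simulate_with_bottleneck(50, 100_000, 50, 3_000, 3_100, rng)
in_window = sum(3_000 <= t < 3_100 for t in times)
print(f"{in_window} of {len(times)} coalescent events fall inside the 100-generation bottleneck")
```

Under these assumed parameters, most of the 49 events typically land inside the 100-generation window, although it covers only a small fraction of the tree's total depth. Individual runs vary, so the printed count is best read as one draw from a distribution.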
Question 3 True / False
In coalescent theory, the expected time for two randomly sampled gene copies to coalesce to their common ancestor increases with effective population size (Ne).
T. True
F. False
Answer: True
In a diploid population, the probability that two gene copies coalesce in a single generation is 1/(2Ne), so the expected waiting time is 2Ne generations. A larger population means any two copies are less likely to share the same parent in any given generation, so coalescence takes longer on average. This is the direct connection between Ne and the branch lengths in gene trees: larger populations produce longer branches and more ancient common ancestors.
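The step from the per-generation probability to the expected waiting time can be written out as follows. This sketch assumes a diploid Wright-Fisher population with non-overlapping generations, which the question does not state explicitly. The symbols T (the coalescence time of two copies) and T_k (the waiting time while k lineages remain) are introduced here for illustration.

```latex
\begin{align*}
P(T = t) &= \left(1 - \frac{1}{2N_e}\right)^{t-1} \frac{1}{2N_e}, \qquad t = 1, 2, \dots \\
E[T] &= \sum_{t=1}^{\infty} t \, P(T = t) = \frac{1}{1/(2N_e)} = 2N_e \\
E[T_k] &\approx \frac{1}{\binom{k}{2} \cdot \frac{1}{2N_e}} = \frac{4N_e}{k(k-1)} \qquad (k \ll N_e)
\end{align*}
```

The first two lines are the mean of a geometric distribution with success probability 1/(2Ne). The third line extends the same argument to k lineages: any of the k(k−1)/2 pairs may coalesce, which gives the per-generation rate k(k−1)/(4Ne).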
Question 4 True / False
Coalescent theory traces the evolution of an entire population forward in time, predicting which lineages will survive to the present generation.
T. True
F. False
Answer: False
This describes forward-time population simulation, which is the opposite of coalescent theory. Coalescent theory starts from a sample of present-day gene copies and traces their ancestry *backward* in time until all lineages converge on a single common ancestor (the MRCA). By reversing the direction of time, it avoids simulating the many lineages in the population that are irrelevant to the sample, making it dramatically more efficient for inference from genetic data.
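The backward-in-time view can be sketched directly: starting from the sample, each lineage picks a parent in the previous generation, and lineages that pick the same parent merge. The code below assumes haploid Wright-Fisher reproduction among 20,000 gene copies (a hypothetical figure, for example 10,000 diploid individuals), and `trace_to_mrca` is a name invented for this example.

```python
import random

def trace_to_mrca(n_samples, n_gene_copies, rng):
    """Trace sampled lineages backward until a single ancestor remains.

    Each generation back, every remaining lineage picks a parent uniformly
    at random among n_gene_copies (haploid Wright-Fisher reproduction, an
    assumption of this sketch). Lineages that pick the same parent coalesce.
    Returns (generations back to the MRCA, lineage count per generation).
    """
    lineages = n_samples
    history = [lineages]
    generations = 0
    while lineages > 1:
        parents = {rng.randrange(n_gene_copies) for _ in range(lineages)}
        lineages = len(parents)  # distinct parents = surviving lineages
        generations += 1
        history.append(lineages)
    return generations, history

rng = random.Random(3)
tmrca, history = trace_to_mrca(n_samples=50, n_gene_copies=20_000, rng=rng)
print(f"MRCA reached {tmrca} generations back; lineages tracked per generation never exceeded {max(history)}")
```

The loop never handles more than 50 lineages in any generation, regardless of how many gene copies the population contains. Unsampled individuals appear only as the pool from which parents are drawn.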
Question 5 Short Answer
Why is coalescent theory especially efficient for analyzing genomic data compared to forward-time population simulations?
Model answer: Forward-time simulation must track every individual in the population through every generation, even though most lineages are irrelevant to the sample eventually observed. Coalescent theory focuses only on the ancestry of the sampled gene copies, modeling their genealogy backward in time and ignoring all population members who left no descendants in the sample. For large populations (Ne = millions) over many generations, this reduces computational cost by orders of magnitude. Additionally, the coalescent directly parameterizes the quantities of interest — coalescence times and tree topology — in terms of population genetic parameters (Ne, migration rates), making it ideal for statistical inference.
The key insight is that the coalescent does not trade accuracy for speed. When the sample is much smaller than Ne, it gives essentially the same probability distribution over the sample's genealogies as the forward model, just computed from the sample's perspective rather than the population's. The efficiency gain comes from discarding irrelevant lineages, not from sacrificing precision.