Questions: Phylogenetic Inference: Parsimony, Distance, and Maximum Likelihood
5 questions to test your understanding
Question 1 Multiple Choice
Two rapidly evolving lineages — one from birds, one from lizards — have independently accumulated many convergent mutations in a gene. A maximum parsimony analysis groups them as sister taxa. Why is this result likely an artifact?
A. Parsimony never makes errors with molecular data; only morphological data misleads it
B. Long-branch attraction: parsimony interprets convergently accumulated similarity as shared common ancestry, incorrectly grouping fast-evolving lineages together
C. Parsimony correctly identifies them as closely related, because similarity always reflects shared ancestry
D. Distance methods would make the same error, confirming the grouping is likely correct
Answer: B
Long-branch attraction is a systematic failure mode of maximum parsimony when some lineages evolve much faster than others. Rapidly evolving lineages independently accumulate many changes, and with only four nucleotide states some of those changes coincide by chance (homoplasy). Parsimony, which minimizes the total number of changes, counts that shared similarity as evidence of common ancestry, even though it is convergent rather than inherited from a recent common ancestor. Maximum likelihood is less susceptible because it explicitly models substitution probabilities along branches of different lengths, so it can recognize that an apparent synapomorphy on two long branches is more plausibly explained by convergence than by shared ancestry.
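The effect can be reproduced on a toy example. The sketch below scores two candidate trees with the Fitch small-parsimony algorithm. The taxon names and the five-site alignment are invented for illustration (three sites where the two fast lineages converged on T, two sites carrying true bird-versus-lizard signal); they are not real data.

```python
def fitch(tree, states):
    """Fitch pass on a rooted binary tree given as nested tuples.

    Returns (possible ancestral states, number of changes) for one site.
    """
    if isinstance(tree, str):  # leaf: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    left_states, left_changes = fitch(left, states)
    right_states, right_changes = fitch(right, states)
    shared = left_states & right_states
    if shared:  # children can agree: no extra change needed
        return shared, left_changes + right_changes
    # children disagree: take the union and count one change
    return left_states | right_states, left_changes + right_changes + 1


def tree_length(tree, alignment):
    """Total parsimony length: sum of Fitch changes over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch(tree, site)[1]
    return total


# Hypothetical toy alignment (assumption, not real sequence data).
# Sites 1-3: fast lineages converged on T. Sites 4-5: true clade signal.
toy_alignment = {
    "bird_fast": "TTTCC",
    "bird_slow": "AAACC",
    "lizard_fast": "TTTGG",
    "lizard_slow": "AAAGG",
}

true_tree = (("bird_fast", "bird_slow"), ("lizard_fast", "lizard_slow"))
artifact_tree = (("bird_fast", "lizard_fast"), ("bird_slow", "lizard_slow"))

print(tree_length(true_tree, toy_alignment))      # 8
print(tree_length(artifact_tree, toy_alignment))  # 7
```

Under these assumptions the artifact tree is one step shorter, so parsimony prefers it: each convergent site costs two changes on the true tree but only one on the tree that joins the fast lineages.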
Question 2 Multiple Choice
A researcher runs parsimony, neighbor-joining, and maximum likelihood on the same dataset. Parsimony and neighbor-joining agree, but maximum likelihood gives a different tree. What should the researcher conclude?
A. The maximum likelihood tree is wrong because two independent methods agree against it
B. The maximum likelihood tree is certainly correct because it uses the most rigorous statistical model
C. The conflict should trigger further investigation (testing model fit, checking for rate variation, or gathering more data) rather than automatically accepting the majority result
D. The three trees should be averaged to produce the best consensus estimate
Answer: C
Agreement between methods suggests a robust result, but a conflict cannot be settled by counting votes. Parsimony and neighbor-joining can share failure modes: both are sensitive to rate variation and long-branch attraction, so they may go wrong together under conditions that favor those biases. Maximum likelihood under an appropriate model generally outperforms both, but a misspecified model can bias it too. The right response to disagreement is investigation: model selection tests, simulation, or gathering more characters. Phylogenetic practice treats method conflict as a signal that the data have limitations requiring further study.
Question 3 True / False
Distance-based phylogenetic methods cluster species by pairwise evolutionary distances computed from character data, discarding information about which specific character changes occurred.
T. True
F. False
Answer: True
This accurately describes both the strength and the limitation of distance methods. They collapse the full character matrix into a single pairwise distance for each species pair (typically the fraction of differing sites, corrected for multiple substitutions), then apply a clustering algorithm such as neighbor-joining. The main advantage is computational speed on large datasets. The limitation is information loss: two very different patterns of change can produce identical distances, so distance methods discard phylogenetic signal that parsimony and likelihood methods retain by examining individual characters.
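A minimal sketch of that collapse, assuming aligned sequences of equal length and using the Jukes-Cantor (JC69) formula d = -(3/4) ln(1 - 4p/3) as one common correction for multiple substitutions. The three sequences are made up for illustration.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def jc69_distance(p):
    """Jukes-Cantor corrected distance; only defined for p < 0.75."""
    if p >= 0.75:
        raise ValueError("JC69 correction is undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical sequences (assumption): the two variants differ from the
# reference at completely different sites.
reference = "ACGTACGTAC"
variant_x = "CAGTACGTAC"  # differs at sites 1 and 2
variant_y = "ACGTACGTCA"  # differs at sites 9 and 10

p_x = p_distance(reference, variant_x)
p_y = p_distance(reference, variant_y)
print(p_x, p_y)            # 0.2 0.2
print(jc69_distance(p_x))  # slightly larger than 0.2
```

Both variants end up at the same distance from the reference even though no changed site is shared between them, which is exactly the information a character-based method would still see.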
Question 4 True / False
Bayesian phylogenetics produces a single best-supported tree, just like maximum likelihood, and is distinguished primarily by being computationally more efficient.
T. True
F. False
Answer: False
Bayesian phylogenetics differs fundamentally from maximum likelihood in both output and computation. Rather than returning a single tree that maximizes the likelihood, Bayesian inference samples from the posterior distribution of trees using MCMC. The sampled trees are summarized as a consensus with a posterior probability at each node, which directly quantifies uncertainty. Maximum likelihood analyses typically report bootstrap support instead, a resampling-based measure that is not a posterior probability. Computationally, Bayesian methods are typically more demanding than ML, not more efficient.
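The summary step can be sketched as follows: the posterior probability of a clade is estimated as the fraction of sampled trees that contain it. This sketch assumes rooted trees written as nested tuples and a sample with burn-in already discarded; the four-tree sample is hypothetical and far smaller than a real MCMC run.

```python
from collections import Counter


def clades(tree):
    """Return every clade (frozenset of tip names) in a nested-tuple tree."""
    found = set()

    def tips(node):
        if isinstance(node, str):  # a tip is not counted as a clade
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)
        return members

    tips(tree)
    return found


def clade_support(sampled_trees):
    """Estimate clade posterior probabilities as sample frequencies."""
    counts = Counter()
    for tree in sampled_trees:
        counts.update(clades(tree))
    n_trees = len(sampled_trees)
    return {clade: count / n_trees for clade, count in counts.items()}


# Hypothetical posterior sample (assumption): three trees group A with B,
# one groups A with C.
sampled_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]

support = clade_support(sampled_trees)
print(support[frozenset({"A", "B"})])  # 0.75
print(support[frozenset({"A", "C"})])  # 0.25
```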
Question 5 Short Answer
Why do modern phylogenetic studies typically run multiple inference methods rather than selecting the 'best' one, and what do they look for in the results?
Model answer: Each method makes different assumptions and has characteristic failure modes: parsimony is vulnerable to long-branch attraction when rates vary among lineages; distance methods lose information by collapsing characters into pairwise numbers; ML and Bayesian methods can be misled by incorrect substitution models. No single method is universally optimal. By running multiple methods, researchers use convergence as a confidence signal: a node supported by parsimony, distance, and ML/Bayesian methods is considered robustly inferred, because approaches with different assumptions reach the same conclusion. Disagreements flag nodes where data are insufficient, rates are heterogeneous, or model assumptions may be violated, and so guide decisions about whether to gather more data or refine the analysis.
The multi-method approach applies triangulation: convergence of independent evidence justifies confidence; divergence signals where inference is fragile and further investigation is needed.
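As a rough sketch of this triangulation, the code below extracts the clades from one tree per method and separates those found by every method from those found by only some. The trees are hypothetical and are treated as rooted nested tuples for simplicity; real analyses usually compare unrooted trees with dedicated phylogenetics software.

```python
def clade_set(tree):
    """Collect the clades of a rooted nested-tuple tree, excluding the root."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        tips = frozenset().union(*map(walk, node))
        found.add(tips)
        return tips

    root = walk(tree)
    found.discard(root)  # the all-taxa clade is uninformative
    return found


def triangulate(trees_by_method):
    """Split clades into those every method recovers and those in conflict."""
    per_method = {m: clade_set(t) for m, t in trees_by_method.items()}
    all_clades = set().union(*per_method.values())
    agreed = set.intersection(*per_method.values())
    contested = {
        clade: sorted(m for m, found in per_method.items() if clade in found)
        for clade in all_clades - agreed
    }
    return agreed, contested


# Hypothetical result (assumption): parsimony and neighbor-joining agree,
# maximum likelihood places B differently.
trees_by_method = {
    "parsimony": ((("A", "B"), "C"), ("D", "E")),
    "neighbor_joining": ((("A", "B"), "C"), ("D", "E")),
    "maximum_likelihood": ((("A", "C"), "B"), ("D", "E")),
}

agreed, contested = triangulate(trees_by_method)
for clade in sorted(agreed, key=sorted):
    print("agreed:", sorted(clade))
for clade, methods in sorted(contested.items(), key=lambda kv: sorted(kv[0])):
    print("contested:", sorted(clade), "found by", methods)
```

Here the clades A+B+C and D+E are recovered by all three methods, while A+B versus A+C is flagged as contested. The output marks where to investigate; it does not say which grouping is correct.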