What does a high Fst value at a particular genomic locus indicate?
A. The locus has a high mutation rate
B. Allele frequencies at that locus differ substantially between the compared populations
C. The locus is essential for survival
D. The locus is located in a repetitive region
Answer: B
Fst (the fixation index) measures the proportion of genetic variance attributable to differences between populations rather than within them. An Fst of 0 means allele frequencies are identical across populations; an Fst of 1 means the populations are fixed for different alleles. A high Fst at a specific locus, relative to the genome-wide background, suggests that divergent natural selection may have driven the allele frequency differences at that locus, although genetic drift after a bottleneck can also produce outlier Fst values.
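As a rough illustration of the definition above, the sketch below computes Fst at a single biallelic locus from expected heterozygosities, Fst = (H_T - H_S) / H_T. It assumes two populations weighted equally and known allele frequencies; the function name `fst_two_pops` and the example frequencies are hypothetical, and real analyses typically use sample-size-corrected estimators rather than this simplified form.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Simplified Fst for one biallelic locus in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity if both populations were pooled into one.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # locus is monomorphic overall, so there is no variance to partition
    # Mean expected heterozygosity within the two populations.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies, chosen only to span the range of outcomes.
for p1, p2 in [(0.5, 0.5), (0.4, 0.6), (0.2, 0.8), (0.0, 1.0)]:
    print(f"p1={p1:.1f}  p2={p2:.1f}  Fst={fst_two_pops(p1, p2):.3f}")
```

Identical frequencies give 0 and fixation for different alleles gives 1, matching the two endpoints described in the explanation.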
Question 2 True / False
PCA of genome-wide SNP data separates individuals into clusters that correspond to biologically distinct human races.
T. True
F. False
Answer: False
PCA of human genetic data does show clustering that correlates with geographic ancestry, but this reflects continuous patterns of genetic variation shaped by migration, drift, and isolation by distance — not discrete biological categories. Most human genetic variation exists within populations rather than between them (Fst between continental groups is only ~0.10-0.15). The clusters in a PCA plot shift depending on which populations are sampled, and intermediate populations fill in the gaps between clusters. Human genetic variation is clinal, not categorical.
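A toy calculation can make "clinal, not categorical" concrete. The sketch below is not a PCA; it assumes a single locus whose allele frequency rises linearly across eleven hypothetical populations spaced along a transect, and uses a simplified two-population Fst with equal weights. Every number in it is invented for illustration.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Simplified Fst for one biallelic locus in two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def cline_freq(x: float) -> float:
    """Hypothetical linear cline: frequency goes from 0.2 to 0.8 as x goes from 0 to 1."""
    return 0.2 + 0.6 * x


positions = [i / 10 for i in range(11)]        # eleven populations along the transect
freqs = [cline_freq(x) for x in positions]

# Fst between the first population and each population further along the cline.
fst_from_first = [fst_two_pops(freqs[0], p) for p in freqs]
# Fst between each pair of geographic neighbours.
adjacent = [fst_two_pops(a, b) for a, b in zip(freqs, freqs[1:])]

print("endpoints only:     ", round(fst_from_first[-1], 3))
print("largest neighbour Fst:", round(max(adjacent), 4))
```

Sampling only the two endpoints makes them look like well-separated groups, while every neighbouring pair is nearly indistinguishable and differentiation grows smoothly with distance. That is the same reason PCA clusters depend on which populations happen to be sampled.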
Question 3 Short Answer
Explain how genome-wide data enables detection of recent positive selection that single-locus studies would miss.
Model answer: Recent positive selection leaves a characteristic genomic signature: a long haplotype of reduced variation surrounding the selected allele (selective sweep), because the favored allele rose in frequency too quickly for recombination to break down its surrounding haplotype. Detecting this requires comparing haplotype lengths across many loci genome-wide to establish the background expectation, then identifying loci with unusually long haplotypes (using tests like iHS or XP-EHH). A single-locus study cannot establish this genome-wide baseline and therefore cannot distinguish a selected locus from a neutral one that happens to have low diversity.
This illustrates the power of the genome-wide approach: the genome itself provides the null distribution. Methods like iHS compare how quickly extended haplotype homozygosity decays around the derived allele versus the ancestral allele at each SNP, then standardize that score against other SNPs genome-wide. This flags loci where the derived allele sits on an unusually long unbroken haplotype, the hallmark of a recent sweep.
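The haplotype-based logic can be sketched in a few lines. The code below is a simplified illustration, not a reference implementation of iHS: it computes extended haplotype homozygosity (EHH) around a core SNP, integrates it over distance for the ancestral and derived alleles, takes the log ratio, and standardizes that against a background of scores. The haplotypes, positions, and background scores are all hypothetical, and the usual refinements (truncating the integral once EHH falls below a cutoff, standardizing within derived-allele-frequency bins) are omitted.

```python
from collections import Counter
from math import comb, log
from statistics import mean, stdev


def ehh(haps, core, end):
    """Chance that two haplotypes from `haps` are identical from site `core` to site `end`."""
    if len(haps) < 2:
        return 0.0
    lo, hi = sorted((core, end))
    counts = Counter(h[lo:hi + 1] for h in haps)
    return sum(comb(n, 2) for n in counts.values()) / comb(len(haps), 2)


def ihh(haps, core, positions):
    """Integrate EHH over physical distance on both sides of the core (trapezoid rule)."""
    if len(haps) < 2:
        return 0.0
    total = 0.0
    for step in (-1, 1):
        i, prev = core, 1.0  # EHH at the core site itself is 1
        while 0 <= i + step < len(positions):
            j = i + step
            cur = ehh(haps, core, j)
            total += 0.5 * (prev + cur) * abs(positions[j] - positions[i])
            i, prev = j, cur
    return total


def unstandardized_ihs(haplotypes, core, positions):
    """ln(iHH_ancestral / iHH_derived); '0' = ancestral, '1' = derived at the core."""
    ancestral = [h for h in haplotypes if h[core] == "0"]
    derived = [h for h in haplotypes if h[core] == "1"]
    if len(ancestral) < 2 or len(derived) < 2:
        raise ValueError("need at least two carriers of each core allele")
    return log(ihh(ancestral, core, positions) / ihh(derived, core, positions))


def standardize(score, background_scores):
    """Z-score of one locus against the genome-wide background of scores."""
    return (score - mean(background_scores)) / stdev(background_scores)


# Hypothetical data: 5 sites, core SNP at index 2, positions in kb.
positions = [0, 10, 20, 30, 40]
haplotypes = ["00100"] * 4 + ["00000", "01010", "10001", "11011"]
raw = unstandardized_ihs(haplotypes, core=2, positions=positions)

# Invented scores standing in for other SNPs across the genome.
background = [-0.3, 0.1, 0.4, -0.1, 0.2, -0.2, 0.0, 0.3]
z = standardize(raw, background)
print(f"unstandardized iHS = {raw:.3f}, standardized = {z:.2f}")
```

In this toy data the four derived haplotypes are identical across the whole window while the ancestral ones diverge quickly, so the log ratio is negative. The standardization step is where the genome-wide baseline enters: without the background scores there is nothing to judge the raw value against.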