Questions: Single Nucleotide Polymorphisms and Genetic Variation
5 questions to test your understanding
Question 1 Multiple Choice
A genome-wide association study identifies a tag SNP that occurs significantly more often in people with Type 2 diabetes than in healthy controls. What is the correct interpretation?
A. The tag SNP itself directly causes Type 2 diabetes by altering a protein involved in glucose metabolism
B. Something in the chromosomal region surrounding the tag SNP likely contributes to Type 2 diabetes risk; the tag SNP is a marker, not necessarily the causal variant
C. People with the tag SNP will definitely develop Type 2 diabetes because the SNP is disease-causing
D. The association is spurious: SNPs are neutral by definition and cannot be associated with disease
Answer: B
A tag SNP is a marker in linkage disequilibrium (LD) with a block of nearby variants. It flags a chromosomal neighborhood, not a specific causal variant. If the tag SNP is associated with Type 2 diabetes, the functional variant that affects disease risk is likely somewhere in the same LD block: it could be the tag SNP itself, a nearby coding SNP, a regulatory variant, or an insertion/deletion. GWAS identifies genomic regions worth investigating, not confirmed causal variants. Tag SNPs are signposts pointing to regions, not destinations with known functional effects.
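As a rough illustration of what "occurs significantly more often" can mean in practice, the sketch below runs a basic allelic association test on a 2x2 table of allele counts. The counts are invented for illustration, the function name `allelic_association` is hypothetical, and a real GWAS would also correct for testing many SNPs and for confounders such as ancestry.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds ratio and 1-degree-of-freedom chi-square test on allele counts.

    Each argument is a count of chromosomes carrying the alternate or
    reference allele among cases or controls.
    """
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Pearson chi-square statistic for a 2x2 table (no continuity correction).
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Invented counts: 1,000 cases and 1,000 controls, two chromosomes each.
odds_ratio, chi2, p_value = allelic_association(600, 1400, 500, 1500)
print(f"OR={odds_ratio:.2f}  chi2={chi2:.2f}  p={p_value:.2g}")
```

A small p-value here says only that the allele is more common among cases. It does not say whether this SNP or a correlated neighbor in the same LD block is the causal variant.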
Question 2 Multiple Choice
Why are most SNPs in the human genome neutral — having no detectable effect on fitness or phenotype?
A. Because SNPs are always located in repetitive DNA regions that have no function
B. Because the mutation rate is too low for SNPs to affect protein-coding sequences
C. Because most SNPs fall in intergenic regions or in synonymous codon positions, where base changes do not alter amino acid sequences or gene regulation
D. Because natural selection has eliminated all SNPs that could affect phenotype
Answer: C
The human genome is mostly non-coding: roughly 98% of the sequence lies outside protein-coding exons. SNPs that fall in intergenic regions have no direct effect on protein sequence. Among SNPs within coding sequence, many land in the third position of a codon, where the redundancy (degeneracy) of the genetic code means that many base changes still encode the same amino acid. These synonymous, or silent, SNPs change the DNA letter without changing the protein. Only a small minority of SNPs are nonsynonymous (amino acid-changing), and an even smaller fraction meaningfully affect function. Neutrality is the default expectation for a random base change in a large genome.
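The point about third codon positions can be checked directly against the standard genetic code. The sketch below is a minimal illustration: it builds the standard codon table, labels a single-base change as synonymous or nonsynonymous, and computes the share of single-base changes at each codon position that leave the amino acid unchanged. The helper names `classify_coding_snp` and `synonymous_fraction` are hypothetical, and the calculation weights every codon and every base change equally, which real genomes do not.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(codon, position, new_base):
    """Label a single-base change within a codon (position is 0, 1, or 2)."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    same = CODON_TABLE[codon] == CODON_TABLE[mutated]
    return "synonymous" if same else "nonsynonymous"


def synonymous_fraction(position):
    """Share of single-base changes at one codon position that are synonymous.

    Only codons that encode an amino acid (not stop codons) are used as
    starting points.
    """
    total = silent = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        for base in BASES:
            if base == codon[position]:
                continue
            total += 1
            if classify_coding_snp(codon, position, base) == "synonymous":
                silent += 1
    return silent / total


print(classify_coding_snp("GAA", 2, "G"))  # GAA and GAG both encode Glu
print(classify_coding_snp("GAG", 1, "T"))  # GAG (Glu) becomes GTG (Val)
for pos in range(3):
    print(pos + 1, round(synonymous_fraction(pos), 3))
```

Under these simplifying assumptions, most third-position changes are silent, while first- and second-position changes almost always alter the amino acid.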
Question 3 True / False
In genome-wide association studies, a tag SNP is useful because genotyping it surveys the genetic variation of the entire linkage disequilibrium block surrounding it, not just the tag SNP's own variant.
T. True
F. False
Answer: True
This is the core principle that makes GWAS practical. Because nearby positions on a chromosome tend to be inherited together (they are in linkage disequilibrium), a tag SNP that correlates with the other variants in its LD block serves as a proxy for all of them. Genotyping arrays that include well-chosen tag SNPs can therefore survey hundreds of thousands of common variants across the genome without needing to directly genotype every single SNP. If the tag SNP shows disease association, the entire LD block — potentially containing dozens of variants — is implicated and can be fine-mapped.
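One way to picture how a "well-chosen" set of tag SNPs might be picked is a greedy search over pairwise correlations. The sketch below is an assumption-laden toy, not the method used by any particular array or study: the SNP names and haplotypes are invented, the function names are hypothetical, and the r-squared cutoff of 0.8 is a commonly used but arbitrary choice.

```python
def r_squared(x, y):
    """Squared correlation between two 0/1 allele vectors over the same haplotypes."""
    n = len(x)
    p_x, p_y = sum(x) / n, sum(y) / n
    p_xy = sum(a * b for a, b in zip(x, y)) / n
    d = p_xy - p_x * p_y  # the LD coefficient D
    denom = p_x * (1 - p_x) * p_y * (1 - p_y)
    return d * d / denom if denom > 0 else 0.0


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Pick tag SNPs greedily; returns {tag: [SNPs it stands in for]}.

    haplotypes maps a SNP name to a list of 0/1 alleles, one per haplotype.
    At each step the SNP that covers the most still-untagged SNPs is chosen.
    """
    untagged = set(haplotypes)
    tags = {}
    while untagged:
        best, best_covered = None, set()
        for cand in sorted(untagged):
            covered = {
                s for s in untagged
                if s == cand
                or r_squared(haplotypes[cand], haplotypes[s]) >= threshold
            }
            if len(covered) > len(best_covered):
                best, best_covered = cand, covered
        tags[best] = sorted(best_covered)
        untagged -= best_covered
    return tags


# Invented haplotypes: snp_a and snp_b always travel together.
toy = {
    "snp_a": [0, 0, 0, 0, 1, 1, 1, 1],
    "snp_b": [0, 0, 0, 0, 1, 1, 1, 1],
    "snp_c": [0, 0, 0, 1, 1, 1, 1, 1],
    "snp_d": [0, 1, 0, 1, 0, 1, 0, 1],
}
tags = greedy_tag_snps(toy)
print(tags)
```

In this toy data, genotyping snp_a alone also captures snp_b, which is the sense in which one tag SNP surveys more than its own variant.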
Question 4 True / False
If a SNP is identified in a GWAS as strongly associated with a disease, it is expected to be a nonsynonymous coding SNP that alters protein function.
T. True
F. False
Answer: False
GWAS associations do not require the identified SNP (or even the causal variant in the same LD block) to be coding. Many GWAS hits map to intronic or intergenic regions, where the functional variant may affect gene regulation — a promoter element, an enhancer, a splice site — rather than the amino acid sequence of a protein. The tag SNP itself is often not in a coding region; it simply flags a chromosomal neighborhood. Fine-mapping and functional follow-up are required to identify the actual causal variant and its mechanism. Assuming GWAS hits must be nonsynonymous coding variants is one of the most common misinterpretations of the field.
Question 5 Short Answer
What is linkage disequilibrium, and why does it make SNPs useful as genetic markers even when the SNPs themselves are not the functional variants of interest?
Model answer: Linkage disequilibrium (LD) is the tendency for alleles at nearby positions on a chromosome to be inherited together — they are correlated because recombination rarely separates adjacent positions in a few generations. This creates blocks of variants that travel through populations as units. A tag SNP that is in high LD with a functional variant will show up more often in people who carry the functional variant, making the tag SNP a reliable proxy for the functional variant. GWAS exploits this: by genotyping a well-chosen set of tag SNPs, researchers can survey millions of common variants across the genome at lower cost, because each tag SNP represents an entire LD block of variants, not just itself.
The practical consequence is that SNPs are most valuable not for what they individually do, but for what they point to. A SNP with no functional consequence can still be an excellent marker for a nearby variant that does have functional consequence, if the two are in high LD. This is why the key conceptual shift is from thinking of SNPs as individual mutations with individual effects to thinking of them as a dense coordinate system for mapping the genome. Their value is geographic: they mark locations and allow researchers to narrow down the search space for causal variants.
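The statement in the model answer that recombination rarely separates adjacent positions can be made concrete with the textbook decay relation D_t = D_0 * (1 - c)^t, where D is the LD coefficient, c is the recombination fraction between the two sites per generation, and t is the number of generations. The sketch below is a minimal illustration under idealized assumptions (random mating, no selection, no drift). The function name `ld_after` is hypothetical, and c = 1e-5 is an assumed value for two sites roughly 1 kb apart, based on a rough average of 1 cM per Mb.

```python
def ld_after(d0, c, generations):
    """Expected LD coefficient D after some generations of recombination.

    d0: starting value of D
    c: recombination fraction between the two sites per generation
    """
    return d0 * (1 - c) ** generations


d0 = 0.25  # the largest possible D, reached when both allele frequencies are 0.5

# Assumed: two sites about 1 kb apart, c = 1e-5 per generation.
close_sites = ld_after(d0, 1e-5, 100)

# Unlinked sites (c = 0.5): D halves every generation.
unlinked_sites = ld_after(d0, 0.5, 10)

print(close_sites, unlinked_sites)
```

With these assumed numbers, closely spaced sites keep almost all of their LD after 100 generations, while unlinked sites lose nearly all of it within 10. This is why a tag SNP stays a good proxy only for variants close to it.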