What is the key difference between the Needleman-Wunsch and Smith-Waterman algorithms?
A. Needleman-Wunsch uses a scoring matrix while Smith-Waterman does not
B. Needleman-Wunsch performs global alignment while Smith-Waterman performs local alignment
C. Smith-Waterman requires protein sequences while Needleman-Wunsch works only on DNA
D. Needleman-Wunsch is heuristic while Smith-Waterman is exact
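Answer: B
Both algorithms use dynamic programming with scoring matrices and gap penalties, and both are exact rather than heuristic. The fundamental difference is scope: Needleman-Wunsch aligns sequences from end to end (global), while Smith-Waterman allows the alignment to start and end anywhere in either sequence (local). Smith-Waterman achieves this by flooring every cell of the dynamic-programming matrix at zero: whenever the running score would go negative it is reset to zero, so a poorly matching stretch cannot drag down what follows. The traceback then starts at the highest-scoring cell and stops at the first zero, which yields the best-matching pair of subsequences.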
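The difference between the two recurrences can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `alignment_score`, the match/mismatch/gap values, and the use of a linear gap penalty are all assumptions chosen to keep the example short. It returns only the score, not the alignment itself.

```python
def alignment_score(a, b, match=2, mismatch=-1, gap=-2, local=False):
    """Best alignment score with a linear gap penalty (illustrative values).

    local=False follows the Needleman-Wunsch (global) recurrence,
    local=True follows the Smith-Waterman (local) recurrence.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]

    if not local:
        # Global alignment pays for leading gaps along both borders.
        for i in range(1, rows):
            H[i][0] = i * gap
        for j in range(1, cols):
            H[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if local:
                # The zero floor: a negative running score is reset to zero.
                score = max(score, 0)
            H[i][j] = score
            best = max(best, score)

    # Global: score of the bottom-right cell. Local: highest cell anywhere.
    return best if local else H[-1][-1]
```

With these assumed scores, `alignment_score("GGGG", "CCCC")` returns -4 (four forced mismatches), while `alignment_score("GGGG", "CCCC", local=True)` returns 0, because the zero floor lets the local version report that no subsequence aligns well.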
Question 2 True / False
In pairwise sequence alignment, affine gap penalties use a single fixed cost per gap regardless of gap length.
T. True
F. False
Answer: False
Affine gap penalties distinguish between gap opening (a larger penalty for initiating a new gap) and gap extension (a smaller penalty for each additional position in an existing gap). This reflects the biological observation that a single insertion/deletion event often involves multiple contiguous nucleotides or amino acids, so extending an existing gap is more likely than opening a new one. A flat per-position penalty charges one long gap exactly as much as the same number of scattered single-position gaps, so it over-penalizes long gaps and gives the aligner no reason to prefer contiguous ones, which tends to produce biologically unrealistic alignments.
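A small cost comparison makes the effect concrete. This is a sketch under stated assumptions: the function names and the penalty values (open 10, extend 1, linear 2) are illustrative, and it assumes the convention in which the opening penalty already covers the first gap position. Some tools instead charge opening plus extension for the first position.

```python
def linear_gap_cost(length, per_position=2):
    """Flat scheme: every gap position costs the same."""
    return per_position * length


def affine_gap_cost(length, gap_open=10, gap_extend=1):
    """Affine scheme: pay gap_open once, then gap_extend per extra position.

    Assumes gap_open covers the first gapped position (conventions vary).
    """
    if length <= 0:
        return 0
    return gap_open + gap_extend * (length - 1)


# One 5-position gap versus five separate 1-position gaps.
one_long_affine = affine_gap_cost(5)
five_short_affine = 5 * affine_gap_cost(1)
one_long_linear = linear_gap_cost(5)
five_short_linear = 5 * linear_gap_cost(1)
```

Under the affine scheme the single long gap costs 14 against 50 for five scattered gaps, whereas the linear scheme charges 10 in both cases and so cannot tell the two scenarios apart.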
Question 3 Short Answer
Why are substitution scoring matrices like BLOSUM62 used for protein alignment instead of a simple match/mismatch scheme?
Model answer: Different amino acid substitutions have different likelihoods of occurring during evolution. Some substitutions (e.g., leucine to isoleucine) preserve biochemical properties and are common, while others (e.g., glycine to tryptophan) are rare because they disrupt protein function. BLOSUM62 encodes these empirically observed substitution frequencies as log-odds scores, so a chemically conservative substitution scores higher than a radical one. A simple match/mismatch scheme ignores this biochemical context and treats all mismatches as equally bad.
BLOSUM matrices are derived from ungapped blocks of aligned protein sequences. Within each block, sequences sharing at least the stated percent identity (62% for BLOSUM62) are clustered and weighted as a single sequence, so that closely related sequences do not dominate the counts. Each matrix entry is the scaled, rounded log-odds ratio of observing a given substitution in related sequences versus by chance. This makes the alignment biologically informed rather than purely combinatorial.
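The log-odds idea can be sketched as a one-line calculation. The function name `log_odds_score` and the frequencies in the usage note are hypothetical, chosen only to show the sign and magnitude of the result; they are not values from the BLOSUM62 source data. The default scale of 2 with a base-2 logarithm gives scores in half-bit units, the unit used by BLOSUM62.

```python
import math


def log_odds_score(observed_freq, expected_freq, scale=2.0):
    """Rounded log-odds score for one residue pair.

    observed_freq: how often the pair appears in aligned, related sequences.
    expected_freq: how often it would appear by chance, given the background
                   frequencies of the two residues.
    scale=2.0 with log base 2 yields half-bit units.
    """
    return round(scale * math.log2(observed_freq / expected_freq))
```

For example, a pair observed four times more often than chance (`log_odds_score(0.04, 0.01)`) scores 4, a pair observed at one quarter of the chance rate (`log_odds_score(0.0025, 0.01)`) scores -4, and a pair observed exactly as often as chance scores 0.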
Question 4 Short Answer
You align a 300-residue protein against a 250-residue protein using Smith-Waterman and get a local alignment covering only 80 residues. What does this suggest about the relationship between the two proteins?
Model answer: The proteins likely share a conserved domain or functional motif spanning roughly 80 residues, but differ substantially outside that region. They may be multi-domain proteins that share one domain but not others, or one protein may contain a domain that the other has incorporated into a different overall architecture. The local alignment identified the region of genuine homology while ignoring the non-homologous flanking regions that would have degraded a global alignment score.
This is precisely why local alignment exists: many biologically meaningful relationships involve partial sequence similarity rather than full-length conservation. A global alignment would force the remaining ~200 residues into a poor alignment with many gaps, diluting the signal from the truly conserved region.
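The numbers in this question can be turned into coverage fractions as a quick check. This sketch assumes, for simplicity, that the alignment spans 80 residues of each protein; in a gapped alignment the two spans can differ slightly. The helper name `coverage` is hypothetical.

```python
def coverage(aligned_residues, sequence_length):
    """Fraction of a sequence that falls inside the aligned region."""
    return aligned_residues / sequence_length


# Values taken from the question: an 80-residue local alignment
# between a 300-residue and a 250-residue protein.
cov_long = coverage(80, 300)
cov_short = coverage(80, 250)
unaligned_long = 300 - 80
unaligned_short = 250 - 80
```

Here roughly 27% of the longer protein and 32% of the shorter one are covered, leaving 220 and 170 residues respectively outside the aligned region.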