Below approximately what sequence identity does homology modeling become unreliable, and why?
A. Below 90%: even small sequence differences make modeling impossible
B. Below approximately 25-30%: at this level, sequences may not be reliably alignable, and structural divergence (especially in loops and at the protein surface) becomes large enough that the template no longer accurately represents the target's structure
C. Below 50%: this is a hard cutoff for all proteins
D. Sequence identity is irrelevant for homology modeling
Answer: B
The 25-30% identity threshold (the 'twilight zone') reflects two compounding problems. First, sequence alignments become unreliable: insertions, deletions, and substitutions make it uncertain which residues correspond, and the resulting alignment errors propagate into structural errors. Second, even if the alignment is correct, proteins at this level of divergence have typically diverged in loop conformations, surface features, and sometimes core packing. The backbone RMSD between proteins with 30% identity is typically 1.5-2.0 Å, with much larger deviations in loops. Below 20% identity (the 'midnight zone'), even fold recognition becomes uncertain, and homology modeling is essentially guesswork without additional information.
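To make the thresholds concrete, here is a minimal sketch that computes percent identity from an already-aligned pair of sequences and labels the result. The function names are hypothetical, the choice of denominator (columns where both sequences have a residue) is only one of several conventions in use, and the 20% and 30% cutoffs are illustrative readings of the ranges above rather than hard limits.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both sequences have a residue.

    Assumes the two strings come from the same pairwise alignment,
    with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns are excluded from the denominator
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def modeling_zone(identity: float) -> str:
    """Label an identity value using illustrative cutoffs (assumptions)."""
    if identity < 20.0:
        return "midnight zone"
    if identity < 30.0:
        return "twilight zone"
    return "above twilight zone"


# Six gap-free columns, five of them identical.
identity = percent_identity("ACDE-FG", "ACDQWFG")
print(round(identity, 1), modeling_zone(identity))  # 83.3 above twilight zone
```

In practice the reliable cutoff also depends on alignment length, so a short alignment at 30% identity deserves more suspicion than a long one.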
Question 2 True / False
The main source of error in homology models comes from inaccurate side chain placement.
T. True
F. False
Answer: False
While side chain placement contributes to error (especially for non-conserved residues), the main sources of error in homology models are loop modeling and alignment errors. Loops, the regions connecting secondary structure elements, diverge rapidly during evolution and often differ from the template in both length and conformation, so they cannot be copied accurately from it. Alignment errors (incorrectly matching target residues to template residues) produce systematic shifts in the model that can affect large portions of the structure. Core backbone regions and conserved secondary structure elements are usually modeled well; it is the variable loops and alignment-ambiguous regions that limit model accuracy.
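The size of the systematic shift caused by an alignment error can be illustrated with a toy calculation. The sketch below builds C-alpha positions for an idealized alpha helix (2.3 Å radius, 1.5 Å rise, and 100° turn per residue, used here as textbook approximations) and compares a correctly aligned model with one whose alignment is off by a single residue. All function names are hypothetical, and the RMSD is computed in the fixed template frame without re-superposition, on the assumption that the rest of the model holds the helix in place.

```python
import math


def ideal_helix_ca(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """C-alpha coordinates (in Angstroms) for an idealized alpha helix."""
    coords = []
    for i in range(n):
        angle = math.radians(turn_deg * i)
        coords.append((radius * math.cos(angle),
                       radius * math.sin(angle),
                       rise * i))
    return coords


def rmsd_fixed_frame(a, b):
    """RMSD between two equal-length coordinate lists, no superposition."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(a, b):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(a))


template = ideal_helix_ca(13)
correct_model = template[:12]   # target residue i copied from template residue i
shifted_model = template[1:13]  # one-residue alignment error: i copied from i + 1

shift_rmsd = rmsd_fixed_frame(correct_model, shifted_model)
print(round(shift_rmsd, 2))
```

Under these assumptions every residue in the misaligned stretch is displaced by roughly the distance between consecutive C-alpha atoms, about 3.8 Å, which is well above the 1.5-2.0 Å backbone RMSD expected from genuine structural divergence at 30% identity.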
Question 3 True / False
AlphaFold has made homology modeling obsolete. Understanding homology modeling principles is no longer important.
T. True
F. False
Answer: False
AlphaFold has dramatically improved structure prediction accuracy, but understanding homology modeling remains important for several reasons. (1) Interpreting confidence scores: AlphaFold's pLDDT score and predicted aligned error are best understood through the lens of homology modeling difficulties (loops and domains with no homologs have low confidence for the same reasons they are hard to model by homology). (2) Understanding model limitations: homology modeling principles explain why certain regions of any predicted structure are unreliable. (3) Multi-template modeling and protein engineering: designing mutations, insertions, or chimeric proteins requires understanding the structure-sequence relationships that homology modeling formalizes. (4) Validation: assessing whether a predicted structure is reasonable uses the same tools and principles developed for homology models.
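As a small illustration of point (1), the sketch below reads per-residue pLDDT values from the text of an AlphaFold model in PDB format, where pLDDT is stored in the B-factor column, and flags low-confidence residues. The function names are hypothetical, the column slices follow the fixed-width ATOM record layout, and the band boundaries (90, 70, 50) are the ones commonly used to color AlphaFold models; how exact boundary values are binned is an assumption.

```python
def ca_plddt_from_pdb(pdb_text):
    """Map (chain, residue number) to the pLDDT of that residue's C-alpha atom.

    Assumes an AlphaFold-style PDB file in which the B-factor field
    (columns 61-66) holds pLDDT.
    """
    scores = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 66:
            continue
        if line[12:16].strip() != "CA":
            continue  # keep one value per residue
        chain = line[21]
        res_seq = int(line[22:26])
        scores[(chain, res_seq)] = float(line[60:66])
    return scores


def plddt_band(score):
    """Confidence band for a pLDDT value (boundary handling is an assumption)."""
    if score > 90.0:
        return "very high"
    if score > 70.0:
        return "confident"
    if score > 50.0:
        return "low"
    return "very low"


def low_confidence_residues(scores, cutoff=70.0):
    """Sorted residue keys whose pLDDT falls below the cutoff."""
    return sorted(key for key, value in scores.items() if value < cutoff)
```

Calling `low_confidence_residues(ca_plddt_from_pdb(text))` on a model's file contents gives a list of residues to inspect, and these often coincide with the loops and homolog-poor regions that are also hard to build by homology.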