Transcription factors are regulatory proteins with DNA-binding domains (e.g., zinc fingers, helix-turn-helix, leucine zippers) and effector domains that activate or repress transcription. DNA-binding specificity depends largely on contacts between amino acid side chains and bases in the major groove, allowing recognition of short consensus sequences.
Study the structure of different DNA-binding domains and how amino acids contact DNA bases. Understand how mutations in DNA-binding domains alter specificity or binding affinity. Relate structure to function in a model transcription factor.
You already know that eukaryotic transcription begins when the general transcription machinery, including TFIID and its TATA-binding protein, assembles at the promoter to position RNA polymerase II. But general transcription factors alone produce only a low basal level of transcription. The real control comes from regulatory transcription factors: proteins that bind to specific DNA sequences at enhancers, silencers, and proximal promoter elements, and either boost or suppress transcription, in the case of enhancers and silencers often from thousands of base pairs away. These regulatory factors are what make a liver cell express albumin while a neuron expresses synapsin, even though both cells carry the same DNA.
A typical transcription factor has at least two functional regions: a DNA-binding domain that recognizes a specific short DNA sequence, and an activation or repression domain that communicates with the transcriptional machinery or chromatin-modifying complexes. The DNA-binding domain is where structural biology meets gene regulation. Several major structural motifs have evolved independently to solve the problem of reading DNA sequence. Zinc finger domains use zinc ions to stabilize small, compact folds, each of which contacts about three base pairs in the major groove, and multiple fingers can be strung together to read longer sequences. Helix-turn-helix motifs insert one alpha helix, the recognition helix, into the major groove, where amino acid side chains make hydrogen bonds and van der Waals contacts with the exposed edges of base pairs. Leucine zipper and helix-loop-helix domains work as dimers: two protein chains interlock via hydrophobic residues (leucines or other hydrophobic amino acids) and then splay apart into a fork whose basic regions grip the DNA.
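The modular way zinc fingers read DNA can be sketched as simple arithmetic. The Python snippet below is a minimal illustration, assuming the figure of about three base pairs per finger given above and ignoring any overlap between neighboring fingers. The function names and the 9 bp example sequence are chosen here for illustration only.

```python
import math

BP_PER_FINGER = 3  # approximate figure from the text; real fingers vary


def fingers_needed(site_length_bp, bp_per_finger=BP_PER_FINGER):
    """Smallest number of tandem fingers whose combined readout covers the site."""
    if site_length_bp <= 0:
        raise ValueError("site length must be positive")
    return math.ceil(site_length_bp / bp_per_finger)


def split_into_subsites(site, bp_per_finger=BP_PER_FINGER):
    """Cut a site into consecutive chunks, one per finger (the last may be shorter)."""
    return [site[i:i + bp_per_finger] for i in range(0, len(site), bp_per_finger)]


example_site = "GCGTGGGCG"  # illustrative 9 bp sequence
print(fingers_needed(len(example_site)))
print(split_into_subsites(example_site))
```

Under these simplifying assumptions, a 9 bp site is covered by three fingers, each responsible for one 3 bp subsite. Adding fingers extends the sequence that can be read.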
The specificity of DNA binding depends on the precise fit between amino acid side chains and the pattern of hydrogen bond donors and acceptors presented by base pairs in the major groove. Each base pair (A-T, T-A, G-C, C-G) displays a unique chemical signature in the major groove, and the recognition helix or zinc finger loop is shaped to complement a particular short sequence — typically 4 to 8 base pairs for a single domain. However, most individual binding sites are too short to be unique in a large genome. Transcription factors achieve target selectivity through combinatorial strategies: they bind as dimers or higher-order complexes, they cooperate with other factors at composite elements, and the chromatin accessibility of potential binding sites restricts which sequences are available in any given cell type.
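A back-of-the-envelope calculation shows why short sites cannot be unique. The Python sketch below assumes a random genome with equal base frequencies, an approximate haploid human genome size of 3.1 billion base pairs, and exact matching to a single sequence. All three are simplifying assumptions, and the function name is chosen here for illustration.

```python
HUMAN_GENOME_BP = 3_100_000_000  # approximate haploid size (assumption)


def expected_random_sites(site_length_bp, genome_size_bp, strands=2):
    """Expected number of chance matches to one exact site.

    Assumes each base occurs with probability 1/4, independently.
    With strands=2 both orientations are counted, which double-counts
    palindromic sites.
    """
    per_position_probability = 0.25 ** site_length_bp
    return genome_size_bp * strands * per_position_probability


for length in (4, 6, 8, 12, 16):
    count = expected_random_sites(length, HUMAN_GENOME_BP)
    print(f"{length:2d} bp site: about {count:,.0f} chance matches")
```

Under these assumptions a 6 bp site is expected more than a million times by chance, while a site of about 16 bp is expected only on the order of once. This is why dimerization, which roughly doubles the sequence read, and cooperation with neighboring factors contribute so much to target selectivity.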
A critical point is that the same transcription factor can activate one gene and repress another, depending on its binding partners and the regulatory context. For example, a factor that recruits a histone acetyltransferase at one promoter might recruit a histone deacetylase at another because different cofactors are present. This context-dependence is what allows a relatively small number of transcription factors (roughly 1,500 encoded in the human genome) to generate the vast complexity of cell-type-specific gene expression programs.
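The scale of this combinatorial potential can be illustrated with a short count. The Python sketch below takes the rough figure of 1,500 factors from the text and counts unordered groups of one, two, or three distinct factors. It is an upper bound for illustration only, since it assumes any factors could act together, which is not true biologically. The function name is chosen here for illustration.

```python
import math

N_FACTORS = 1500  # rough figure quoted in the text


def unordered_combinations(n_factors, group_size):
    """Number of distinct groups of the given size, ignoring order."""
    return math.comb(n_factors, group_size)


for group_size in (1, 2, 3):
    total = unordered_combinations(N_FACTORS, group_size)
    print(f"groups of {group_size}: {total:,}")
```

Even pairs alone give more than a million possible combinations, and triples give hundreds of millions, far more than the number of factors themselves.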