Primary structure is the linear sequence of amino acids in a polypeptide chain. It is specified by the nucleotide sequence of the gene, read through the genetic code, and assembled by the ribosome from an mRNA template. The primary structure uniquely identifies a protein and, through the folding information carried by its amino acid side chains, largely determines all higher levels of protein organization. Small changes at this level, such as a missense mutation that substitutes one residue or a post-translational modification that chemically alters one, can dramatically alter protein function.
Study the genetic code and practice translating mRNA sequences into amino acid sequences. Compare wild-type and mutant proteins (e.g., hemoglobin vs sickle-cell hemoglobin) to see how single amino acid changes propagate through higher structures.
You already know that amino acids are joined by peptide bonds, the covalent amide linkages formed between the carboxyl group of one amino acid and the amino group of the next. A protein's primary structure is simply the complete, ordered sequence of amino acids in the polypeptide chain, read from the amino terminus (N-terminus) to the carboxyl terminus (C-terminus). This sequence is not random; it is dictated by the nucleotide sequence of the gene that encodes the protein, translated codon by codon on the ribosome. Apart from rare translation errors, every copy of a given protein translated from the same mRNA has the same primary structure.
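To make "translated codon by codon" concrete, here is a minimal Python sketch of translation using the standard genetic code. The function name `translate` and the example mRNA are illustrative choices, not part of any library, and the sketch assumes translation begins at the first AUG and ends at the first in-frame stop codon, ignoring everything else a real ribosome takes into account.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
# "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon.

    Returns one-letter amino acid codes, N-terminus first.
    Returns an empty string if no AUG is present.
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input too
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example message: AUG GCU AAA GGU UAA
print(translate("GGAUGGCUAAAGGUUAAGC"))  # MAKG
```

Reading the output left to right gives the chain from N-terminus to C-terminus: Met-Ala-Lys-Gly. Any character other than A, C, G, U (or T) raises a `KeyError` in this sketch.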
Why does the sequence matter so much? Because the identity and order of amino acid side chains determine everything that happens next. Each of the 20 common amino acids has a distinct side chain — some hydrophobic, some charged, some polar, some bulky, some small. As the polypeptide chain emerges from the ribosome, these side chains begin interacting with each other and with the surrounding water. Hydrophobic side chains are driven inward away from water, charged residues form salt bridges, hydrogen bonds form between polar groups, and the chain folds into the specific three-dimensional shape that gives the protein its function. Change even one amino acid, and you change the local chemistry at that position — potentially disrupting a critical interaction.
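One rough way to put numbers on "some hydrophobic, some charged, some polar" is a hydropathy scale. The sketch below uses the Kyte-Doolittle scale, in which positive values indicate more hydrophobic side chains; the choice of this particular scale and the helper names are assumptions made for illustration, and other scales rank some residues differently.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
# Positive = more hydrophobic, negative = more hydrophilic.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def hydropathy_profile(sequence: str) -> list[float]:
    """Return the hydropathy value of each residue, in sequence order."""
    return [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]


def hydropathy_change(old: str, new: str) -> float:
    """Change in hydropathy when residue `old` is replaced by `new`."""
    return KYTE_DOOLITTLE[new] - KYTE_DOOLITTLE[old]


# Swapping a charged residue (E, glutamate) for a hydrophobic one (V, valine)
# moves that position from strongly hydrophilic to strongly hydrophobic.
print(hydropathy_profile("EV"))
print(hydropathy_change("E", "V") > 0)  # True
```

A single number per residue cannot capture charge, size, or hydrogen-bonding ability, so treat this as a first approximation of the local chemistry rather than a prediction of folding.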
The most famous example is sickle-cell hemoglobin. Normal adult hemoglobin (HbA) has a glutamic acid at position 6 of the β-globin chain. In sickle-cell hemoglobin (HbS), a single-nucleotide substitution in that codon (GAG to GTG in the gene) replaces the glutamic acid with valine, swapping a charged, hydrophilic residue for a hydrophobic one. This single change in primary structure creates a sticky hydrophobic patch on the protein surface that causes deoxygenated hemoglobin molecules to polymerize into rigid fibers under low-oxygen conditions, distorting red blood cells into the characteristic sickle shape. One amino acid out of the 146 in each β chain, and the entire behavior of the protein, and the health of the individual, is transformed.
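The sickle-cell substitution can be traced from nucleotide to amino acid in a few lines of Python. This is a sketch under stated assumptions: the eight codons are the commonly cited start of the mature β-globin coding sequence (initiator methionine already removed, which is why the affected residue is number 6), and the helper names `point_mutate` and `differences` are hypothetical.

```python
# First eight codons of mature beta-globin mRNA as commonly cited
# (assumption: initiator Met omitted, so numbering matches "position 6").
HBA_CODONS = ["GUG", "CAC", "CUG", "ACU", "CCU", "GAG", "GAG", "AAG"]

# Only the codons needed for this fragment, from the standard genetic code.
CODON_TO_AA = {
    "GUG": "Val", "CAC": "His", "CUG": "Leu", "ACU": "Thr",
    "CCU": "Pro", "GAG": "Glu", "AAG": "Lys",
}


def translate(codons):
    """Map each codon to its three-letter amino acid name."""
    return [CODON_TO_AA[codon] for codon in codons]


def point_mutate(codons, position, base_index, new_base):
    """Return a copy of `codons` with one base changed.

    `position` is the 1-based residue number; `base_index` is 0, 1, or 2.
    """
    mutated = list(codons)
    codon = mutated[position - 1]
    mutated[position - 1] = codon[:base_index] + new_base + codon[base_index + 1:]
    return mutated


def differences(seq_a, seq_b):
    """List (position, residue_a, residue_b) wherever two chains differ."""
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# Change the middle base of codon 6: GAG (Glu) becomes GUG (Val).
HBS_CODONS = point_mutate(HBA_CODONS, position=6, base_index=1, new_base="U")

hba = translate(HBA_CODONS)
hbs = translate(HBS_CODONS)
print("-".join(hba))          # Val-His-Leu-Thr-Pro-Glu-Glu-Lys
print("-".join(hbs))          # Val-His-Leu-Thr-Pro-Val-Glu-Lys
print(differences(hba, hbs))  # [(6, 'Glu', 'Val')]
```

One changed base out of 24 in this fragment yields exactly one changed residue, which is the whole difference between the two primary structures.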
Primary structure is also the level at which proteins can be identified and compared across species. Because the genetic code is nearly universal, sequences from different organisms can be compared directly, and such comparison reveals evolutionary relationships: proteins that share significant sequence similarity are almost certainly homologous, meaning they descended from a common ancestral gene. Techniques like Edman degradation (which sequentially removes and identifies amino acids from the N-terminus) and modern mass spectrometry allow researchers to determine primary structure experimentally, while DNA sequencing provides it indirectly through the genetic code. Understanding primary structure is the foundation for all of protein biochemistry: every question about how a protein folds, functions, or fails begins with its sequence.
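The simplest quantitative form of sequence comparison is percent identity between two sequences that are already aligned. The sketch below is a minimal illustration, not how production tools work: it assumes equal-length, gap-free input, and the function name `percent_identity` is a hypothetical helper. The example compares the first eight residues of the HbA and HbS β chains in one-letter code.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions with identical residues.

    Assumes the two sequences are already aligned and have equal length;
    gaps and unequal lengths are not handled in this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# N-terminal fragments of the HbA and HbS beta chains (one-letter codes).
print(percent_identity("VHLTPEEK", "VHLTPVEK"))  # 87.5
```

Real comparisons across species rely on alignment algorithms that allow insertions and deletions and on statistical tests of whether the observed similarity exceeds what chance would produce.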