Molecular cloning encompasses strategies for inserting, amplifying, and expressing genes of interest in host organisms. A genomic library stores randomly fragmented chromosomal DNA in vectors; a cDNA library stores reverse-transcribed mRNA and captures only expressed genes. Expression vectors include regulatory elements (promoter, ribosome-binding site, terminator) that drive transcription and translation of the cloned gene in the host. Site-directed mutagenesis uses PCR-based approaches to introduce specific mutations into cloned sequences, enabling structure-function analysis of proteins. Gibson assembly and Gateway cloning have modernized cloning by removing the dependence on restriction sites: Gibson assembly joins fragments seamlessly through short overlapping ends, while Gateway cloning transfers inserts by site-specific recombination, which leaves short att sequences flanking the insert.
Compare the use cases for genomic vs. cDNA libraries: when would you want introns present vs. absent? Design a cloning strategy for expressing a mammalian protein in bacteria and identify the vectors, promoters, and selectable markers needed.
From recombinant DNA technology, you know how to cut DNA with restriction enzymes, join fragments with ligase, and introduce recombinant molecules into host cells. Molecular cloning builds on these fundamentals to accomplish a specific goal: isolating, amplifying, and often expressing a particular gene or DNA sequence of interest. The core workflow is conceptually simple — insert your DNA into a self-replicating vector, put the vector into a host cell, and let the host's replication machinery make billions of copies for you.
The first major decision is what kind of library to construct. A genomic library is made by fragmenting an organism's entire genome, by partial restriction digestion or mechanical shearing, and inserting the fragments into vectors so that the clones collectively cover the whole genome. This library contains everything (exons, introns, regulatory regions, repetitive elements) and is essential when you need to study gene structure, regulatory sequences, or non-coding DNA. A cDNA library takes a fundamentally different approach: start with mRNA (which represents only the genes being expressed in the source cells), use reverse transcriptase to convert it to complementary DNA (cDNA), and clone that. Because mRNA has already been spliced, cDNA clones lack introns. This matters enormously when your goal is to express a eukaryotic gene in bacteria, which cannot splice out introns. A cDNA library also gives you a snapshot of which genes are active in a particular tissue or condition.
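The difference between a genomic clone and a cDNA clone of the same gene can be sketched at the string level. The following is a toy model only: the sequences, the exon coordinates, and the function names are invented for illustration, and real splicing and reverse transcription involve far more than string slicing.

```python
# Toy model of one gene: exon 1 + intron + exon 2.
# All sequences and coordinates below are invented for illustration.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def splice(gene: str, exons: list) -> str:
    """Join exon segments in order, dropping everything between them.

    `exons` holds (start, end) pairs, 0-based and end-exclusive.
    """
    return "".join(gene[start:end] for start, end in exons)


def first_strand_cdna(mrna_as_dna: str) -> str:
    """Reverse transcriptase makes the strand complementary to the mRNA."""
    return reverse_complement(mrna_as_dna)


exon1 = "ATGGCC"
intron = "GTAAGTTGACAG"   # starts with GT and ends with AG, as most introns do
exon2 = "AAATGA"
gene = exon1 + intron + exon2
exons = [(0, 6), (18, 24)]

genomic_insert = gene                       # what a genomic clone carries
mature_mrna = splice(gene, exons)           # written in DNA letters for simplicity
first_strand = first_strand_cdna(mature_mrna)
cdna_insert = reverse_complement(first_strand)  # coding strand of the ds cDNA

print(len(genomic_insert), len(cdna_insert))
print(cdna_insert)
```

In this toy gene the cDNA insert is an uninterrupted open reading frame, whereas the genomic insert still carries the intron, which a bacterial host would have no way to remove.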
The second major decision involves the vector. A simple cloning vector (like pUC19) carries a selectable marker (antibiotic resistance), an origin of replication, and a multiple cloning site; it is sufficient for propagating DNA but is not designed for controlled, high-level expression of the cloned gene as protein. An expression vector adds a strong promoter, a translation initiation signal (a Shine-Dalgarno ribosome-binding site in bacteria or a Kozak sequence around the start codon in eukaryotes), and a transcription terminator. Some expression vectors include tags (His-tag, GST-tag) that fuse to the protein product and simplify purification. Matching the expression system to your protein is critical: a bacterial expression system is fast and cheap but generally cannot perform eukaryotic post-translational modifications like glycosylation; yeast, insect cell, or mammalian expression systems are slower but can carry out these modifications, although the exact glycosylation pattern differs between hosts.
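The vector checklist above can be written down as a small data model. This is a minimal sketch under stated assumptions: the feature names, the `Vector` class, and both helper functions are hypothetical, and the two example vectors are generic stand-ins described only by the features listed in the text, not annotated maps of real plasmids.

```python
from __future__ import annotations

from dataclasses import dataclass

# Feature vocabulary is hypothetical; it mirrors the elements named in the text.
PROPAGATION_FEATURES = frozenset(
    {"origin_of_replication", "selectable_marker", "multiple_cloning_site"}
)
EXPRESSION_FEATURES = PROPAGATION_FEATURES | {
    "promoter",
    "translation_initiation_signal",
    "terminator",
}


@dataclass(frozen=True)
class Vector:
    name: str
    features: frozenset[str]


def missing_for_expression(vector: Vector) -> frozenset[str]:
    """Return the elements a vector still lacks to act as an expression vector."""
    return EXPRESSION_FEATURES - vector.features


def suggest_host_class(needs_eukaryotic_ptm: bool) -> str:
    """Very coarse rule of thumb taken from the paragraph above."""
    if needs_eukaryotic_ptm:
        return "eukaryotic (yeast, insect cell, or mammalian)"
    return "bacterial"


cloning_vector = Vector("generic cloning vector", PROPAGATION_FEATURES)
expression_vector = Vector(
    "generic tagged expression vector",
    EXPRESSION_FEATURES | {"his_tag"},
)

for v in (cloning_vector, expression_vector):
    print(v.name, "is missing:", sorted(missing_for_expression(v)))
print(suggest_host_class(needs_eukaryotic_ptm=True))
```

A real design decision would weigh many more factors (codon usage, solubility, yield, cost), so the host rule here is only a restatement of the trade-off described above.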
Modern cloning has moved well beyond the cut-and-paste approach of restriction enzymes and ligase. Gibson assembly joins multiple DNA fragments with overlapping ends in a single isothermal reaction using a 5' exonuclease, a DNA polymerase, and a DNA ligase, so no restriction sites are needed. Gateway cloning uses site-specific recombination (att sites) to shuttle a gene from an entry clone into any compatible destination vector without a new round of restriction digestion and ligation. Site-directed mutagenesis uses PCR with mismatched primers to introduce specific point mutations, insertions, or deletions into a cloned gene, enabling precise structure-function analysis: you can change a single amino acid in a protein and test the functional consequence. These tools collectively make molecular cloning not just a method for copying DNA, but a flexible engineering platform for building, modifying, and expressing genes to answer biological questions.
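The overlap logic behind Gibson assembly can be illustrated with plain strings. The sketch below is a simplification, not a model of the enzymatic reaction: it treats each fragment as a single strand, requires exact matches, and uses invented sequences and a hypothetical `min_overlap` default of 15 bases. The function names are likewise made up for this example.

```python
def join_by_overlap(left: str, right: str, min_overlap: int = 15) -> str:
    """Merge two fragments where the end of `left` equals the start of `right`.

    The longest exact suffix/prefix match of at least `min_overlap` bases is
    used, and the shared region appears only once in the product.
    """
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError(f"no overlap of at least {min_overlap} bases found")


def assemble(fragments: list, min_overlap: int = 15) -> str:
    """Join fragments in the given order through their overlapping ends."""
    product = fragments[0]
    for fragment in fragments[1:]:
        product = join_by_overlap(product, fragment, min_overlap)
    return product


# Invented sequences: three fragments sharing two 15-base overlaps.
head = "ATGGCTAGC"
overlap_1 = "AAAGGAGAAGAACTT"
middle = "TTCACTGGA"
overlap_2 = "GTTGTCCCAATTCTT"
tail = "GTTGAATAA"

fragment_a = head + overlap_1
fragment_b = overlap_1 + middle + overlap_2
fragment_c = overlap_2 + tail

assembled = assemble([fragment_a, fragment_b, fragment_c])
print(assembled)
print(len(assembled))
```

Because the join is defined purely by the shared end sequences, no restriction site has to exist at the junction, which is the property that makes this style of assembly seamless.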