Viruses are classified by genome type (dsDNA, ssDNA, dsRNA, ssRNA), polarity (positive or negative sense), and structure (enveloped or non-enveloped). The Baltimore classification groups viruses by how they produce mRNA, which in turn defines their replication strategy. Viral genomes range from under 2 kb (circoviruses and some satellite viruses) to more than 1 Mb (giant viruses), and genome size constrains both replication complexity and host interactions.
You already understand that viruses consist of a nucleic acid genome packaged inside a protein capsid, and you know the basic steps of the replication cycle — attachment, entry, replication, assembly, and release. The next question is: how do we organize the staggering diversity of viruses into a coherent framework? The answer centers on the genome itself, because the type of nucleic acid a virus carries dictates how it replicates, and replication strategy is the most fundamental distinction among viruses.
The Baltimore classification system, developed by Nobel laureate David Baltimore, sorts all viruses into seven groups based on how they produce messenger RNA. Every virus must generate mRNA that the host ribosome can translate, so the path from genome to mRNA defines the virus's replication logic. Group I viruses have double-stranded DNA (dsDNA) and can use host transcription machinery almost directly; herpesviruses and bacteriophage T4 are examples. Group II viruses (such as parvoviruses) carry single-stranded DNA that is first converted to dsDNA, and Group III viruses (such as rotaviruses) carry double-stranded RNA that a virion-associated polymerase transcribes into mRNA. Group IV viruses carry positive-sense single-stranded RNA ((+)ssRNA), meaning their genome itself can serve as mRNA the moment it enters the cell; poliovirus is a classic example. Group V viruses, such as influenza virus, carry negative-sense ssRNA ((−)ssRNA) and must first transcribe it into the complementary positive strand before translation can occur. Because the host cell supplies no enzyme for that step, these viruses must package their own RNA-dependent RNA polymerase inside the virion. Group VI retroviruses (like HIV) carry (+)ssRNA but replicate through a DNA intermediate using reverse transcriptase, and Group VII viruses (like hepatitis B virus) carry partially double-stranded DNA but replicate through an RNA intermediate, also using reverse transcriptase.
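One way to keep the seven groups straight is to treat the classification as a lookup table from group numeral to genome type and route to mRNA. The sketch below is only an illustration: the names `BaltimoreGroup`, `BALTIMORE_GROUPS`, `genome_is_mrna`, and `must_package_rdrp` are hypothetical, and the one-line routes are deliberate simplifications of the real replication cycles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaltimoreGroup:
    genome: str         # nucleic acid carried in the virion
    route_to_mrna: str  # simplified path from genome to translatable mRNA
    example: str        # one well-known member of the group


# Illustrative table; the routes are simplified summaries, not full pathways.
BALTIMORE_GROUPS = {
    "I": BaltimoreGroup("dsDNA", "genome -> mRNA", "herpesviruses"),
    "II": BaltimoreGroup("ssDNA", "genome -> dsDNA -> mRNA", "parvoviruses"),
    "III": BaltimoreGroup("dsRNA", "genome -> mRNA", "rotaviruses"),
    "IV": BaltimoreGroup("(+)ssRNA", "genome is mRNA", "poliovirus"),
    "V": BaltimoreGroup("(-)ssRNA", "genome -> mRNA", "influenza virus"),
    "VI": BaltimoreGroup("(+)ssRNA, reverse-transcribing",
                         "genome -> DNA -> mRNA", "HIV"),
    "VII": BaltimoreGroup("dsDNA, reverse-transcribing",
                          "genome -> mRNA", "hepatitis B virus"),
}

# Groups whose RNA genome cannot be translated directly and therefore
# need an RNA-dependent RNA polymerase (RdRp) carried in the virion.
_RDRP_IN_VIRION = {"III", "V"}


def genome_is_mrna(group: str) -> bool:
    """True only when the incoming genome can be translated as-is."""
    return BALTIMORE_GROUPS[group].route_to_mrna == "genome is mRNA"


def must_package_rdrp(group: str) -> bool:
    """True when the virion has to bring its own RdRp into the cell."""
    if group not in BALTIMORE_GROUPS:
        raise KeyError(group)
    return group in _RDRP_IN_VIRION


if __name__ == "__main__":
    for numeral, info in BALTIMORE_GROUPS.items():
        print(f"Group {numeral:>3}: {info.genome:<32} {info.route_to_mrna}"
              f"  (e.g. {info.example})")
```

Note that Groups IV and VI both carry (+)ssRNA, yet only Group IV returns `True` from `genome_is_mrna`: the retroviral genome is reverse transcribed rather than translated on entry, which is exactly why the classification keys on the route to mRNA and not on genome chemistry alone.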
Beyond genome type, viruses are classified by structural features. The presence or absence of a lipid envelope surrounding the capsid has enormous practical consequences: enveloped viruses (influenza, SARS-CoV-2) are generally fragile outside the host and susceptible to detergents and drying, while non-enveloped viruses (norovirus, adenovirus) can persist on surfaces for days. Capsid geometry (icosahedral, helical, or complex) further subdivides groups. Combining genome type, replication strategy, and structural features yields a multi-axis classification that reflects practical behavior and, together with sequence data, evolutionary relationships.
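The multi-axis idea can be made concrete by recording each axis as a field on a small record. This is a minimal sketch, not a standard data model: `VirusProfile`, `Capsid`, and `detergent_sensitive` are names invented here, and the function encodes only the rule of thumb stated above (enveloped viruses tend to be detergent-sensitive), which real viruses can violate.

```python
from dataclasses import dataclass
from enum import Enum


class Capsid(Enum):
    ICOSAHEDRAL = "icosahedral"
    HELICAL = "helical"
    COMPLEX = "complex"


@dataclass(frozen=True)
class VirusProfile:
    name: str
    genome: str      # e.g. "dsDNA", "(+)ssRNA", "(-)ssRNA"
    enveloped: bool  # lipid envelope around the capsid?
    capsid: Capsid   # capsid (or nucleocapsid) geometry


def detergent_sensitive(virus: VirusProfile) -> bool:
    """Rule of thumb only: a lipid envelope is disrupted by detergents."""
    return virus.enveloped


INFLUENZA = VirusProfile("influenza virus", "(-)ssRNA", True, Capsid.HELICAL)
NOROVIRUS = VirusProfile("norovirus", "(+)ssRNA", False, Capsid.ICOSAHEDRAL)
ADENOVIRUS = VirusProfile("adenovirus", "dsDNA", False, Capsid.ICOSAHEDRAL)

if __name__ == "__main__":
    for v in (INFLUENZA, NOROVIRUS, ADENOVIRUS):
        label = "likely" if detergent_sensitive(v) else "unlikely"
        print(f"{v.name}: {v.genome}, {v.capsid.value} capsid, "
              f"detergent-sensitive: {label}")
```

Norovirus and adenovirus differ completely in genome type yet land on the same side of the envelope axis, which is why the axes are best treated as independent fields.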
Genome size tracks both coding capacity and replication fidelity. The smallest viral genomes (satellite viruses, circoviruses) encode just a handful of proteins and depend heavily on host machinery or even helper viruses to replicate. RNA virus genomes top out at roughly 30 kb because viral RNA polymerases generally lack proofreading: the expected number of mutations per copy grows with genome length, so larger genomes would accumulate too many lethal mutations per replication cycle. Coronaviruses sit at this limit and encode a proofreading exonuclease, a function that is rare among RNA viruses. DNA viruses can support much larger genomes because DNA polymerases proofread, which helps explain why giant viruses like Mimivirus exceed 1 Mb and encode hundreds of genes, blurring the traditional boundary between viruses and cellular life. Knowing where a virus sits in this classification gives you a strong first indication of what enzymes it must encode, what drug targets might be available, and how it will interact with the host immune system.
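The link between polymerase fidelity and maximum genome size comes down to simple arithmetic: the expected number of mutations per copied genome is the per-nucleotide error rate multiplied by genome length. The sketch below works through that calculation. The error rates are placeholder values chosen for illustration, not measurements, and the error-free fraction assumes mutations are independent and Poisson-distributed; both function names are hypothetical.

```python
import math


def expected_mutations(rate_per_nt: float, genome_length_nt: int) -> float:
    """Mean number of mutations introduced per genome copy."""
    return rate_per_nt * genome_length_nt


def error_free_fraction(rate_per_nt: float, genome_length_nt: int) -> float:
    """Fraction of copies with zero mutations, assuming a Poisson model."""
    return math.exp(-expected_mutations(rate_per_nt, genome_length_nt))


# Assumed, illustrative error rates (per nucleotide per replication).
NO_PROOFREADING = 1e-4    # placeholder for a polymerase without proofreading
WITH_PROOFREADING = 1e-8  # placeholder for a proofreading polymerase

if __name__ == "__main__":
    cases = [
        ("7.5 kb genome, no proofreading", NO_PROOFREADING, 7_500),
        ("30 kb genome, no proofreading", NO_PROOFREADING, 30_000),
        ("1 Mb genome, no proofreading", NO_PROOFREADING, 1_000_000),
        ("1 Mb genome, with proofreading", WITH_PROOFREADING, 1_000_000),
    ]
    for label, rate, length in cases:
        mean = expected_mutations(rate, length)
        clean = error_free_fraction(rate, length)
        print(f"{label}: {mean:.3g} mutations per copy, "
              f"{clean:.3g} of copies error-free")
```

Under these assumed rates, the mutation load per copy rises in direct proportion to genome length without proofreading, while a proofreading polymerase keeps even a megabase genome at well under one mutation per copy.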