Mathematical models of disease transmission quantify how infections spread through populations using compartmental structures (SIR: susceptible, infected, recovered). Transmission rate, recovery rate, and contact patterns determine epidemic growth. These models predict the trajectory of an epidemic, estimate the basic reproduction number (R₀), and evaluate the impact of interventions such as vaccination and isolation.
Start with simple SIR models by hand, then use R or Python to simulate scenarios. Compare predictions to real outbreak data (e.g., COVID-19, influenza) to see how well models perform.
From your study of epidemic curves, you learned to read outbreak data — the shape of a curve tells you whether transmission is accelerating, peaking, or declining. Mathematical modeling takes the next step: instead of describing what happened, it tries to explain *why* it happened and predict what *would* happen under different conditions. The fundamental tool is the SIR model, a compartmental framework that divides a population into three mutually exclusive groups at any point in time: Susceptible (no immunity, can be infected), Infected (currently infectious), and Recovered (immune, no longer infectious). The epidemic is then a flow problem — how fast do people move between these compartments?
The flow rates are governed by two parameters. The transmission rate (β) is the average number of transmission-capable contacts one infectious person makes per day: the contact rate multiplied by the probability of transmission per contact. A susceptible person's per-day rate of infection is therefore β times the fraction of the population that is currently infectious. The recovery rate (γ) is the per-day rate at which infected individuals recover (the reciprocal of the average infectious period). From these two parameters emerges the single most important quantity in epidemic theory: the basic reproduction number R₀ = β/γ. R₀ is the average number of secondary infections generated by one infectious individual in a fully susceptible population. When R₀ > 1, each case produces more than one new case on average and the epidemic expands; when R₀ < 1, the chain of transmission dies out. The epidemic peak (the apex of the curve you studied) occurs precisely when the fraction of the population still susceptible falls to 1/R₀. At that point the effective reproduction number, which is R₀ times the susceptible fraction, equals 1, and it drops below 1 as the susceptible pool continues to shrink.
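The relationship between β, γ, R₀, and the effective reproduction number can be checked with a few lines of Python. This is a minimal sketch: the function names are made up for illustration, and the parameter values (β = 0.5 per day, γ = 0.25 per day) are round numbers chosen for convenience, not estimates for any real disease.

```python
def basic_reproduction_number(beta, gamma):
    """R0 = beta / gamma for the basic SIR model."""
    return beta / gamma


def effective_reproduction_number(r0, susceptible_fraction):
    """Average secondary cases per infection when only part of the
    population is still susceptible."""
    return r0 * susceptible_fraction


# Hypothetical parameters, chosen only to give round numbers.
beta = 0.5    # transmission rate, per day
gamma = 0.25  # recovery rate, per day (infectious period = 1 / gamma = 4 days)

r0 = basic_reproduction_number(beta, gamma)
print(f"R0 = {r0:.2f}")
print(f"Infections peak when the susceptible fraction reaches 1/R0 = {1 / r0:.2f}")

# Watch the effective reproduction number fall as susceptibles are used up.
for s in (1.0, 0.75, 0.5, 0.25):
    r_eff = effective_reproduction_number(r0, s)
    trend = "growing" if r_eff > 1 else "not growing"
    print(f"susceptible fraction {s:.2f}: R_eff = {r_eff:.2f} ({trend})")
```

With these assumed values, the loop shows the switch from growth to decline happening at a susceptible fraction of 1/R₀.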
The SIR model makes this dynamic explicit through differential equations. With N = S + I + R as the total population, they read dS/dt = −βSI/N, dI/dt = βSI/N − γI, and dR/dt = γI. The rate of new infections, βSI/N, is proportional to both the number of infectious individuals and the fraction of the population still susceptible, so it falls as the susceptible pool depletes. This explains the characteristic epidemic curve shape: exponential growth while most of the population is susceptible, followed by deceleration as immunity accumulates, and eventual decline. The herd immunity threshold, the fraction of the population that must be immune (naturally or through vaccination) to prevent sustained transmission, is simply 1 − 1/R₀. For measles (R₀ ≈ 15), this threshold is about 93%; for COVID-19 (R₀ ≈ 2–3 for the original strain), it is around 50–67%.
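To see the curve shape emerge from the equations, the sketch below steps the SIR system forward in time using a simple forward Euler scheme in plain Python. The helper names, the population size, and the values β = 0.5 and γ = 0.2 are assumptions made for illustration. A production analysis would more likely use a dedicated ODE solver and parameters fitted to data.

```python
def simulate_sir(beta, gamma, n, i0, days, dt=0.1):
    """Integrate the SIR equations with a forward Euler step.

    dS/dt = -beta*S*I/N,  dI/dt = beta*S*I/N - gamma*I,  dR/dt = gamma*I
    Returns a list of (t, S, I, R) tuples, one per time step.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    history = [(0.0, s, i, r)]
    for k in range(1, int(round(days / dt)) + 1):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((k * dt, s, i, r))
    return history


def herd_immunity_threshold(r0):
    """Fraction that must be immune to prevent sustained transmission."""
    return 1.0 - 1.0 / r0


# Illustrative values only (R0 = beta / gamma = 2.5).
beta, gamma, n = 0.5, 0.2, 10_000
run = simulate_sir(beta, gamma, n, i0=10, days=160)

# Locate the step with the most infectious people.
t_peak, s_peak, i_peak, _ = max(run, key=lambda row: row[2])
print(f"peak on day {t_peak:.1f} with {i_peak:.0f} people infectious")
print(f"susceptible fraction at peak: {s_peak / n:.3f} (1/R0 = {gamma / beta:.3f})")

for disease, r0 in (("measles, R0 = 15", 15), ("R0 = 2", 2), ("R0 = 3", 3)):
    print(f"{disease}: herd immunity threshold = {herd_immunity_threshold(r0):.0%}")
```

Because of the finite step size, the susceptible fraction at the simulated peak only approximates 1/R₀; shrinking `dt` tightens the agreement.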
Models become genuinely useful for comparing interventions. By adjusting β (through social distancing, masking, or isolation, which reduce the contact rate or the probability of transmission per contact) or γ (through treatment that shortens the infectious period), or by moving individuals directly from S to R (vaccination), you can simulate the epidemic trajectory under each scenario and compare outcomes. This is how public health agencies evaluate "what if we vaccinate 60% before the peak" versus "what if we implement a two-week lockdown." The model does not predict the future with precision, but it provides a structured framework for comparing the *relative* impact of interventions under a shared set of assumptions, which is far more useful than intuition alone.
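A scenario comparison of this kind might be sketched as follows. Everything here is hypothetical: the baseline parameters, the size and timing of each intervention, and the function names are assumptions chosen to illustrate the mechanics, not calibrated estimates. Vaccination is modeled as moving a fraction of the population from S to R before the run starts, and a lockdown as a temporary reduction in β.

```python
def simulate_sir(beta_at, gamma, n, i0, days, vaccinated_fraction=0.0, dt=0.1):
    """Forward-Euler SIR run with a time-varying transmission rate.

    beta_at: function mapping time (days) to the transmission rate.
    vaccinated_fraction: share of initially susceptible people moved
        straight from S to R before the run starts.
    Returns (peak infectious count, day of peak, total ever infected).
    """
    s = (n - i0) * (1.0 - vaccinated_fraction)
    r = (n - i0) * vaccinated_fraction
    i = float(i0)
    peak_i, peak_day, total_infected = i, 0.0, float(i0)
    for k in range(int(round(days / dt))):
        t = k * dt
        new_infections = beta_at(t) * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        total_infected += new_infections
        if i > peak_i:
            peak_i, peak_day = i, t + dt
    return peak_i, peak_day, total_infected


def constant(beta):
    """Transmission rate that never changes."""
    return lambda t: beta


def lockdown(beta, start, end, reduction):
    """Transmission rate cut by `reduction` between days start and end."""
    return lambda t: beta * (1.0 - reduction) if start <= t < end else beta


# Hypothetical baseline: R0 = 0.5 / 0.2 = 2.5 in a population of 10,000.
BETA, GAMMA, N, I0, DAYS = 0.5, 0.2, 10_000, 10, 300

scenarios = {
    "baseline": dict(beta_at=constant(BETA)),
    "distancing (beta -30%)": dict(beta_at=constant(0.7 * BETA)),
    "two-week lockdown (beta -70%, days 15-29)": dict(beta_at=lockdown(BETA, 15, 29, 0.7)),
    "60% vaccinated at start": dict(beta_at=constant(BETA), vaccinated_fraction=0.6),
}

for name, kwargs in scenarios.items():
    peak_i, peak_day, total = simulate_sir(gamma=GAMMA, n=N, i0=I0, days=DAYS, **kwargs)
    print(f"{name:45s} peak {peak_i:7.0f} on day {peak_day:5.1f}, total infected {total:7.0f}")
```

The absolute numbers depend entirely on the assumed inputs. What carries over is the ranking of scenarios by peak size, peak timing, and total infections under identical assumptions.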
Two common extensions beyond the basic SIR model address important real-world complications. SEIR models add an Exposed (E) compartment for individuals who are infected but not yet infectious (the latent period), which is critical for diseases like COVID-19 where this delay substantially shapes early dynamics. Age-structured models account for the fact that contact rates and disease severity differ dramatically by age: children have more school contacts, while older adults have more severe outcomes. Each extension adds realism but also adds parameters that must be estimated from data, introducing uncertainty. The discipline of epidemic modeling is therefore as much about honest uncertainty quantification as it is about the models themselves.
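The effect of the Exposed compartment can be illustrated with a small SEIR sketch. It introduces one extra parameter, written here as `sigma` (the rate of leaving E, equal to 1 divided by the average latent period). That symbol, the function names, and all numeric values are assumptions for illustration. Age structure is not covered by this example.

```python
def simulate_seir(beta, sigma, gamma, n, i0, days, dt=0.1):
    """Forward-Euler SEIR run.

    sigma is the rate of leaving the Exposed compartment
    (1 / average latent period). Returns a list of (t, S, E, I, R).
    """
    s, e, i, r = float(n - i0), 0.0, float(i0), 0.0
    history = [(0.0, s, e, i, r)]
    for k in range(1, int(round(days / dt)) + 1):
        new_exposed = beta * s * i / n * dt      # S -> E
        new_infectious = sigma * e * dt          # E -> I
        new_recovered = gamma * i * dt           # I -> R
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((k * dt, s, e, i, r))
    return history


def peak_day(history):
    """Day on which the infectious compartment is largest."""
    return max(history, key=lambda row: row[3])[0]


# Same hypothetical beta and gamma throughout; only the latent period varies.
beta, gamma, n = 0.5, 0.2, 10_000
for latent_period in (0.5, 2.0, 5.0):
    run = simulate_seir(beta, 1.0 / latent_period, gamma, n, i0=10, days=300)
    print(f"latent period {latent_period:3.1f} days: infectious peak on day {peak_day(run):.1f}")
```

Under these assumptions, a longer latent period slows early growth and pushes the peak later even though R₀ = β/γ is unchanged. It also adds a parameter that has to be estimated from data.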