The Cox proportional hazards model relates the hazard (instantaneous event rate) to covariates without specifying the baseline hazard function: h(t|X) = h_0(t) × exp(β_1 x_1 + ... + β_k x_k). This semi-parametric structure separates the time dependence (absorbed into the unspecified h_0(t)) from the covariate effects (the exponential term). Exponentiated coefficients exp(β_j) are hazard ratios: the multiplicative change in the instantaneous event rate per unit increase in x_j, holding the other covariates fixed. The proportional hazards assumption requires that these hazard ratios remain constant over time, so the hazard functions for different covariate values are proportional to one another and never cross. Estimation uses partial likelihood, which depends only on the ordering of event times and eliminates h_0(t), so covariate effects can be estimated without assuming any particular shape for the baseline hazard.
The Kaplan-Meier estimator and log-rank test compare survival between groups but cannot adjust for multiple covariates simultaneously. If the patients receiving Treatment A are older and sicker than those in the comparison group, the unadjusted KM comparison is confounded. The Cox proportional hazards model addresses this by relating the hazard to multiple covariates through a multiplicative model: h(t|X) = h_0(t) × exp(Xβ). This is to survival analysis what multiple regression is to continuous outcomes: it lets you estimate the effect of each variable while holding the others fixed.
The model's defining feature is its semi-parametric structure. The baseline hazard h_0(t) — which captures how the overall event rate changes with time — is left completely unspecified. All the parametric assumptions are in the covariate effects: the exponential term exp(Xβ) multiplies the baseline hazard by a constant factor that depends on the patient's characteristics but not on time. This means the model assumes that the ratio of hazards for any two patients remains constant throughout follow-up. If Patient A has twice the hazard of Patient B at 1 year, the model requires this ratio to hold at 5 years and 10 years as well. This is the proportional hazards assumption.
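The constant-ratio property can be checked numerically. The sketch below is only an illustration: the baseline hazard function, the coefficient values, and the two patients are all made-up assumptions, not estimates from any data. Whatever positive function is used for the baseline, the ratio of the two patients' hazards comes out the same at every time point, because h_0(t) appears in both numerator and denominator.

```python
import math

def baseline_hazard(t):
    # Hypothetical baseline hazard; the Cox model leaves this function unspecified.
    return 0.02 * math.sqrt(t)

def hazard(t, x, beta):
    # h(t|X) = h_0(t) * exp(X beta)
    linear_predictor = sum(b * v for b, v in zip(beta, x))
    return baseline_hazard(t) * math.exp(linear_predictor)

# Illustrative (assumed) coefficients: age in years, and a 0/1 comorbidity flag.
beta = [0.03, 0.7]
patient_a = [70, 1]
patient_b = [60, 0]

# Hazard ratio of A versus B at 1, 5, and 10 years of follow-up.
ratios = [hazard(t, patient_a, beta) / hazard(t, patient_b, beta) for t in (1, 5, 10)]

# The same ratio computed directly from the covariate difference, with no time in it.
expected_ratio = math.exp(0.03 * (70 - 60) + 0.7 * (1 - 0))

print(ratios, expected_ratio)
```

Replacing baseline_hazard with any other positive function of time changes the individual hazards but leaves the three ratios unchanged, which is exactly what the proportional hazards assumption asserts.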
Estimation uses partial likelihood, a concept introduced by Cox in his landmark 1972 paper. The key insight is that the covariate effects can be estimated from the ordering of events alone. At each event time, consider all subjects still at risk. Given that one of them has an event at that moment, the probability that it is the subject who actually experienced it equals that subject's relative hazard exp(Xβ) divided by the sum of exp(Xβ) over all subjects at risk. The baseline hazard cancels out of this conditional probability because it multiplies both numerator and denominator. The partial likelihood is the product of these conditional probabilities across all event times. Maximizing it yields β estimates without ever estimating h_0(t). If h_0(t) is needed (for example, for predicted survival curves), the cumulative baseline hazard can be recovered afterward using the Breslow estimator.
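In symbols, the partial likelihood is L(β) = ∏_i [ exp(x_i β) / Σ_{j in R(t_i)} exp(x_j β) ], where the product runs over subjects i with an observed event and R(t_i) is the set of subjects still at risk at time t_i. The following is a minimal sketch of maximizing it for a single covariate, assuming no tied event times. The function names, the toy data, and the use of Newton-Raphson are choices made for this illustration and are not taken from any particular library.

```python
import math

def partial_likelihood_terms(times, events, x, beta):
    """Log partial likelihood, score, and information for one covariate.

    Assumes no tied event times. events[i] is 1 for an observed event, 0 if censored.
    """
    loglik = score = info = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through the risk sets
        # Risk set: everyone whose follow-up lasts at least until t_i.
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        w = [math.exp(beta * x[j]) for j in risk]
        s0 = sum(w)
        s1 = sum(wj * x[j] for wj, j in zip(w, risk))
        s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
        x_bar = s1 / s0                    # risk-weighted mean of the covariate
        loglik += beta * x[i] - math.log(s0)
        score += x[i] - x_bar              # first derivative of loglik
        info += s2 / s0 - x_bar ** 2       # minus the second derivative
    return loglik, score, info

def fit_beta(times, events, x, tol=1e-10, max_iter=50):
    """Newton-Raphson on the log partial likelihood, starting from beta = 0."""
    beta = 0.0
    for _ in range(max_iter):
        _, score, info = partial_likelihood_terms(times, events, x, beta)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# Toy data (made up for illustration): six subjects, one censored at t = 3.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 0, 1, 1, 1]
x = [1, 0, 1, 0, 1, 0]

beta_hat = fit_beta(times, events, x)
# Squaring the times changes their spacing but not their order.
beta_hat_rescaled = fit_beta([t ** 2 for t in times], events, x)
print(beta_hat, beta_hat_rescaled)
```

The two fitted values agree because only the ordering of the times enters the risk sets. The baseline hazard never appears anywhere in the code, which is the practical meaning of its cancelling out.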
Checking the proportional hazards assumption is essential. If the assumption fails (say, a new drug reduces early mortality but its benefit wanes over time), the hazard ratio is not constant, and the Cox model still reports a single hazard ratio that averages over follow-up in a potentially misleading way. Diagnostics include plotting Schoenfeld residuals against time (a systematic trend indicates violation), adding a time-by-covariate interaction term and testing whether it is significant, and visually inspecting log-log survival plots, that is, log(-log S(t)) against log time for each group (roughly parallel curves support proportional hazards). When the assumption fails, remedies include stratifying on the offending variable, including a time-by-covariate interaction, or using models that explicitly allow time-varying effects.
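To make the Schoenfeld residual diagnostic concrete, here is a rough sketch for a single covariate with no tied event times. At each event time, the residual is the covariate value of the subject who had the event minus the risk-weighted average covariate among those still at risk. The function names, the toy data, the bisection bracket of [-10, 10], and the plain least-squares slope are assumptions made for this illustration. Formal tests in statistical software typically work with scaled residuals and are not reproduced here.

```python
import math

def schoenfeld_residuals(times, events, x, beta):
    """Return (event_time, residual) pairs for one covariate, assuming no ties."""
    residuals = []
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # residuals are defined only at observed events
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        weights = [math.exp(beta * x[j]) for j in risk]
        expected_x = sum(w * x[j] for w, j in zip(weights, risk)) / sum(weights)
        residuals.append((t_i, x[i] - expected_x))
    return sorted(residuals)

def fit_beta_by_bisection(times, events, x, lo=-10.0, hi=10.0, iters=100):
    """Find beta where the residuals sum to zero (the partial likelihood score equation).

    The sum is non-increasing in beta; the bracket [lo, hi] is assumed to contain the root.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2
        score = sum(r for _, r in schoenfeld_residuals(times, events, x, mid))
        if score > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def least_squares_slope(points):
    """Slope of residual versus time; a crude numeric stand-in for eyeballing a plot."""
    n = len(points)
    t_mean = sum(t for t, _ in points) / n
    r_mean = sum(r for _, r in points) / n
    sxy = sum((t - t_mean) * (r - r_mean) for t, r in points)
    sxx = sum((t - t_mean) ** 2 for t, _ in points)
    return sxy / sxx

# Toy data (made up for illustration): six subjects, one censored at t = 3.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 0, 1, 1, 1]
x = [1, 0, 1, 0, 1, 0]

beta_hat = fit_beta_by_bisection(times, events, x)
residuals = schoenfeld_residuals(times, events, x, beta_hat)
slope = least_squares_slope(residuals)
print(residuals, slope)
```

With only five events the slope here carries no real evidence either way. On a realistic data set, a slope clearly different from zero, or a visible drift in a plot of these residuals against time, would suggest that the covariate's effect changes during follow-up.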