Questions: Survival Analysis and Event History Methods
5 questions to test your understanding
Question 1 Multiple Choice
A study tracks how long it takes laid-off workers to find new employment, ending after 2 years. Some workers have not found employment by the study's end. A researcher proposes excluding these workers since the event was never observed. Why is this a problem?
A. It reduces the sample size too much, making the model statistically underpowered
B. It introduces selection bias: each censored observation carries the information that the worker went at least 2 years without finding employment, and discarding that information biases estimated job-finding times downward
C. It violates the proportional hazards assumption required by the Cox model
D. It prevents estimation of time-varying covariates since those workers' employment status was never resolved
Answer: B
Censored observations are not useless: they carry real information about survival up to the censoring point. A worker still unemployed after 2 years tells us that their job-finding time exceeds 2 years. Excluding such workers leaves only the shorter spells in the sample, which makes job-finding appear faster than it is. Survival analysis handles censoring by letting each of these observations contribute information up to its censoring time without assuming what would have happened afterward.
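The effect of dropping censored workers can be seen with a small Kaplan-Meier calculation. The sketch below is illustrative only: the ten durations are hypothetical, and `kaplan_meier` is a minimal hand-written estimator, not a library function.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time."""
    event_times = sorted({t for t, d in zip(durations, observed) if d})
    survival = 1.0
    curve = []
    for t in event_times:
        # Everyone whose recorded time is >= t is still at risk at t,
        # including workers who are later censored.
        at_risk = sum(1 for u in durations if u >= t)
        events = sum(1 for u, d in zip(durations, observed) if d and u == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: months until re-employment; the study ends at 24 months.
durations = [3, 5, 8, 12, 15, 20, 24, 24, 24, 24]
observed = [True, True, True, True, True, True, False, False, False, False]

# Estimate using all workers, censored ones included.
full = kaplan_meier(durations, observed)

# Estimate after discarding the four censored workers.
dropped = kaplan_meier(
    [t for t, d in zip(durations, observed) if d],
    [d for d in observed if d],
)

s12_full = dict(full)[12]        # share still unemployed after 12 months
s12_dropped = dict(dropped)[12]  # same quantity with censored cases removed

print(f"S(12) with censored cases kept:    {s12_full:.3f}")
print(f"S(12) with censored cases dropped: {s12_dropped:.3f}")
```

With the censored workers kept, the estimated share still unemployed at 12 months is larger than when they are dropped, which is the direction of bias described above.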
Question 2 Multiple Choice
The hazard function h(t) is best interpreted as:
A. The probability that the event has occurred by time t, that is, the cumulative incidence at that point
B. The probability that the subject survives beyond time t without experiencing the event
C. The instantaneous rate of event occurrence at time t, conditional on having survived to that point
D. The expected time until the event occurs, given covariate values measured at baseline
Answer: C
The hazard function is a conditional instantaneous rate: how quickly is the event occurring right now, among those who have not yet experienced it? Conditioning on survival is what distinguishes it from a simple probability, and because it is a rate rather than a probability, it can exceed 1. Option A describes the cumulative distribution function F(t) = 1 − S(t), option B describes the survival function S(t), and option D describes an expected duration. The hazard can vary over time (divorce risk is highest in early marriage and around the 'seventh year'; political regime vulnerability varies with regime age), capturing patterns that a single regression coefficient cannot represent.
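For reference, the quantities named in the options can be written out formally. This is the standard textbook formulation, assuming a continuous event time T with density f(t) and distribution function F(t); these symbols are introduced here and do not appear in the question itself.

```latex
\[
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = \frac{f(t)}{S(t)},
\qquad S(t) = P(T > t) = 1 - F(t)
\]
\[
S(t) = \exp\!\left(-\int_0^t h(u)\,du\right)
\]
```

The second line shows that the hazard and the survival function carry the same information: knowing h(t) at every t determines S(t), and vice versa.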
Question 3 True / False
Standard linear regression is well-suited for analyzing the timing of events like divorce or job transitions, provided time is included as a predictor variable.
T. True
F. False
Answer: False
Linear regression cannot handle censoring correctly. When subjects haven't experienced the event by the observation window's end, their event time is unknown — it is 'at least X years,' not a specific value. Including their censoring time as if it were a complete observation creates bias; excluding them discards information. Beyond censoring, regression models the level of an outcome, not the instantaneous rate of event occurrence, which may vary over time in ways a single coefficient cannot capture. Survival analysis was developed specifically to address these problems.
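The two naive strategies described above can be compared with a censoring-aware estimate on constructed data. The sketch below rests on stated assumptions: spell lengths follow an exponential distribution with a hypothetical true mean of 3 years, the observation window is 2 years, and the "sample" is a deterministic grid of quantiles rather than random draws, so the numbers are reproducible.

```python
import math

TRUE_MEAN = 3.0  # hypothetical true mean spell length, in years
WINDOW = 2.0     # length of the observation window, in years
N = 1000

# Deterministic stand-in for a sample: evenly spaced exponential quantiles.
true_times = [-TRUE_MEAN * math.log(1 - (i + 0.5) / N) for i in range(N)]

# What the researcher actually records: time is cut off at the window.
recorded = [min(t, WINDOW) for t in true_times]
event_seen = [t <= WINDOW for t in true_times]

# Naive approach 1: treat censoring times as if they were event times.
naive_mean = sum(recorded) / N

# Naive approach 2: drop censored cases and average the completed spells.
completed = [t for t, e in zip(recorded, event_seen) if e]
dropped_mean = sum(completed) / len(completed)

# Censoring-aware estimate under the exponential assumption:
# mean duration = total observed exposure time / number of observed events.
mle_mean = sum(recorded) / sum(event_seen)

print(f"true mean:               {TRUE_MEAN:.2f}")
print(f"censored cases dropped:  {dropped_mean:.2f}")
print(f"censoring times as-is:   {naive_mean:.2f}")
print(f"censoring-aware MLE:     {mle_mean:.2f}")
```

Both naive averages fall well below the true mean, while the exposure-based estimate recovers it closely. The exponential model is only a convenient choice for this illustration; real durations rarely have a constant hazard.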
Question 4 True / False
A hazard ratio of 2 in a Cox proportional hazards model means that the group with that characteristic experiences the event at twice the rate of the reference group at any given point in time, assuming the proportional hazards assumption holds.
T. True
F. False
Answer: True
This is the correct interpretation of a Cox hazard ratio. Unlike regression coefficients (which describe differences in levels) or odds ratios (which compare the odds that the event occurs at all), hazard ratios describe the ratio of instantaneous event rates between groups at each moment in time. The proportional hazards assumption states that this ratio remains constant over time, so the two groups' hazard functions are parallel on a log scale. When the assumption is violated, the hazard ratio changes over time, and extensions such as time-varying coefficients or stratified models are needed.
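The interpretation can be made concrete with the usual form of the Cox model for a single binary covariate x (1 for the group with the characteristic, 0 for the reference group). The notation h_0(t) for the baseline hazard and β for the coefficient is the conventional one and is assumed here, not taken from the question.

```latex
\[
h(t \mid x) = h_0(t)\, e^{\beta x}
\]
\[
\mathrm{HR} = \frac{h(t \mid x = 1)}{h(t \mid x = 0)} = e^{\beta},
\qquad
\log h(t \mid x = 1) - \log h(t \mid x = 0) = \beta
\]
```

The baseline hazard cancels in the ratio, so the hazard ratio does not depend on t. A hazard ratio of 2 corresponds to β = ln 2 ≈ 0.693, a constant vertical gap between the two log-hazard curves.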
Question 5 Short Answer
What is censoring in the context of event history analysis, and why does it require a different analytical approach than standard regression?
Model answer: Censoring (here, right censoring) occurs when a subject is observed for a period but the event of interest has not occurred by the end of observation, for example a couple still married when a divorce study ends. The event time is unknown; it is only known to exceed the observation period. Standard regression requires a complete outcome value and cannot incorporate this partial information correctly: the analyst must either exclude censored cases (biasing estimates) or impute their event times (introducing error). Survival analysis handles censoring by allowing each observation to contribute information about survival up to its censoring point without making any assumption about what would have happened afterward. The likelihood function is constructed so that complete and censored observations each contribute what is actually known about them, extracting the maximum information from incomplete data.
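The likelihood mentioned in the model answer is commonly written as follows. This is a sketch under the usual assumption that censoring is independent of the event process; δ_i is an event indicator (1 if the event was observed for subject i, 0 if censored), t_i is the recorded time, and θ stands for the model parameters, all notation introduced here.

```latex
\[
L(\theta) = \prod_{i=1}^{n} f(t_i; \theta)^{\delta_i}\, S(t_i; \theta)^{1 - \delta_i}
          = \prod_{i=1}^{n} h(t_i; \theta)^{\delta_i}\, S(t_i; \theta)
\]
```

An observed event contributes the density f(t_i), while a censored case contributes only the survival probability S(t_i), the statement that the event time exceeds t_i. The second form follows from f(t) = h(t) S(t).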
Censoring is the fundamental challenge that motivates the entire survival analysis framework. Once you understand it, the survival function, hazard function, and Cox model all follow naturally as tools for extracting information from data where some event times are observed and others are only bounded from below.