Statistical Modeling
Survival Analysis for Time-to-Event Modeling
Analysis of time-to-event data to estimate risk, compare groups and support prioritization decisions.
Dataset type: public. No confidential client or employer data.
Executive Summary
This case shows how time-to-event modeling can estimate risk over time, compare groups and handle censored observations without reducing the problem to a fixed-window classification task.
Business Question
Which groups have higher event risk over time, and when should an intervention or follow-up be prioritized?
Statistical Question / Hypothesis
The analysis tests whether survival curves differ between groups once censoring is accounted for, and whether hazard rates still differ after adjusting for relevant covariates.
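One common way to test for a difference between two survival curves under censoring is the log-rank test. The source does not name the test it uses, so the sketch below is an illustration only; the function name `logrank_test` and the toy data are hypothetical.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value)."""
    # Pool both groups, tagging each record with its group (0 = A, 1 = B).
    records = [(t, e, 0) for t, e in zip(times_a, events_a)]
    records += [(t, e, 1) for t, e in zip(times_b, events_b)]

    observed_a = expected_a = variance = 0.0
    for t in sorted({u for u, e, _ in records if e}):
        n = sum(1 for u, _, _ in records if u >= t)            # at risk, both groups
        n_a = sum(1 for u, _, g in records if u >= t and g == 0)
        d = sum(1 for u, e, _ in records if u == t and e)      # events at time t
        d_a = sum(1 for u, e, g in records if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A event count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("no time point has both groups at risk")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Toy data (made up for illustration): event flag 0 marks a censored record.
chi2, p_value = logrank_test(
    [5, 8, 12, 15, 20], [1, 1, 1, 0, 1],
    [2, 3, 6, 7, 9], [1, 1, 1, 0, 1],
)
```

The test compares curves without adjusting for covariates, so it addresses only the first half of the question.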
Dataset
The dataset is public and includes start dates, event indicators, event or censoring times and group-level covariates. Records without an observed event are retained as censored observations.
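As a minimal sketch of how such records might be turned into durations and event indicators, assume each row carries a start date and an optional event date, and that follow-up ends at a fixed cutoff. The field names and the `study_end` cutoff are hypothetical and not taken from the dataset.

```python
from datetime import date

def to_survival_record(start, event_date, study_end):
    """Return (duration_in_days, event_observed) for one record."""
    if event_date is not None and event_date <= study_end:
        return (event_date - start).days, 1
    # No event seen by the cutoff: keep the record, censored at study_end.
    return (study_end - start).days, 0

# Hypothetical rows; None means no event was recorded.
rows = [
    {"start": date(2023, 1, 1), "event_date": date(2023, 3, 2)},
    {"start": date(2023, 2, 1), "event_date": None},
]
study_end = date(2023, 12, 31)  # assumed administrative cutoff
prepared = [to_survival_record(r["start"], r["event_date"], study_end) for r in rows]
```

The second row stays in the data with an event indicator of 0 instead of being dropped or counted as a non-event.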
Methodology
Kaplan-Meier curves are used for non-parametric group comparison. Cox proportional hazards models estimate covariate-adjusted hazard ratios, with diagnostic checks of the proportional hazards assumption.
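The Kaplan-Meier estimator multiplies, at each distinct event time, the fraction of at-risk subjects who survive that time. The dependency-free sketch below only shows the mechanics; the function name and data are hypothetical, and a real analysis would more likely use an established survival library.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time."""
    survival = 1.0
    curve = []
    for t in sorted({d for d, e in zip(durations, events) if e}):
        # Censored subjects count as at risk up to their censoring time.
        at_risk = sum(1 for d in durations if d >= t)
        failed = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1 - failed / at_risk
        curve.append((t, survival))
    return curve

# Toy data: the subject with duration 7 is censored (event flag 0).
curve = kaplan_meier([2, 3, 6, 7, 9], [1, 1, 1, 0, 1])
```

The censored subject never causes a drop in the curve, but it does leave the risk set before the event at time 9, which is how the estimator uses partial follow-up.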
Implementation
Python and R are used to prepare event windows, calculate survival curves, fit Cox models, check assumptions and summarize risk differences for non-technical stakeholders.
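To illustrate what the Cox fitting step involves, the sketch below maximizes the partial likelihood for a single covariate with Newton-Raphson, using the Breslow convention for tied event times. It is a teaching sketch under those assumptions and not the project's code; the function names and toy data are hypothetical, and it leaves out the assumption checks a full analysis needs.

```python
import math

def score_and_information(rows, beta):
    """First derivative and negative second derivative of the log partial likelihood."""
    score = info = 0.0
    for t in sorted({u for u, e, _ in rows if e}):
        risk_set = [x for u, _, x in rows if u >= t]
        s0 = sum(math.exp(beta * x) for x in risk_set)
        s1 = sum(x * math.exp(beta * x) for x in risk_set)
        s2 = sum(x * x * math.exp(beta * x) for x in risk_set)
        failed = [x for u, e, x in rows if u == t and e]
        score += sum(failed) - len(failed) * s1 / s0
        info += len(failed) * (s2 / s0 - (s1 / s0) ** 2)
    return score, info

def fit_cox(durations, events, covariate, max_iter=50, tol=1e-10):
    """Return (beta, standard_error); exp(beta) is the hazard ratio."""
    rows = list(zip(durations, events, covariate))
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(rows, beta)
        step = score / info  # fails if the covariate never varies within a risk set
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(rows, beta)
    return beta, 1 / math.sqrt(info)

# Toy data: group 1 tends to fail earlier than group 0.
durations = [5, 8, 12, 15, 20, 2, 3, 6, 7, 9]
events = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]
group = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
beta, se = fit_cox(durations, events, group)
hazard_ratio = math.exp(beta)
ci_95 = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
```

With only ten records the interval is wide, which is one reason to report the interval alongside the hazard ratio when summarizing for non-technical readers.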
Results
The analysis reports median survival where estimable, survival probability at decision-relevant time points and hazard ratios with uncertainty intervals.
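The phrase "where estimable" matters because the median exists only if the survival curve drops to 0.5 or below during follow-up. The sketch below reads both quantities from a step curve of (time, survival probability) pairs; the helper names and curve values are hypothetical.

```python
def survival_at(curve, t):
    """Survival probability at time t, read from a right-continuous step curve."""
    prob = 1.0
    for time, surv in curve:
        if time > t:
            break
        prob = surv
    return prob

def median_survival(curve):
    """First time the curve reaches 0.5 or below, or None if it never does."""
    for time, surv in curve:
        if surv <= 0.5:
            return time
    return None

# Hypothetical curves for two groups.
high_risk = [(2, 0.8), (3, 0.6), (6, 0.4), (9, 0.0)]
low_risk = [(4, 0.9), (10, 0.75)]

high_risk_at_5 = survival_at(high_risk, 5)
high_risk_median = median_survival(high_risk)
low_risk_median = median_survival(low_risk)  # curve stays above 0.5
```

When the median cannot be estimated, as for the second curve, survival probability at a fixed decision-relevant time point is still available for comparison.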
Limitations
Limitations include informative censoring, unobserved confounders, non-proportional hazards and interpretation risk when event definitions are inconsistent.
Executive Recommendation
Use estimated time-dependent risk to prioritize actions earlier for high-risk groups and avoid fixed-window decisions that ignore censoring.
Tools Used
Python and R.
Links
Notebook, GitHub repository and executive PDF are coming soon.