Toward Conditional Distribution Calibration in Survival Prediction

Authors: Shi-ang Qi, Yakun Yu, Russell Greiner

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide asymptotic theoretical guarantees for both marginal and conditional calibration and test it extensively across 15 diverse real-world datasets, demonstrating the method's practical effectiveness and versatility in various settings.
Researcher Affiliation | Academia | Shi-ang Qi (1), Yakun Yu (2), Russell Greiner (1,3); (1) Computing Science, University of Alberta, Edmonton, Canada; (2) Electrical and Computer Engineering, University of Alberta, Edmonton, Canada; (3) Alberta Machine Intelligence Institute, Edmonton, Canada
Pseudocode | Yes | The pseudo-code for implementing the CiPOT process with censoring is outlined in Algorithm 1 in Appendix B.
Open Source Code | Yes | The implementation of the CiPOT method, the worst-slab distribution calibration score, and the code to reproduce all experiments in this section are available at https://github.com/shi-ang/MakeSurvivalCalibratedAgain.
Open Datasets | Yes | We use 15 datasets to test the effectiveness of our method. Table 3 in Appendix E.1 summarizes the dataset statistics, and Appendix E.1 also contains details of preprocessing steps, KM curves, and histograms of event/censor times.
Dataset Splits | Yes | We divided the data into a training set (90%) and a testing set (10%) using a stratified split to balance time t_i and censor indicator δ_i. We also reserved a balanced 10% validation subset from the training data for hyperparameter tuning and early stopping.
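For illustration, the split described above could be implemented roughly as follows. This is a minimal sketch, assuming a pandas DataFrame with columns "time" and "event"; the number of time bins, the random seed, and reading "10% validation" as 10% of the training data are assumptions, not details from the paper.

```python
# Minimal sketch of a 90/10 train/test split with a further 10% validation
# subset, stratified jointly on the censor indicator and a coarse binning
# of the observed times (the bin count and seed are illustrative).
import pandas as pd
from sklearn.model_selection import train_test_split

def stratified_survival_split(df, n_time_bins=4, seed=42):
    # Joint stratification label: event indicator + quantile-binned time,
    # so both are balanced across the splits.
    time_bin = pd.qcut(df["time"], q=n_time_bins, labels=False, duplicates="drop")
    strata = df["event"].astype(str) + "_" + time_bin.astype(str)

    # 90% training / 10% testing.
    train_df, test_df = train_test_split(
        df, test_size=0.10, stratify=strata, random_state=seed)

    # Reserve a balanced 10% validation subset from the training data.
    train_df, val_df = train_test_split(
        train_df, test_size=0.10, stratify=strata.loc[train_df.index],
        random_state=seed)
    return train_df, val_df, test_df
```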
Hardware Specification | No | NA
Software Dependencies | No | It is implemented in the lifelines package [59].
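As a point of reference for the lifelines dependency cited above, a minimal sketch of fitting a Kaplan-Meier curve with lifelines (as mentioned for the dataset KM curves) is shown below; the toy times and event indicators are assumptions and not tied to the paper's actual baseline implementation.

```python
# Minimal lifelines usage sketch: fit and print a Kaplan-Meier estimate.
# `times` and `events` are toy values for illustration only.
import numpy as np
from lifelines import KaplanMeierFitter

times = np.array([5.0, 8.2, 12.1, 3.4, 20.0])   # observed times
events = np.array([1, 0, 1, 1, 0])               # 1 = event, 0 = censored

kmf = KaplanMeierFitter()
kmf.fit(durations=times, event_observed=events, label="KM estimate")
print(kmf.survival_function_)                    # step-wise survival curve
```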
Experiment Setup | Yes | Full hyperparameter details for the NN-based survival baselines: in the experiments, all neural-network-based methods (N-MTLR, DeepSurv, DeepHit, CoxTime, and CQRNN) used the same architecture and optimization procedure. Maximum training epochs: 10000; early-stopping patience: 50; optimizer: Adam; batch size: 256; learning rate: 1e-3; learning rate scheduler: CosineAnnealingLR; minimum learning rate: 1e-6; weight decay: 0.1; NN architecture: [64, 64]; activation function: ReLU; dropout rate: 0.4. Full hyperparameter details for CSD and CiPOT: interpolation: {Linear, PCHIP}; extrapolation: Linear; monotonic method: {Ceiling, Flooring, Bootstrapping}; number of percentiles: {9, 19, 39, 49}; conformal set: {Validation set, Training set + Validation set}; repetition parameter: {3, 5, 10, 100, 1000}.
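The shared NN training setup listed above can be summarized as a configuration plus a PyTorch backbone. The sketch below is illustrative only: the helper name make_mlp, the input/output dimensions, and the choice of T_max are assumptions; only the hyperparameter values are taken from the row above.

```python
# Illustrative PyTorch sketch of the shared NN training setup listed above.
# Only the hyperparameter values come from the experiment setup; the helper
# name `make_mlp` and the toy dimensions are assumptions.
import torch
import torch.nn as nn

CONFIG = {
    "max_epochs": 10000, "early_stop_patience": 50, "batch_size": 256,
    "lr": 1e-3, "lr_min": 1e-6, "weight_decay": 0.1,
    "hidden_dims": [64, 64], "dropout": 0.4,
}

def make_mlp(in_dim, out_dim, cfg=CONFIG):
    # Two hidden layers of 64 units, each with ReLU and dropout 0.4.
    layers, prev = [], in_dim
    for h in cfg["hidden_dims"]:
        layers += [nn.Linear(prev, h), nn.ReLU(), nn.Dropout(cfg["dropout"])]
        prev = h
    layers.append(nn.Linear(prev, out_dim))
    return nn.Sequential(*layers)

model = make_mlp(in_dim=20, out_dim=1)            # toy dimensions
optimizer = torch.optim.Adam(model.parameters(), lr=CONFIG["lr"],
                             weight_decay=CONFIG["weight_decay"])
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=CONFIG["max_epochs"], eta_min=CONFIG["lr_min"])
```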