Conformalized Survival Distributions: A Generic Post-Process to Increase Calibration

Authors: Shi-Ang Qi, Yakun Yu, Russell Greiner

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide theoretical guarantees for the above claim, and rigorously validate the efficiency of our approach across 11 real-world datasets, showcasing its practical applicability and robustness in diverse scenarios. Section 4 presents the extensive empirical analysis across 11 real-world survival datasets and shows the effectiveness of CSD.
Researcher Affiliation | Academia | 1 Computing Science, University of Alberta, Edmonton, Canada; 2 Electrical and Computer Engineering, University of Alberta, Edmonton, Canada; 3 Alberta Machine Intelligence Institute, Edmonton, Canada.
Pseudocode | Yes | The pseudo-algorithm for computing the CSD with the KM-sampling process is presented at Algorithm 1 in Appendix D.2. (A hedged sketch of one possible KM-sampling step appears after this table.)
Open Source Code | Yes | A Python implementation of CSD is available online at https://github.com/shi-ang/CSD, along with code to replicate all experiments.
Open Datasets | Yes | The Veterans Administration Lung Cancer Trial (VALCT) dataset (Kalbfleisch & Prentice, 2011) is derived from a randomized trial comparing two treatment regimens for lung cancer... The Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset (Curtis et al., 2012) contains survival information for breast cancer patients... Surveillance, Epidemiology, and End Results (SEER) Program dataset (National Cancer Institute, DCCPS, Surveillance Research Program, 2015) is a comprehensive collection of data on cancer patients in the United States. (A loading example for one of these datasets appears after this table.)
Dataset Splits | Yes | We split each dataset into a training set (90%) and a testing set (10%) using a stratified splitting procedure that balances both the time ti and the censor indicator δi. For algorithms that require a validation set to tune hyperparameters or to early stop, we partition another balanced 10% validation set from the training set. (A sketch of such a balanced split appears after this table.)
Hardware Specification | No | The paper describes the software models and experimental setup, but it does not specify any particular hardware components such as CPU or GPU models, or detailed cloud computing specifications used for running the experiments.
Software Dependencies | No | The paper mentions several software packages, such as the lifelines package (Davidson-Pilon, 2024), the scikit-survival package (Pölsterl, 2020), torchmtlr, and the pycox package (Kvamme et al., 2019), but does not provide explicit version numbers for these libraries or for the Python programming language.
Experiment Setup | Yes | For the training process, we utilize Adam optimizer combined with an L2 penalty for weight decay to fine-tune the models. The learning parameters are set as follows: a learning rate of 0.001, a batch size of 256, and a dropout rate of 0.4. Additionally, we implement an early stopping mechanism across all deep learning models, which is based on performance validation using a separate validation dataset. (A sketch of this training configuration appears after this table.)
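
For the Pseudocode row: Algorithm 1 itself is only given in the paper's Appendix D.2, so the snippet below is merely a minimal sketch of what a single KM-sampling step could look like, assuming it draws a surrogate event time for a censored subject from a Kaplan-Meier curve conditioned on surviving past the censoring time. The helper name sample_surrogate_time and the use of lifelines' KaplanMeierFitter are illustrative assumptions, not the authors' implementation (the actual code is in the CSD repository linked above).

```python
import numpy as np
from lifelines import KaplanMeierFitter  # lifelines is among the paper's listed dependencies


def sample_surrogate_time(km: KaplanMeierFitter, censor_time: float,
                          rng: np.random.Generator) -> float:
    """Draw a surrogate event time t > censor_time from a fitted KM curve (hypothetical helper)."""
    # Survival probability at the censoring time, S(c).
    s_c = float(km.predict(censor_time))
    # Sample a survival level uniformly below S(c), i.e. condition on the
    # event happening after the censoring time.
    u = rng.uniform(0.0, s_c)
    timeline = km.survival_function_.index.values
    surv = km.survival_function_.iloc[:, 0].values
    # First time point where the (non-increasing) KM curve drops to the sampled level.
    idx = np.searchsorted(-surv, -u)
    idx = min(idx, len(timeline) - 1)
    return float(timeline[idx])
```

In use, one would fit the KaplanMeierFitter on the calibration split's (time, event) pairs and call the helper once per censored instance; how those surrogate times feed into the conformal adjustment is specified by Algorithm 1 itself.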
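
For the Open Datasets row: the quoted datasets are publicly available; as one illustration (not the authors' preprocessing pipeline), the VALCT data ships with scikit-survival, a package the paper already cites.

```python
from sksurv.datasets import load_veterans_lung_cancer

# X is a pandas DataFrame of covariates; y is a structured array whose fields
# hold the event indicator and the observed survival time.
X, y = load_veterans_lung_cancer()
print(X.shape, y.dtype.names)
```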
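
For the Dataset Splits row: the "balanced" 90/10 split can be approximated in scikit-learn by stratifying jointly on the censor indicator and quantile bins of the observed time. This is a plausible reconstruction rather than the authors' exact procedure; the column names time/event, the number of bins, and the seed are assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def balanced_split(df, time_col="time", event_col="event",
                   test_size=0.1, n_bins=4, seed=42):
    """Split stratified on both event status and binned observed time (a sketch)."""
    # Bin times into quantiles so each stratum combines event status and time range.
    time_bins = pd.qcut(df[time_col], q=n_bins, labels=False, duplicates="drop")
    strata = df[event_col].astype(int).astype(str) + "_" + time_bins.astype(str)
    return train_test_split(df, test_size=test_size, stratify=strata, random_state=seed)


# Example usage: 90/10 train/test split, then a further balanced 10% validation
# split carved out of the training portion, mirroring the description above.
# df_train, df_test = balanced_split(df)
# df_train, df_val = balanced_split(df_train, test_size=0.1)
```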
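
For the Experiment Setup row: the quoted hyperparameters map onto a standard PyTorch training configuration. The sketch below wires up Adam with an L2 penalty (the 1e-4 weight-decay coefficient and the early-stopping patience are assumptions), a dropout rate of 0.4, and a batch size of 256; the network, data, and loss are synthetic placeholders rather than the paper's survival models.

```python
import torch
import torch.nn as nn

# Synthetic stand-in data; the real experiments use the survival datasets above.
X, y = torch.randn(1024, 32), torch.randn(1024, 1)
X_val, y_val = torch.randn(128, 32), torch.randn(128, 1)

model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(),
    nn.Dropout(p=0.4),               # dropout rate reported in the paper
    nn.Linear(64, 1),
)
loss_fn = nn.MSELoss()               # placeholder loss, not a survival loss

# Adam with L2 weight decay; learning rate 0.001 as reported, 1e-4 decay assumed.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X, y), batch_size=256, shuffle=True)

# Early stopping on validation loss (patience value is an assumption).
best_val, patience, bad_epochs = float("inf"), 10, 0
for epoch in range(200):
    model.train()
    for xb, yb in loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(X_val), y_val).item()
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
```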