Differentiable sorting for censored time-to-event data.
Authors: Andre Vauvelle, Benjamin Wild, Roland Eils, Spiros Denaxas
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments reveal that Diffsurv outperforms established baselines in various simulated and real-world risk prediction scenarios. (Lines 13-14) Furthermore, we demonstrate the algorithmic advantages of Diffsurv by presenting a novel method for top-k risk prediction that surpasses current methods. (Lines 14-16) In our experiments, we aim to assess the performance of Diffsurv and compare it against the conventional Cox Partial Likelihood (CPL) methods. (Lines 245-246) (A minimal sketch of the CPL baseline appears after this table.) |
| Researcher Affiliation | Collaboration | Andre Vauvelle¹,², Benjamin Wild³, Roland Eils³, Spiros Denaxas¹ (¹University College London, ²BenevolentAI, ³Berlin Institute of Health); {andre.vauvelle.19,s.denaxas}@ucl.ac.uk, {benjamin.wild, roland.eils}@bih-charite.de |
| Pseudocode | No | The paper describes algorithms and methods in text and uses a diagram (Figure 1) to illustrate the process, but it does not contain structured pseudocode or algorithm blocks with numbered steps. |
| Open Source Code | Yes | Further details on the experimental setup, including compute time, are provided in Appendix B and at https://github.com/andre-vauvelle/diffsurv. (Lines 267-269) |
| Open Datasets | Yes | Semi-synthetic survSVHN: based on the Street View House Numbers (SVHN) dataset [Netzer et al., 2011] (Lines 269-270) We assess our methods on several public datasets: four small, popular real-world survival datasets (FLCHAIN, NWTCO, SUPPORT, METABRIC) [Kvamme et al., 2019] and the MIMIC IV Chest X-Ray dataset (CXR) with death as the event [Johnson et al., 2019]. (Lines 294-297) |
| Dataset Splits | Yes | Validation approach varies: for smaller datasets, we apply nested 5-fold cross-validation, while for imaging datasets we use train:val:test splits. (Lines 261-262) For survSVHN the train:val:test split is provided by Netzer et al. [2011] and is 230,755:5,000:13,068. (Lines 493-494) Finally, the train:val:test split of 8:1:1 is done at the patient level, ensuring no images from a patient in the test set were found in the training data. (Lines 486-487) (A hedged sketch of such a patient-level split appears after this table.) |
| Hardware Specification | Yes | In the most demanding case, the MIMIC IV CXR experiments, run on an 11GB NVIDIA GeForce GTX 1080 Ti, took roughly 18.5 hours per experiment. (Lines 520-521) |
| Software Dependencies | No | All neural network baselines were implemented using PyTorch and PyTorch Lightning. (Lines 523-524) The paper mentions the software used but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | We performed hyperparameter tuning for learning rate, weight decay, batch size, and risk set size. (Lines 263-264) Table 4 (hyperparameter values for small real-world datasets): learning rate ∈ [0.1, 0.01, 0.001, 1e-4]; weight decay ∈ [0.1, 0.01, 0.001, 1e-4, 1e-5, 0]; (batch size, risk set size) ∈ [(32, 8), (16, 16), (8, 32), (4, 64), (1, 256)]. (Lines 498-499) For imaging datasets, we fix the learning rate and weight decay for both CPL and Diffsurv: for both survSVHN and MIMIC IV CXR, we use a fixed learning rate of 1e-4 and weight decay of 1e-5. We also used early stopping with a patience of 20 epochs and a maximum of 100,000 training steps. (Lines 500-502) (The full grid is enumerated in the sketch after this table.) |
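
For context on the baseline named in the Research Type row: the Cox Partial Likelihood is the standard survival-analysis ranking objective that Diffsurv is compared against. Below is a minimal PyTorch sketch of its negative log-likelihood (Breslow form, no tie handling); the function name and the event-count normalization are our assumptions, not taken from the paper's repository, and this is the baseline, not Diffsurv's differentiable-sorting objective.

```python
import torch

def cox_partial_likelihood(risk_scores, times, events):
    """Negative Cox partial log-likelihood (Breslow form, no tie handling).

    risk_scores: (n,) predicted log-hazards
    times:       (n,) observed event or censoring times
    events:      (n,) 1 if the event was observed, 0 if censored
    """
    # Sort by time descending so each sample's risk set is exactly the
    # prefix of samples with times >= its own.
    order = torch.argsort(times, descending=True)
    scores, events = risk_scores[order], events[order].float()
    # Running log-sum-exp over the prefix = log hazard mass of each risk set.
    log_risk = torch.logcumsumexp(scores, dim=0)
    # Only uncensored samples contribute terms to the partial likelihood.
    loglik = (scores - log_risk) * events
    return -loglik.sum() / events.sum().clamp(min=1.0)
```

Diffsurv itself replaces this ranking objective with a differentiable-sorting relaxation over risk sets; see the paper and the linked repository for the actual implementation.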
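The patient-level 8:1:1 split described under Dataset Splits can be reproduced in spirit with scikit-learn's GroupShuffleSplit, grouping images by patient so no patient's images cross partitions. This is a hedged sketch under assumed inputs (`patient_ids` as a per-image array is a hypothetical name), not the authors' split code.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

def patient_level_split(n_images, patient_ids, seed=0):
    """8:1:1 train/val/test split with no patient in two partitions."""
    idx = np.arange(n_images)
    # Hold out ~10% of patients as the test set.
    outer = GroupShuffleSplit(n_splits=1, test_size=0.1, random_state=seed)
    trainval, test = next(outer.split(idx, groups=patient_ids))
    # Take 1/9 of the remaining patients for validation (~10% overall).
    inner = GroupShuffleSplit(n_splits=1, test_size=1 / 9, random_state=seed)
    tr, va = next(inner.split(trainval, groups=patient_ids[trainval]))
    return trainval[tr], trainval[va], test

# Toy usage: 1,000 images from 100 patients (10 images each).
patients = np.repeat(np.arange(100), 10)
train, val, test = patient_level_split(len(patients), patients)
assert not set(patients[train]) & set(patients[test])  # no patient leakage
```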
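Note that the Table 4 search space couples batch size and risk set size so every pair covers 256 samples per step. The sketch below enumerates the grid; only the value lists come from the paper, while the enumeration code and key names are illustrative.

```python
from itertools import product

# Value lists copied from Table 4; the enumeration itself is illustrative.
grid = {
    "learning_rate": [0.1, 0.01, 0.001, 1e-4],
    "weight_decay": [0.1, 0.01, 0.001, 1e-4, 1e-5, 0],
    # Batch size and risk set size are tied: every pair keeps
    # batch_size * risk_set_size = 256 samples per optimization step.
    "batch_and_risk_set": [(32, 8), (16, 16), (8, 32), (4, 64), (1, 256)],
}

configs = [
    {"lr": lr, "weight_decay": wd, "batch_size": bs, "risk_set_size": rs}
    for lr, wd, (bs, rs) in product(*grid.values())
]
print(len(configs))  # 4 * 6 * 5 = 120 candidate configurations
```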