A Hierarchical Approach to Multi-Event Survival Analysis

Authors: Donna Tjandra, Yifei He, Jenna Wiens (pp. 591-599)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed approach across several publicly available datasets in terms of both intra-event, inter-individual (global) and intra-individual, inter-event (local) consistency. We show that the proposed method consistently outperforms well-accepted and commonly used approaches to multi-event survival analysis. When estimating survival curves for Alzheimer's disease and mortality, our approach achieves a C-index of 0.91 (95% CI 0.88-0.93) and a local consistency score of 0.97 (95% CI 0.94-0.98) compared to a C-index of 0.75 (95% CI 0.70-0.80) and a local consistency score of 0.94 (95% CI 0.91-0.97) when modeling each event separately. Overall, our approach improves the accuracy of survival predictions by iteratively reducing the original task to a set of nested, simpler subtasks.
Researcher Affiliation | Academia | Donna Tjandra (1), Yifei He (1), Jenna Wiens (1). (1) Computer Science and Engineering, University of Michigan, Ann Arbor, MI, USA. {dotjandr, heyifei, wiensj}@umich.edu
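Pseudocode | Yes | Algorithm 1: Hierarchical prediction for event k. $\hat{p}_{km}$ is the predicted probability distribution at granularity m.
    Input: $\theta(x)$, learned representation from $\theta$
    Output: $\hat{p}_k$, estimated distribution of occurrence at the original time scale
    Hierarchical Prediction($\theta(x)$)
    1: $\hat{p}_{k1} \leftarrow \phi_{k1}(\theta(x))$    ▷ $\hat{P}(e\ T)$
    2: for m = 2 to M do
    3:     $\hat{p}_{km} \leftarrow \phi_{km}(\theta(x))$    ▷ conditional probabilities
    4:     for z = 1 to $T/b_m$ do
    5:         $z_{m-1} \leftarrow \lfloor (z-1)\, b_m / b_{m-1} \rfloor + 1$    ▷ time index
    6:         $\hat{p}_{km}[z] \leftarrow \hat{p}_{km}[z] \cdot \hat{p}_{k,m-1}[z_{m-1}]$    ▷ marginal
    7: $\hat{p}_k \leftarrow \hat{p}_{kM}$
    8: return $\hat{p}_k$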
Open Source Code | Yes | More details about the code (https://gitlab.eecs.umich.edu/mld3/hierarchical-survival-analysis) are given in the Supplement.
Open Datasets | Yes | We used four datasets. One was a multi-event synthetic dataset... The remainder were real, publicly available datasets from the health domain. ADNI: a publicly available dataset containing data on Alzheimer's disease (AD) (footnote 1: ...adni.loni.usc.edu). MIMIC-III: a publicly available dataset of electronic health record data (Johnson et al. 2016). SEER (https://seer.cancer.gov/data/): a publicly available dataset containing cancer incidence data from population-based registries.
Dataset Splits | Yes | For each dataset, we randomly split the data into 60/20/20% training/validation/test, and data from the same individual did not appear across splits.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | All models were trained in Python 3.6 and PyTorch (Paszke et al. 2017), using Adam (Kingma and Ba 2014). The paper specifies Python 3.6 but does not provide version numbers for PyTorch or the Adam optimizer.
Experiment Setup | No | Hyperparameters, including the learning rate, L2 regularization constant, and objective function scalars (e.g., α), were tuned using a random grid search, with a budget of 20. We used early stopping based on validation set performance, where we aimed to maximize the average of the proposed global and local consistencies. All network layers were initialized with Xavier initialization from a uniform distribution. The paper mentions that hyperparameters were tuned but does not provide the specific values used for these hyperparameters or other concrete training configurations.
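
The hierarchical prediction quoted in the Pseudocode row can be illustrated with a minimal Python sketch. This is not the authors' implementation (their code is in the GitLab repository listed above). The function name `hierarchical_prediction` and the arguments `level_outputs` and `bin_sizes` are hypothetical, indices are 0-based rather than 1-based, and the sketch assumes that the conditional probabilities of the fine bins inside each coarser bin sum to 1.

```python
from typing import List, Sequence


def hierarchical_prediction(
    level_outputs: Sequence[Sequence[float]],
    bin_sizes: Sequence[int],
) -> List[float]:
    """Turn per-granularity outputs into a distribution at the finest scale.

    level_outputs[0] is the distribution over the coarsest bins.
    level_outputs[m] (m >= 1) holds, for every bin at granularity m, the
    probability of that bin conditional on its parent bin at granularity m-1.
    bin_sizes[m] is the bin width b_m, with bin_sizes[m-1] a multiple of
    bin_sizes[m]. All names here are illustrative assumptions.
    """
    marginal = list(level_outputs[0])
    for m in range(1, len(level_outputs)):
        coarse_width, fine_width = bin_sizes[m - 1], bin_sizes[m]
        refined = []
        for z, conditional in enumerate(level_outputs[m]):
            # 0-based version of z_{m-1} = floor((z-1) * b_m / b_{m-1}) + 1
            parent = (z * fine_width) // coarse_width
            # conditional probability times the parent's marginal probability
            refined.append(conditional * marginal[parent])
        marginal = refined
    return marginal


# Horizon T = 4 with two coarse bins of width 2, refined to four bins of width 1.
coarse = [0.6, 0.4]
fine_given_coarse = [0.5, 0.5, 0.25, 0.75]
estimate = hierarchical_prediction([coarse, fine_given_coarse], [2, 1])
```

Under the stated assumption, each level's marginal distribution sums to the same total as the coarsest level, so the finest-scale estimate remains a valid distribution.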
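
The Dataset Splits row describes a random 60/20/20 split in which no individual appears in more than one split. One way to obtain that property is to shuffle individuals rather than rows, as in the sketch below. The helper name `split_by_individual`, its arguments, and the choice to apply the 60/20/20 fractions to individuals (rather than to rows) are assumptions, since the quoted text does not say how the split was implemented.

```python
import random
from typing import Dict, List, Sequence


def split_by_individual(
    individual_ids: Sequence[str],
    fractions: Sequence[float] = (0.6, 0.2, 0.2),
    seed: int = 0,
) -> Dict[str, List[int]]:
    """Assign row indices to train/val/test so each individual stays in one split."""
    unique_ids = sorted(set(individual_ids))
    random.Random(seed).shuffle(unique_ids)

    n_train = round(fractions[0] * len(unique_ids))
    n_val = round(fractions[1] * len(unique_ids))

    # Map every individual to exactly one split.
    assignment = {}
    for position, uid in enumerate(unique_ids):
        if position < n_train:
            assignment[uid] = "train"
        elif position < n_train + n_val:
            assignment[uid] = "val"
        else:
            assignment[uid] = "test"

    # Route each row to the split of the individual it belongs to.
    splits: Dict[str, List[int]] = {"train": [], "val": [], "test": []}
    for row, uid in enumerate(individual_ids):
        splits[assignment[uid]].append(row)
    return splits


# Toy data: 10 individuals with 3 rows each.
ids = ["p%d" % (i % 10) for i in range(30)]
splits = split_by_individual(ids)
```

Because the fractions apply to individuals, the row-level proportions can deviate from 60/20/20 when individuals contribute different numbers of rows.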
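
The tuning procedure in the Experiment Setup row (random search over a grid with a budget of 20, and early stopping on the average of global and local consistency) can be sketched as follows. The paper does not report the grid values or a patience setting, so `GRID`, `patience`, and both function names are hypothetical placeholders, not the authors' configuration.

```python
import itertools
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical search grid; the source does not report the actual values.
GRID = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "l2": [0.0, 1e-4, 1e-2],
    "alpha": [0.1, 0.5, 1.0],
}


def sample_configs(
    grid: Dict[str, list], budget: int = 20, seed: int = 0
) -> List[Dict[str, float]]:
    """Draw up to `budget` distinct configurations uniformly from the grid."""
    all_configs = [
        dict(zip(grid, values)) for values in itertools.product(*grid.values())
    ]
    return random.Random(seed).sample(all_configs, min(budget, len(all_configs)))


def best_epoch_with_early_stopping(
    val_scores: Sequence[Tuple[float, float]], patience: int = 5
) -> int:
    """Return the epoch with the best mean of (global, local) consistency.

    Scanning stops once `patience` epochs pass without improvement.
    """
    best_epoch, best_score = -1, float("-inf")
    for epoch, (global_score, local_score) in enumerate(val_scores):
        score = (global_score + local_score) / 2
        if score > best_score:
            best_epoch, best_score = epoch, score
        elif epoch - best_epoch >= patience:
            break
    return best_epoch


configs = sample_configs(GRID)
history = [(0.5, 0.5), (0.6, 0.6), (0.55, 0.6), (0.5, 0.5), (0.9, 0.9)]
stopped_early = best_epoch_with_early_stopping(history, patience=2)
ran_to_end = best_epoch_with_early_stopping(history, patience=5)
```

In the toy history, a patience of 2 stops before the late improvement at the final epoch, which shows how the unreported patience value can change which checkpoint is selected.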