Clustering Interval-Censored Time-Series for Disease Phenotyping

Authors: Irene Y. Chen, Rahul G. Krishnan, David Sontag (pp. 6211-6221)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "On synthetic data, we demonstrate accurate, stable, and interpretable results that outperform several benchmarks. On real-world clinical datasets of heart failure and Parkinson's disease patients, we study how interval censoring can adversely affect the task of disease phenotyping. Our model corrects for this source of error and recovers known clinical subtypes."
Researcher Affiliation | Academia | "Irene Y. Chen (MIT CSAIL and IMES), Rahul G. Krishnan (University of Toronto), David Sontag (MIT CSAIL and IMES); iychen@csail.mit.edu, rahulgk@cs.toronto.edu, dsontag@csail.mit.edu"
Pseudocode | Yes | "Figure 1(c) describes the graphical model, and Algorithm 1 depicts the pseudocode for this procedure."
Open Source Code | No | The paper provides links to open-source implementations of baseline methods (e.g., SuStaIn, PAGA, DTW, SPARTan) but does not state that code for its own proposed method, SubLign, is publicly available.
Open Datasets | Yes | "Parkinson's disease (PD): We use publicly-available data from the Parkinson's Progression Markers Initiative (PPMI), an observational clinical study, totalling Nt = 423 PD patients and Nc = 196 healthy controls, where N = Nt + Nc."
Dataset Splits | Yes | "We evaluate models on 5 trials, each with a different randomized data split and random seed. For each trial, we learn on a training set (60%), find the best performance across all hyperparameters on the validation set (20%), and report the performance metrics on the held-out test set (20%)."
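The split protocol quoted above can be sketched as follows. This is a minimal illustration, not the authors' code: the dataset size (619, the PD cohort total Nt + Nc) and the function name `split_indices` are assumptions for the example.

```python
import numpy as np

def split_indices(n, seed, frac_train=0.6, frac_val=0.2):
    """One trial's randomized 60/20/20 train/val/test split of n samples."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(frac_train * n)
    n_val = int(frac_val * n)
    # Remaining samples form the held-out test set.
    return (perm[:n_train],
            perm[n_train:n_train + n_val],
            perm[n_train + n_val:])

# 5 trials, each with a different seed and therefore a different split.
splits = [split_indices(619, seed) for seed in range(5)]
```

Each trial's model selection happens on the validation indices; only the test indices produce the reported metrics.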
Hardware Specification | Yes | "Our models are implemented in Python 3.7 using PyTorch (Paszke et al. 2019) and are learned via Adam (Kingma and Ba 2014) on a single NVIDIA K80 GPU for 1000 epochs."
Software Dependencies | No | The paper specifies Python 3.7 and PyTorch, but it gives no version numbers for PyTorch or for other key libraries and solvers, which prevents exact reconstruction of the software environment.
Experiment Setup | Yes | "We find optimal hyperparameters via grid search. For both synthetic and clinical experiments, we search over hyperparameters including dimensions of the latent space z (2, 5, 10), the number of hidden units in the RNN (50, 100, 200), the number of hidden units in the multi-layer perceptron (50, 100, 200), the learning rate (0.001, 0.01, 0.1, 1.), regularization parameter (0., 0.1, 1.), and regularization type (L1, L2). We set alignment extrema δ+ = 10 based on the maximum of the synthetic dataset and the maxima of the HF and PD datasets. We search over 50 time steps with ϵ = 0.1. For all models, we run for 1000 epochs..."
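The grid search the quote describes can be sketched as below. The grid values come from the paper; `train_and_validate` is a hypothetical stand-in for training SubLign under one configuration and returning its validation score (higher is better), and the parameter names are illustrative.

```python
import itertools

# Hyperparameter grid as reported in the paper (648 combinations total).
GRID = {
    "latent_dim": [2, 5, 10],
    "rnn_hidden": [50, 100, 200],
    "mlp_hidden": [50, 100, 200],
    "lr": [0.001, 0.01, 0.1, 1.0],
    "reg_weight": [0.0, 0.1, 1.0],
    "reg_type": ["L1", "L2"],
}

def grid_search(train_and_validate, grid=GRID):
    """Exhaustively evaluate every configuration; keep the best one."""
    best_score, best_cfg = float("-inf"), None
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = train_and_validate(cfg)  # validation-set metric
        if score > best_score:
            best_score, best_cfg = score, cfg
    return best_cfg, best_score
```

Per the quoted protocol, the winning configuration is then scored once on the held-out test set.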