Clustering Interval-Censored Time-Series for Disease Phenotyping
Authors: Irene Y. Chen, Rahul G. Krishnan, David Sontag
AAAI 2022, pp. 6211-6221
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On synthetic data, we demonstrate accurate, stable, and interpretable results that outperform several benchmarks. On real-world clinical datasets of heart failure and Parkinson's disease patients, we study how interval censoring can adversely affect the task of disease phenotyping. Our model corrects for this source of error and recovers known clinical subtypes. |
| Researcher Affiliation | Academia | Irene Y. Chen1, Rahul G. Krishnan2, David Sontag1 1MIT CSAIL and IMES 2University of Toronto iychen@csail.mit.edu, rahulgk@cs.toronto.edu, dsontag@csail.mit.edu |
| Pseudocode | Yes | Figure 1(c) describes the graphical model, and Algorithm 1 depicts the pseudocode for this procedure. |
| Open Source Code | No | The paper provides links to open-source implementations of *baseline* methods (e.g., SuStaIn, PAGA, DTW, SPARTan) but does not state the availability of, or provide access to, open-source code for its own proposed method, SubLign. |
| Open Datasets | Yes | Parkinson's disease (PD): We use publicly-available data from the Parkinson's Progression Markers Initiative (PPMI), an observational clinical study, totalling Nt = 423 PD patients and Nc = 196 healthy controls where N = Nt + Nc. |
| Dataset Splits | Yes | We evaluate models on 5 trials, each with a different randomized data split and random seed. For each trial, we learn on a training set (60%), find the best performance across all hyperparameters on the validation set (20%), and report the performance metrics on the held-out test set (20%). |
| Hardware Specification | Yes | Our models are implemented in Python 3.7 using PyTorch (Paszke et al. 2019) and are learned via Adam (Kingma and Ba 2014) on a single NVIDIA K80 GPU for 1000 epochs. |
| Software Dependencies | No | The paper mentions 'Python 3.7' and 'PyTorch' but does not give version numbers for PyTorch or the other key libraries/solvers needed for reproduction; only the Python interpreter itself is pinned, which prevents fully reconstructing the software environment. |
| Experiment Setup | Yes | We find optimal hyperparameters via grid search. For both synthetic and clinical experiments, we search over hyperparameters including dimensions of the latent space z (2, 5, 10), the number of hidden units in the RNN (50, 100, 200), the number of hidden units in the multi-layer perceptron (50, 100, 200), the learning rate (0.001, 0.01, 0.1, 1.), regularization parameter (0., 0.1, 1.), and regularization type (L1, L2). We set alignment extrema δ+ = 10 based on the maximum of the synthetic dataset and the maxima of the HF and PD datasets. We search over 50 time steps with ϵ = 0.1. For all models, we run for 1000 epochs... |
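The evaluation protocol quoted in the table above (per-trial randomized 60/20/20 splits, then a grid search over the listed hyperparameters on the validation set) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function names and the `evaluate` callback are hypothetical, and only the split fractions and grid values come from the paper.

```python
import itertools
import random

def split_indices(n, seed, fractions=(0.6, 0.2, 0.2)):
    """Randomized train/validation/test split for one trial (60/20/20 in the paper)."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Hyperparameter grid reported in the Experiment Setup row.
GRID = {
    "latent_dim": [2, 5, 10],
    "rnn_hidden": [50, 100, 200],
    "mlp_hidden": [50, 100, 200],
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
    "reg_weight": [0.0, 0.1, 1.0],
    "reg_type": ["l1", "l2"],
}

def grid_search(evaluate, grid=GRID):
    """Exhaustively evaluate every configuration; return the best by validation score.

    `evaluate` is a hypothetical callback: it trains on the training split and
    returns a validation-set score (higher is better) for one configuration.
    """
    best_cfg, best_score = None, float("-inf")
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

With the grid above there are 3 × 3 × 3 × 4 × 3 × 2 = 648 configurations per trial; the paper repeats this over 5 trials with different seeds and reports test-set metrics for the best validation configuration.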