Gradient Importance Learning for Incomplete Observations

Authors: Qitong Gao, Dong Wang, Joshua David Amason, Siyang Yuan, Chenyang Tao, Ricardo Henao, Majda Hadziahmetovic, Lawrence Carin, Miroslav Pajic

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test the approach on real-world time-series (i.e., MIMIC-III), tabular data obtained from an eye clinic, and a standard dataset (i.e., MNIST), where our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
Researcher Affiliation | Academia | Duke University, USA; King Abdullah University of Science and Technology, Saudi Arabia.
Pseudocode | Yes | Algorithm 1: Gradient Importance Learning (GIL).
Open Source Code | Yes | Code available at https://github.com/gaoqitong/gradient-importance-learning.
Open Datasets | Yes | The datasets we use include i) MIMIC-III (Johnson et al., 2016), which consists of real-world EHRs obtained from intensive care units (ICUs), ii) a de-identified ophthalmic patient dataset obtained from an eye center in North America, and iii) the hand-written digits dataset MNIST (LeCun & Cortes). We also tested on a smaller-scale ICU time-series dataset from the 2012 PhysioNet challenge (Silva et al., 2012); these results can be found in Appendix D.4.
Dataset Splits | Yes | All patients selected following the above procedure are split 8:2 to form the training and testing datasets... We split all the subjects into a training cohort and a testing cohort following a ratio of 9:1.
Hardware Specification | Yes | The case studies are run on a workstation with three Nvidia Quadro RTX 6000 GPUs, each with 24 GB of memory.
Software Dependencies | No | The paper states "We use Tensorflow to implement the models and training algorithms." but does not provide a specific version number for TensorFlow or any other software dependency.
Experiment Setup | Yes | To train the imputation-free prediction models using GIL, we perform a grid search for the model learning rate α over {0.001, 0.0007, 0.0005, 0.0003, 0.0001, 0.00005, 0.00001}; the exponential decay step for α is selected from {1000, 750, 500} and the exponential decay rate for α is selected from {0.95, 0.9, 0.85, 0.8}. The actor πθ and critic Qν in GIL (i.e., Alg. 1) are trained using deep deterministic policy gradient (DDPG) (Lillicrap et al., 2015) with discount factor γ = 0.99. ... The Adam optimizer is used to train all the prediction models for the baselines... All models are trained with a batch size of 128.
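The Experiment Setup row describes a grid search over the learning rate α, its exponential decay step, and its decay rate. The sketch below enumerates that grid in plain Python; the function name `decayed_lr` is illustrative, and the staircase decay formula follows TensorFlow's `exponential_decay` convention (whether the paper uses staircase decay is an assumption, not stated in the quote).

```python
import itertools

# Grid values quoted from the paper's experiment setup.
LEARNING_RATES = [0.001, 0.0007, 0.0005, 0.0003, 0.0001, 0.00005, 0.00001]
DECAY_STEPS = [1000, 750, 500]
DECAY_RATES = [0.95, 0.9, 0.85, 0.8]

def decayed_lr(alpha, decay_steps, decay_rate, step, staircase=True):
    """Exponentially decayed learning rate: alpha * rate^(step/decay_steps).

    With staircase=True the exponent is the integer number of completed
    decay periods, matching TensorFlow's staircase exponential decay.
    """
    exponent = step // decay_steps if staircase else step / decay_steps
    return alpha * decay_rate ** exponent

# All 7 * 3 * 4 = 84 configurations the grid search would visit.
grid = list(itertools.product(LEARNING_RATES, DECAY_STEPS, DECAY_RATES))
print(len(grid))  # 84

# Example: learning rate for one configuration after 2000 training steps
# (two full decay periods of 1000 steps at rate 0.9).
print(decayed_lr(0.001, 1000, 0.9, 2000))
```

Each of the 84 configurations would then train a GIL model (batch size 128, Adam for the baselines, DDPG with γ = 0.99 for the actor–critic pair) and the best-performing configuration would be kept; that selection criterion is not specified in the quoted text.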