Gradient Importance Learning for Incomplete Observations
Authors: Qitong Gao, Dong Wang, Joshua David Amason, Siyang Yuan, Chenyang Tao, Ricardo Henao, Majda Hadziahmetovic, Lawrence Carin, Miroslav Pajic
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test the approach on real-world time-series (i.e., MIMIC-III), tabular data obtained from an eye clinic, and a standard dataset (i.e., MNIST), where our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods. |
| Researcher Affiliation | Academia | Duke University, USA. King Abdullah University of Science and Technology, Saudi Arabia. |
| Pseudocode | Yes | Algorithm 1 Gradient Importance Learning (GIL). |
| Open Source Code | Yes | Code available at https://github.com/gaoqitong/gradient-importance-learning. |
| Open Datasets | Yes | The datasets we use include i) MIMIC-III (Johnson et al., 2016) that consists of real-world EHRs obtained from intensive care units (ICUs), ii) a de-identified ophthalmic patient dataset obtained from an eye center in North America, and iii) hand-written digits MNIST (Le Cun & Cortes). We also tested on a smaller-scale ICU time-series dataset from the 2012 PhysioNet challenge (Silva et al., 2012); these results can be found in Appendix D.4. |
| Dataset Splits | Yes | All patients selected following the above procedure are split 8:2 to form the training and testing datasets... We split all the subjects into a training cohort and a testing cohort following a ratio of 9:1. (A patient-level split sketch appears below the table.) |
| Hardware Specification | Yes | The case studies are run on a workstation with three Nvidia Quadro RTX 6000 GPUs, each with 24 GB of memory. |
| Software Dependencies | No | The paper states "We use Tensorflow to implement the models and training algorithms" but does not provide a version number for TensorFlow or any other software dependency. |
| Experiment Setup | Yes | To train the imputation-free prediction models using GIL, we perform a grid search over the model learning rate α ∈ {0.001, 0.0007, 0.0005, 0.0003, 0.0001, 0.00005, 0.00001}; the exponential decay step for α is selected from {1000, 750, 500} and the exponential decay rate for α from {0.95, 0.9, 0.85, 0.8}. The actor πθ and critic Qν in GIL (i.e., Alg. 1) are trained using deep deterministic policy gradient (DDPG; Lillicrap et al., 2015) with discounting factor γ = 0.99. ... The Adam optimizer is used to train all the prediction models for the baselines... All the models are trained using a batch size of 128. (See the training-configuration sketch below the table.) |
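
The reported splits are patient-level (8:2 for MIMIC-III, 9:1 for the ophthalmic cohort). The following is a minimal sketch of such a split under assumed conventions (shuffled patient IDs, a fixed seed); the function name and defaults are illustrative, not the authors' code.

```python
# Hypothetical sketch of a patient-level train/test split (not the
# authors' code): patient IDs are shuffled, then split 8:2 as
# reported for MIMIC-III; the ophthalmic cohort uses 9:1 instead.
import numpy as np

def split_patients(patient_ids, train_frac=0.8, seed=0):
    """Shuffle patient IDs and split them train_frac : (1 - train_frac)."""
    rng = np.random.default_rng(seed)
    ids = np.array(list(patient_ids))
    rng.shuffle(ids)
    n_train = int(train_frac * len(ids))
    return ids[:n_train], ids[n_train:]

train_ids, test_ids = split_patients(range(1000), train_frac=0.8)
```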
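The experiment-setup row can also be read as a TensorFlow configuration: an exponentially decayed learning rate, the Adam optimizer, and a batch size of 128. The sketch below instantiates one point from each reported search grid; it is an illustration of the stated hyperparameters, not the released implementation (see the linked repository for the actual code).

```python
# Sketch of the reported training configuration (values chosen from
# the paper's search grids for illustration; not the released code).
import tensorflow as tf

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.0005,  # grid: {0.001, 0.0007, ..., 0.00001}
    decay_steps=1000,              # grid: {1000, 750, 500}
    decay_rate=0.95,               # grid: {0.95, 0.9, 0.85, 0.8}
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)

BATCH_SIZE = 128  # batch size reported for all prediction models
GAMMA = 0.99      # DDPG discount factor for the actor-critic in GIL
```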