Deep Contextual Clinical Prediction with Reverse Distillation
Authors: Rohan Kodialam, Rebecca Boiarsky, Justin Lim, Aditya Sai, Neil Dixit, David Sontag
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | SARD outperforms state-of-the-art methods on multiple clinical prediction outcomes, with ablation studies revealing that reverse distillation is a primary driver of these improvements. |
| Researcher Affiliation | Collaboration | ¹MIT CSAIL & IMES, ²Independence Blue Cross |
| Pseudocode | No | No pseudocode or algorithm blocks are provided; the architecture is illustrated in Figure 1. |
| Open Source Code | Yes | Code is available at https://github.com/clinicalml/omop-learn. |
| Open Datasets | No | OMOP provides a normalized concept vocabulary, and although our dataset is not public, hundreds of health institutions with data in an OMOP CDM can use our code out-of-the-box to reproduce results on local datasets. |
| Dataset Splits | Yes | We split the 121,593 patients into training, validation, and test sets of size 82,955, 19,319, and 19,319 respectively. |
| Hardware Specification | Yes | We train using a single NVIDIA K80 GPU. |
| Software Dependencies | No | Our algorithms are implemented in Python 3.6 and use the PyTorch autograd library (Paszke et al. 2019). |
| Experiment Setup | Yes | We train our deep models using an ADAM optimizer (Kingma and Ba 2014) with the hyperparameter settings of β1 = 0.9, β2 = 0.98, ε = 10^-9, and a learning rate of η = 2 × 10^-4. A batch size of 500 patients was used for ADAM updates. SARD models are trained with d_e = 300 and K = 10; we found that validation performance did not increase with larger embedding sizes or numbers of convolutional kernels. We apply dropout with probability ρ_d = 0.05 after each self-attention block to prevent overfitting. |
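
The dataset-splits row reports an 82,955 / 19,319 / 19,319 patient-level split of 121,593 patients. The paper does not describe its splitting code, so the sketch below is only illustrative: scikit-learn's `train_test_split` and the placeholder patient IDs are assumptions.

```python
# Hypothetical sketch of an 82,955 / 19,319 / 19,319 patient split.
# The paper does not specify its splitting utility; scikit-learn and the
# placeholder patient IDs below are assumptions for illustration only.
from sklearn.model_selection import train_test_split

patient_ids = list(range(121_593))  # stand-in for real OMOP patient IDs

# Hold out 38,638 patients, then divide them evenly into validation and test sets.
train_ids, holdout_ids = train_test_split(patient_ids, test_size=38_638, random_state=0)
val_ids, test_ids = train_test_split(holdout_ids, test_size=19_319, random_state=0)

assert (len(train_ids), len(val_ids), len(test_ids)) == (82_955, 19_319, 19_319)
```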
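The experiment-setup row likewise maps onto a standard PyTorch configuration. The sketch below is a minimal illustration, not the authors' code: the stand-in model (a single self-attention block with d_model = 300) and its number of attention heads are assumptions; only the optimizer settings, batch size, and dropout probability come from the reported setup.

```python
# Minimal PyTorch sketch of the reported optimizer and dropout settings.
# The model is a hypothetical stand-in (one self-attention block with d_model = 300);
# the number of attention heads is an assumption, not taken from the paper.
import torch

model = torch.nn.TransformerEncoderLayer(
    d_model=300,       # matches the reported embedding size d_e = 300
    nhead=2,           # assumption for illustration only
    dropout=0.05,      # dropout applied after the self-attention block
    batch_first=True,
)

optimizer = torch.optim.Adam(
    model.parameters(),
    lr=2e-4,            # learning rate η = 2 × 10^-4
    betas=(0.9, 0.98),  # β1 = 0.9, β2 = 0.98
    eps=1e-9,           # ε = 10^-9
)
# Per the reported setup, each ADAM update would use a batch of 500 patients.
```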