A Flexible Generative Model for Heterogeneous Tabular EHR with Missing Modality

Authors: Huan He, William Hao, Yuanzhe Xi, Yong Chen, Bradley Malin, Joyce Ho

ICLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We empirically show that our model consistently outperforms existing state-of-the-art synthetic EHR generation methods both in fidelity by up to 3.10% and utility by up to 7.16%. Additionally, we show that our method can be successfully used in privacy-sensitive settings, where the original patient-level data cannot be shared.
Researcher Affiliation Academia Huan He, Department of Biostatistics, University of Pennsylvania, huan.he@pennmedicine.upenn.edu; William Hao, Department of Computer Science, Emory University, william.hao@emory.edu; Yuanzhe Xi, Department of Mathematics, Emory University, yuanzhe.xi@emory.edu; Yong Chen, Department of Biostatistics, University of Pennsylvania, ychen123@upenn.edu; Bradley Malin, Department of Biomedical Informatics, Vanderbilt University, b.malin@vumc.org; Joyce C Ho, Department of Biostatistics, Emory University, joyce.c.ho@emory.edu
Pseudocode Yes A.4 ALGORITHM OF FLEXGEN-EHR Algorithm 1: Training of FLEXGEN-EHR
Open Source Code No The paper states that codes for *baseline models* are available online (with links provided), but does not provide an explicit statement or link for the source code of FLEXGEN-EHR itself.
Open Datasets Yes We use two real-world de-identified EHR datasets, MIMIC-III (Johnson et al., 2016) and eICU (Pollard et al., 2018).
Dataset Splits No The paper does not provide specific percentages or methodology for train/validation/test splits, nor does it explicitly mention a validation set. It mentions using 'test datasets' but not the splitting strategy.
Hardware Specification Yes For training the models, we used Adam (Kingma & Ba, 2015) with the learning rate set to 0.001, and a mini-batch of 128 on a machine equipped with one Nvidia GeForce RTX 3090 and CUDA 11.2.
Software Dependencies Yes We implemented FLEXGEN-EHR with PyTorch. For training the models, we used Adam (Kingma & Ba, 2015) with the learning rate set to 0.001, and a mini-batch of 128 on a machine equipped with one Nvidia GeForce RTX 3090 and CUDA 11.2.
Experiment Setup Yes For training the models, we used Adam (Kingma & Ba, 2015) with the learning rate set to 0.001, and a mini-batch of 128... Hyperparameters of FLEXGEN-EHR are selected after grid search. We use a timestep of 50 and a noise scheduling β from 1×10⁻⁴ to 1×10⁻².
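The reported setup (50 timesteps, noise schedule β from 1×10⁻⁴ to 1×10⁻²) matches a standard diffusion noise schedule. The sketch below illustrates such a schedule; the linear spacing of β and the toy data are assumptions for illustration, since the excerpt states only the endpoints and the number of steps.

```python
import numpy as np

# Noise schedule with the reported endpoints and step count.
# Assumption: linear spacing between the endpoints (the excerpt
# does not state the schedule type).
T = 50                                # number of diffusion timesteps
beta = np.linspace(1e-4, 1e-2, T)     # per-step noise levels beta_t
alpha = 1.0 - beta
alpha_bar = np.cumprod(alpha)         # cumulative signal retention \bar{alpha}_t

# Forward process q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I):
# noise a toy clean vector x0 at an arbitrary timestep t.
rng = np.random.default_rng(0)
x0 = rng.standard_normal(8)           # hypothetical latent vector
t = 25
x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * rng.standard_normal(8)

print(beta[0], beta[-1], alpha_bar[-1])
```

With a linear schedule, `alpha_bar` decays monotonically from near 1 toward 0, so later timesteps carry progressively less of the original signal.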