Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Variational Disentanglement for Rare Event Modeling
Authors: Zidi Xiu, Chenyang Tao, Michael Gao, Connor Davis, Benjamin A. Goldstein, Ricardo Henao
AAAI 2021, pp. 10469–10477 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Results on synthetic studies and diverse real-world datasets, including mortality prediction on a COVID-19 cohort, demonstrate that the proposed approach outperforms existing alternatives. |
| Researcher Affiliation | Academia | 1 Duke University 2 Duke Institute for Health Innovation EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Variational Inference with Extremals. |
| Open Source Code | Yes | Our implementation is based on PyTorch, and code to replicate our experiments is available from https://github.com/ZidiXiu/VIE/. |
| Open Datasets | Yes | To this end, we synthesize a semi-synthetic dataset based on the Framingham study (Mitchell et al. 2010), a long-term cardiovascular survival cohort study... (ii) InP (O'Brien et al. 2020): In-patient data from DUHS... (iii) SEER (Ries et al. 2007): A public dataset studying cancer survival among adults, curated by the U.S. Surveillance, Epidemiology, and End Results (SEER) Program... |
| Dataset Splits | Yes | Datasets have been randomly split into training, validation, and testing datasets with ratio 6:2:2. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as CPU or GPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper states, "Our implementation is based on PyTorch," but does not specify its version number or any other software dependencies with their respective versions. |
| Experiment Setup | Yes | Datasets have been randomly split into training, validation, and testing datasets with ratio 6:2:2... In simulation studies, we repeat simulation ten times to obtain empirical AUC and AUPRC confidence intervals. For real world datasets, we applied bootstrapping to estimate the confidence intervals... For detailed settings please refer to the SM. |
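The evaluation protocol quoted above (a random 6:2:2 train/validation/test split, with bootstrapped confidence intervals for AUC on the real-world datasets) can be sketched as follows. This is an illustrative NumPy-only sketch, not the authors' code: the function names (`split_622`, `bootstrap_ci`), the seed handling, and the rank-based AUC (which ignores score ties for brevity) are all assumptions.

```python
import numpy as np

def split_622(n, seed=0):
    """Randomly split n sample indices into train/val/test with ratio 6:2:2."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train, n_val = int(0.6 * n), int(0.2 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def auc(y_true, y_score):
    """Rank-based AUC (Mann-Whitney U statistic); ties in y_score not averaged."""
    order = np.argsort(y_score)
    ranks = np.empty(len(y_score))
    ranks[order] = np.arange(1, len(y_score) + 1)
    pos = y_true == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_ci(y_true, y_score, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for AUC on held-out predictions."""
    rng = np.random.default_rng(seed)
    n, stats = len(y_true), []
    for _ in range(n_boot):
        b = rng.integers(0, n, n)            # resample with replacement
        if y_true[b].min() == y_true[b].max():
            continue                          # skip resamples missing a class
        stats.append(auc(y_true[b], y_score[b]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```

The same bootstrap loop would apply to AUPRC; for the simulation studies, the paper instead repeats the whole simulation ten times and takes empirical intervals over runs.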