CONAN: Complementary Pattern Augmentation for Rare Disease Detection

Authors: Limeng Cui, Siddharth Biswal, Lucas M. Glass, Greg Lever, Jimeng Sun, Cao Xiao

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated CONAN on two disease detection tasks. For low-prevalence inflammatory bowel disease (IBD) detection, CONAN achieved 0.96 precision-recall area under the curve (PR-AUC), a 50.1% relative improvement over the best baseline. For rare disease idiopathic pulmonary fibrosis (IPF) detection, CONAN achieved 0.22 PR-AUC, a 41.3% relative improvement over the best baseline. (See the PR-AUC sketch after this table.)
Researcher Affiliation | Collaboration | 1. Analytic Center of Excellence, IQVIA, Cambridge, MA, USA; 2. College of Information Sciences and Technology, The Pennsylvania State University, PA, USA; 3. College of Computing, Georgia Institute of Technology, Atlanta, GA, USA
Pseudocode | Yes | Algorithm 1: CONAN for Rare Disease Detection.
Open Source Code | Yes | We implement all models with Keras. Code: https://github.com/cuilimeng/CONAN
Open Datasets | No | We leverage data from IQVIA longitudinal prescription (Rx) and medical claims (Dx) databases, which include hundreds of millions of patients' clinical records.
Dataset Splits | No | We sample two imbalanced training sets for each dataset, with positive-sample ratios of 10% and 1%. For the testing set, we extract the data using the actual disease prevalence rate shown in Table 2. (See the subsampling sketch after this table.)
Hardware Specification | Yes | All methods are trained on an Ubuntu 16.04 machine with 128 GB of memory and an Nvidia Tesla P100 GPU.
Software Dependencies | No | We implement all models with Keras. The paper mentions Keras but does not provide a specific version number (e.g., Keras 2.x.x).
Experiment Setup | Yes | We set the patient embedding dimension to 128. For the complementary GAN... the complementary GAN is trained for 1000 epochs. For all models, we use RMSProp (Hinton, Srivastava, and Swersky 2012) with minibatches of 512 patients and train for 30 epochs. For a fair comparison, we use focal loss (with γ = 2 and α = 0.25) and set the output dimension to 128 for all models. (See the Keras configuration sketch below.)
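
As context for the PR-AUC figures in the Research Type row, here is a minimal sketch of how that metric is commonly computed on an imbalanced test set. The use of scikit-learn's average_precision_score and the toy data are illustrative assumptions; the paper does not say which PR-AUC implementation it used.

```python
# Illustrative PR-AUC computation on a highly imbalanced toy test set;
# average_precision_score is one common PR-AUC estimator (assumption:
# not necessarily the paper's evaluation code).
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
n = 10_000
y_true = (rng.random(n) < 0.005).astype(int)                   # ~0.5% prevalence, rare-disease-like
y_score = 0.3 * rng.random(n) + 0.7 * y_true * rng.random(n)   # toy classifier scores

print(f"PR-AUC: {average_precision_score(y_true, y_score):.3f}")
```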
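For the Dataset Splits row, a minimal sketch of subsampling a training set to a target positive ratio (10% or 1%, as reported). The function and indices are hypothetical, since the paper does not release its sampling code.

```python
# Hypothetical subsampling to a target positive ratio; not the authors' code.
import numpy as np

def sample_imbalanced(pos_idx, neg_idx, pos_ratio, rng):
    """Keep all positives, then draw negatives so positives make up pos_ratio."""
    n_pos = len(pos_idx)
    n_neg = int(n_pos * (1 - pos_ratio) / pos_ratio)  # negatives needed for the ratio
    neg_sample = rng.choice(neg_idx, size=min(n_neg, len(neg_idx)), replace=False)
    return np.concatenate([pos_idx, neg_sample])

rng = np.random.default_rng(0)
pos_idx = np.arange(1_000)               # toy positive patient indices
neg_idx = np.arange(1_000, 200_000)      # toy negative patient indices
train_10 = sample_imbalanced(pos_idx, neg_idx, 0.10, rng)  # 10% positives
train_01 = sample_imbalanced(pos_idx, neg_idx, 0.01, rng)  # 1% positives
```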
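For the Experiment Setup row, a minimal Keras sketch wiring together the reported training configuration (RMSProp, minibatches of 512, 30 epochs, focal loss with γ = 2 and α = 0.25, 128-dimensional outputs). The two-layer model body is a placeholder, not the CONAN architecture.

```python
# Hypothetical Keras training config matching the reported hyperparameters;
# the model below is a stand-in, not CONAN itself.
import tensorflow as tf
from tensorflow import keras

def focal_loss(gamma=2.0, alpha=0.25):
    """Binary focal loss (Lin et al. 2017): down-weights easy examples."""
    def loss(y_true, y_pred):
        y_true = tf.cast(y_true, y_pred.dtype)
        eps = keras.backend.epsilon()
        y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
        p_t = y_true * y_pred + (1 - y_true) * (1 - y_pred)          # prob of the true class
        alpha_t = y_true * alpha + (1 - y_true) * (1 - alpha)        # class weighting
        return -tf.reduce_mean(alpha_t * tf.pow(1.0 - p_t, gamma) * tf.math.log(p_t))
    return loss

model = keras.Sequential([
    keras.layers.Input(shape=(128,)),                 # 128-dim patient embeddings
    keras.layers.Dense(128, activation="relu"),       # placeholder body, not CONAN
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=keras.optimizers.RMSprop(),
              loss=focal_loss(gamma=2.0, alpha=0.25))
# model.fit(x_train, y_train, batch_size=512, epochs=30)  # minibatch of 512, 30 epochs
```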