Condensed Memory Networks for Clinical Diagnostic Inferencing

Authors: Aaditya Prakash, Siyuan Zhao, Sadid Hasan, Vivek Datla, Kathy Lee, Ashequl Qadir, Joey Liu, Oladimeji Farri

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the MIMIC-III dataset show that the proposed model outperforms other variants of memory networks to predict the most probable diagnoses given a complex clinical scenario.
Researcher Affiliation | Collaboration | Aaditya Prakash, Brandeis University, MA (aprakash@brandeis.edu); Siyuan Zhao, Worcester Polytechnic Institute, MA (szhao@wpi.edu); Sadid A. Hasan, Vivek Datla, Kathy Lee, Ashequl Qadir, Joey Liu, Oladimeji Farri, Artificial Intelligence Laboratory, Philips Research North America, Cambridge, MA ({firstname.lastname,kathy.lee 1,dimeji.farri}@philips.com)
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any link or explicit statement about the availability of open-source code for the described methodology.
Open Datasets | Yes | We use the noteevents table from MIMIC-III: v1.3, which contains the unstructured free-text clinical notes for patients. MIMIC-III (Multiparameter Intelligent Monitoring in Intensive Care) (Johnson et al. 2016) is a large freely-available clinical database.
Dataset Splits | Yes | Models are trained on 80% of the data and validated on 10%. The remaining 10% is used as the test set, which is evaluated only once across all experiments with different models. (A minimal split sketch follows the table.)
Hardware Specification | No | The paper mentions 'Training time of our model for GPU implementation' but does not specify any particular GPU model or other hardware details (CPU, memory, etc.).
Software Dependencies | No | The paper mentions using 'Adam (Kingma and Ba 2014) stochastic gradient descent' for optimization, but it does not specify versions for programming languages, libraries, or other software components.
Experiment Setup | Yes | The learning rate is set to 0.001 and the batch size for each iteration to 100 for all models. For the final prediction layer, we use a fully connected layer on top of the output from equation 5 with a sigmoid activation function. The loss function is the sum of cross entropy from prediction labels and prediction memory slots using the addressing schema. Complexity of the model was penalized by adding L2 regularization to the cross entropy loss function. We use dropout (Srivastava et al. 2014) with probability 0.5 on the output-to-decision sigmoid layer and limit the norm of the gradients to be below 20. (A training-configuration sketch follows the table.)
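The Open Datasets and Dataset Splits rows report that the paper draws free-text notes from the MIMIC-III noteevents table and uses an 80/10/10 train/validation/test split. Below is a minimal Python sketch of loading the notes and producing such a split; the file name NOTEEVENTS.csv, the column names, and the fixed random seed are assumptions based on the standard MIMIC-III CSV export, not details stated in the paper.

```python
# Minimal sketch: load MIMIC-III clinical notes and make an 80/10/10 split.
# NOTEEVENTS.csv and the column names follow the MIMIC-III CSV export and
# are assumptions; the paper only names the noteevents table.
import numpy as np
import pandas as pd

notes = pd.read_csv("NOTEEVENTS.csv", usecols=["SUBJECT_ID", "HADM_ID", "TEXT"])

rng = np.random.default_rng(seed=0)              # fixed seed so the split is reproducible
idx = rng.permutation(len(notes))

n_train = int(0.8 * len(notes))
n_val = int(0.1 * len(notes))

train = notes.iloc[idx[:n_train]]
val = notes.iloc[idx[n_train:n_train + n_val]]
test = notes.iloc[idx[n_train + n_val:]]         # evaluated only once, per the paper
```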
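The Experiment Setup row lists the training hyperparameters the paper reports: Adam with learning rate 0.001, batches of 100 examples, a sigmoid output layer, a loss that sums cross entropy over prediction labels and over addressed memory slots, L2 regularization, dropout of 0.5 on the output-to-decision layer, and gradient norms limited to 20. The PyTorch sketch below shows one way those settings could be wired together; it is not the paper's implementation. The memory network itself is replaced by a placeholder prediction head, the hidden size, label count, memory-slot count, and L2 coefficient are assumptions, and the L2 penalty is expressed through Adam's weight_decay rather than an explicit term in the loss.

```python
# Sketch of the reported training configuration: Adam (lr 0.001), batches of
# 100, sigmoid output layer, summed cross-entropy losses over labels and
# memory slots, L2 regularization, dropout 0.5 on the output-to-decision
# layer, and gradient norms clipped to 20.
import torch
import torch.nn as nn

class PredictionHead(nn.Module):
    """Placeholder output layer; the condensed memory network is not reproduced here."""
    def __init__(self, hidden_dim, num_labels, num_memory_slots):
        super().__init__()
        self.dropout = nn.Dropout(p=0.5)           # dropout on the output-to-decision layer
        self.label_out = nn.Linear(hidden_dim, num_labels)
        self.slot_out = nn.Linear(hidden_dim, num_memory_slots)

    def forward(self, h):
        h = self.dropout(h)
        return torch.sigmoid(self.label_out(h)), self.slot_out(h)

# Sizes and the L2 coefficient below are illustrative assumptions.
model = PredictionHead(hidden_dim=128, num_labels=50, num_memory_slots=32)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-4)  # weight_decay stands in for L2
label_loss_fn = nn.BCELoss()            # cross entropy over the diagnosis labels
slot_loss_fn = nn.CrossEntropyLoss()    # cross entropy over the addressed memory slots

def train_step(hidden, labels, slot_targets):
    """One update on a batch (the paper uses batches of 100 examples)."""
    optimizer.zero_grad()
    label_probs, slot_logits = model(hidden)
    loss = label_loss_fn(label_probs, labels) + slot_loss_fn(slot_logits, slot_targets)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=20.0)  # keep the gradient norm below 20
    optimizer.step()
    return loss.item()
```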