Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Condensed Memory Networks for Clinical Diagnostic Inferencing

Authors: Aaditya Prakash, Siyuan Zhao, Sadid Hasan, Vivek Datla, Kathy Lee, Ashequl Qadir, Joey Liu, Oladimeji Farri

AAAI 2017 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the MIMIC-III dataset show that the proposed model outperforms other variants of memory networks in predicting the most probable diagnoses given a complex clinical scenario. |
| Researcher Affiliation | Collaboration | Aaditya Prakash (Brandeis University, MA); Siyuan Zhao (Worcester Polytechnic Institute, MA); Sadid A. Hasan, Vivek Datla, Kathy Lee, Ashequl Qadir, Joey Liu, Oladimeji Farri (Artificial Intelligence Laboratory, Philips Research North America, Cambridge, MA). |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any link or explicit statement about the availability of open-source code for the described methodology. |
| Open Datasets | Yes | The paper uses the noteevents table from MIMIC-III v1.3, which contains the unstructured free-text clinical notes for patients. MIMIC-III (Medical Information Mart for Intensive Care) (Johnson et al. 2016) is a large, freely available clinical database. |
| Dataset Splits | Yes | Models are trained on 80% of the data and validated on 10%; the remaining 10% is used as a test set, which is evaluated only once across all experiments with different models. |
| Hardware Specification | No | The paper mentions "training time of our model for GPU implementation" but does not specify any particular GPU model or other hardware details (CPU, memory, etc.). |
| Software Dependencies | No | The paper mentions using Adam (Kingma and Ba 2014) stochastic gradient descent for optimization, but it does not specify versions for programming languages, libraries, or other software components. |
| Experiment Setup | Yes | The learning rate is set to 0.001 and the batch size to 100 for all models. The final prediction layer is a fully connected layer with a sigmoid activation on top of the output from equation 5. The loss function is the sum of the cross entropy from prediction labels and from prediction memory slots using the addressing schema; model complexity is penalized by adding L2 regularization to the cross-entropy loss. Dropout (Srivastava et al. 2014) with probability 0.5 is applied on the output-to-decision sigmoid layer, and gradient norms are limited to below 20. |
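The reported splits and hyperparameters can be collected into a minimal sketch. This is not the authors' code (none is released, per the table above): the configuration dict, the shuffling seed, and the helper names `split_indices` and `clip_by_global_norm` are illustrative assumptions; only the numeric values (0.001, 100, 0.5, 20, 80/10/10) come from the paper.

```python
import numpy as np

# Hyperparameters reported in the paper's experiment setup.
CONFIG = {
    "learning_rate": 0.001,   # Adam step size
    "batch_size": 100,
    "dropout_prob": 0.5,      # dropout on the output-to-decision sigmoid layer
    "l2_lambda": None,        # L2 coefficient is not reported in the paper
    "max_grad_norm": 20.0,    # gradient norms are limited to below 20
}

def split_indices(n, seed=0):
    """80/10/10 train/validation/test split, as described in the paper.
    The shuffling seed is an assumption; the paper does not specify one."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def clip_by_global_norm(grads, max_norm=CONFIG["max_grad_norm"]):
    """Rescale a list of gradient arrays so their global L2 norm
    does not exceed max_norm (one common reading of 'limit the norm
    of the gradients to be below 20')."""
    total = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total > max_norm:
        scale = max_norm / total
        grads = [g * scale for g in grads]
    return grads
```

The test set here is held out and, per the paper, evaluated only once across all model variants.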