Learning to Diagnose with LSTM Recurrent Neural Networks
Authors: Zachary Lipton, David Kale, Charles Elkan, Randall Wetzel
ICLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present the first study to empirically evaluate the ability of LSTMs to recognize patterns in multivariate time series of clinical measurements. Specifically, we consider multilabel classification of diagnoses, training a model to classify 128 diagnoses given 13 frequently but irregularly sampled clinical measurements. First, we establish the effectiveness of a simple LSTM network for modeling clinical data. Then we demonstrate a straightforward and effective training strategy in which we replicate targets at each sequence step. Trained only on raw time series, our models outperform several strong baselines, including a multilayer perceptron trained on hand-engineered features. (A minimal PyTorch sketch of this model and target-replication strategy follows the table.) |
| Researcher Affiliation | Academia | Zachary C. Lipton, Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA, zlipton@cs.ucsd.edu; David C. Kale, Department of Computer Science, University of Southern California, Los Angeles, CA 90089, dkale@usc.edu; Charles Elkan, Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA, elkan@cs.ucsd.edu; Randall Wetzel, Laura P. and Leland K. Whittier Virtual PICU, Children's Hospital Los Angeles, Los Angeles, CA 90027, rwetzel@chla.usc.edu |
| Pseudocode | No | The paper includes equations for LSTM updates but does not provide any pseudocode or algorithm blocks that are explicitly labeled as such or formatted as structured algorithms. (For reference, the standard LSTM update equations are reproduced after the table.) |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is open-source or publicly available. |
| Open Datasets | No | Our experiments use a collection of anonymized clinical time series extracted from the EHR system at Children's Hospital LA (Marlin et al., 2012; Che et al., 2015) as part of an IRB-approved study. The paper describes an internal dataset and does not provide any access information (like a URL, DOI, or repository) for public use. |
| Dataset Splits | Yes | All models are trained on 80% of the data and tested on 10%. The remaining 10% is used as a validation set. (A sketch of such a split appears after the table.) |
| Hardware Specification | Yes | We acknowledge NVIDIA Corporation for Tesla K40 GPU hardware donation |
| Software Dependencies | No | The paper mentions software components and techniques (e.g., LSTMs, SGD, dropout) but does not specify exact version numbers for any libraries, frameworks, or programming languages used in the implementation. |
| Experiment Setup | Yes | We train each LSTM for 100 epochs using stochastic gradient descent (SGD) with momentum. To combat exploding gradients, we scale the norm of the gradient and use ℓ2² weight decay of 10⁻⁶, both hyperparameters chosen using validation data. Our final networks use 2 hidden layers and either 64 memory cells per layer with no dropout or 128 cells per layer with dropout of 0.5. These architectures are also chosen based on validation performance. (A hedged training sketch based on this description appears after the table.) |
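The "Research Type" row above quotes the paper's two key ingredients: a stacked LSTM over 13 irregularly sampled clinical measurements and a target-replication strategy for 128-way multilabel diagnosis. The sketch below is a minimal PyTorch rendering of that idea, not the authors' implementation (no code was released); the class and function names and the blend weight `alpha` are illustrative.

```python
import torch
import torch.nn as nn

class ReplicatedTargetLSTM(nn.Module):
    """Minimal sketch: a 2-layer LSTM emitting a 128-way multilabel
    prediction at every timestep (names and defaults are illustrative)."""
    def __init__(self, n_inputs=13, n_hidden=128, n_labels=128, dropout=0.5):
        super().__init__()
        # nn.LSTM applies dropout between stacked layers when num_layers > 1
        self.lstm = nn.LSTM(n_inputs, n_hidden, num_layers=2,
                            batch_first=True, dropout=dropout)
        self.out = nn.Linear(n_hidden, n_labels)

    def forward(self, x):        # x: (batch, time, 13 clinical measurements)
        h, _ = self.lstm(x)      # h: (batch, time, n_hidden)
        return self.out(h)       # one logit vector per timestep

def replicated_target_loss(logits, y, alpha=0.5):
    """Target replication: the sequence-level label vector y
    (batch, n_labels, floats in {0, 1}) is used as the target at every
    timestep, blended with the loss at the final step. The weight alpha
    is an assumed hyperparameter."""
    bce = nn.BCEWithLogitsLoss()
    y_rep = y.unsqueeze(1).expand_as(logits)   # replicate y across time
    return alpha * bce(logits, y_rep) + (1 - alpha) * bce(logits[:, -1], y)
```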
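For the "Pseudocode" row: the paper presents LSTM update equations rather than pseudocode. The standard forget-gate formulation is reproduced below for reference; the paper's exact variant may differ in minor details.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
g_t &= \tanh(W_g x_t + U_g h_{t-1} + b_g) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```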
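The 80/10/10 partition in the "Dataset Splits" row could be realized as below; the paper does not publish its splitting code, so the seed and the index-based scheme are assumptions.

```python
import numpy as np

def split_indices(n_episodes, seed=0):
    # Shuffle episode indices, then carve off 80% train / 10% test;
    # the remainder (~10%) serves as the validation set.
    idx = np.random.default_rng(seed).permutation(n_episodes)
    n_train = int(0.8 * n_episodes)
    n_test = int(0.1 * n_episodes)
    return (idx[:n_train],                      # 80% training
            idx[n_train:n_train + n_test],      # 10% test
            idx[n_train + n_test:])             # ~10% validation
```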
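Finally, a hedged sketch of the regimen in the "Experiment Setup" row: 100 epochs of SGD with momentum, gradient-norm rescaling, and ℓ2 weight decay of 10⁻⁶, reusing the model and loss sketched above. The learning rate, momentum value, clipping threshold, and `train_loader` are assumptions, not values reported in the paper.

```python
import torch

model = ReplicatedTargetLSTM()                      # sketch from above
opt = torch.optim.SGD(model.parameters(), lr=0.01,  # lr/momentum assumed
                      momentum=0.9, weight_decay=1e-6)

for epoch in range(100):                        # 100 epochs, per the paper
    for x, y in train_loader:                   # train_loader is assumed
        opt.zero_grad()
        loss = replicated_target_loss(model(x), y)
        loss.backward()
        # "scale the norm of the gradient" to combat exploding gradients;
        # the threshold of 1.0 is an assumption
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        opt.step()
```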