DeepHeart: Semi-Supervised Sequence Learning for Cardiovascular Risk Prediction

Authors: Brandon Ballinger, Johnson Hsieh, Avesh Singh, Nimit Sohoni, Jack Wang, Geoffrey Tison, Gregory Marcus, Jose Sanchez, Carol Maguire, Jeffrey Olgin, Mark Pletcher

AAAI 2018

Reproducibility assessment: each variable is listed below with its result and the LLM's response.
Research Type: Experimental. 'We train and validate a semi-supervised, multi-task LSTM on 57,675 person-weeks of data from off-the-shelf wearable heart rate sensors, showing high accuracy at detecting multiple medical conditions, including diabetes (0.8451), high cholesterol (0.7441), high blood pressure (0.8086), and sleep apnea (0.8298). We compare two semi-supervised training methods, semi-supervised sequence learning and heuristic pretraining, and show they outperform hand-engineered biomarkers from the medical literature. We believe our work suggests a new approach to patient risk stratification based on cardiovascular risk scores derived from popular wearables such as Fitbit, Apple Watch, or Android Wear.'
Researcher Affiliation: Collaboration. Brandon Ballinger, Johnson Hsieh, Avesh Singh, Nimit Sohoni, Jack Wang (Cardiogram, San Francisco, CA); Geoffrey H. Tison, Gregory M. Marcus, Jose M. Sanchez, Carol Maguire, Jeffrey E. Olgin, Mark J. Pletcher (Department of Medicine, University of California, San Francisco, CA).
Pseudocode: No. The paper describes the model architecture and training process in text and diagrams, but it does not include any pseudocode or algorithm blocks.
Open Source Code: No. The paper does not provide any links to source code repositories or explicitly state that its code is being released or made publicly available.
Open Datasets: No. The paper describes recruiting participants for a study ('We recruited 14,011 users of a popular Apple Watch app into an online, IRB-approved study run in partnership with the cardiology department of the University of California, San Francisco (Tison et al. 2017a)') and creating a dataset from their wearable data. However, it does not provide a public link, DOI, or specific citation for accessing this created dataset.
Dataset Splits: Yes. 'Each participant was randomly assigned to either the training, tuning, or testing set... After filtering, there were 57,675 person-weeks of data in total, divided into 33,628 for training, 18,555 for tuning, and 12,790 for validation.'
Hardware Specification: No. The paper discusses data collection from 'off-the-shelf wearable heart rate sensors' like Fitbit and Apple Watch, but it does not specify any hardware details (e.g., GPU models, CPU types, or cloud instances) used for training or evaluating the deep learning model.
Software Dependencies: No. The paper mentions using 'the Adam optimizer' and 'scikit-learn's implementation of logistic regression, support vector machines, decision trees, random forests, and multi-layer perceptrons', but it does not provide specific version numbers for these software components or libraries.
Experiment Setup: Yes. 'The first layer has a wide filter length of 12... and the next two layers use the residual units... with a filter length of 5. Each convolutional layer contains 128 convolutional channels. After each convolutional layer, we apply dropout with probability 0.2... and apply max pooling with pool length 2... Each bidirectional LSTM layer contains 128 units (64 in each direction). A dropout of 0.2 is applied to this final LSTM layer... Each experiment used the Adam optimizer (Kingma and Ba 2014) and a squared error loss.' Table 3 summarizes hyperparameter tuning experiments, listing Width, Conv Depth, LSTM Depth, and Initial Filter.
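The per-condition scores quoted in the abstract row above (e.g., diabetes 0.8451) are threshold-free ranking metrics; the paper reports them as c-statistics, which for a binary outcome equal the ROC AUC. As a minimal sketch, here is a pure-Python AUC via the Mann-Whitney rank identity; the function name and inputs are illustrative, not from the paper:

```python
def auc(labels, scores):
    """ROC AUC via the Mann-Whitney identity: the probability that a
    randomly chosen positive outranks a randomly chosen negative.
    `labels` are 0/1; tied scores receive half credit."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why scores around 0.80–0.85 indicate substantial discriminative signal.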
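The Dataset Splits row describes assigning each participant, rather than each person-week, to a single split, which keeps all weeks from one user inside one set and prevents leakage across train/tune/test. A minimal sketch of participant-level assignment; the split fractions and seed are illustrative assumptions (the paper reports resulting person-week counts, not the ratios used):

```python
import random

def assign_participants(participant_ids, fractions=(0.6, 0.2, 0.2), seed=0):
    """Randomly assign each participant to train/tune/test.

    Splitting by participant (not by person-week) keeps every week from
    a given user in exactly one set. Fractions and seed are illustrative;
    the paper does not state the exact ratios."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    n = len(ids)
    n_train = round(fractions[0] * n)
    n_tune = round(fractions[1] * n)
    return {
        "train": set(ids[:n_train]),
        "tune": set(ids[n_train:n_train + n_tune]),
        "test": set(ids[n_train + n_tune:]),
    }
```

Person-weeks would then be routed to whichever set their participant belongs to, so the per-set week counts depend on how much data each user contributed.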
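The Experiment Setup row gives enough hyperparameters to trace tensor shapes through the convolutional front end: three convolutional layers (filter lengths 12, 5, 5; 128 channels each), each followed by dropout and max pooling with pool length 2. A sketch of that shape bookkeeping, assuming 'same' padding (the paper does not state the padding scheme) and an illustrative input length:

```python
def trace_conv_shapes(seq_len, channels_in=1):
    """Trace (length, channels) through the three-layer convolutional
    front end described in the paper. With 'same' padding (an assumption),
    each convolution preserves sequence length; each max-pooling step
    with pool length 2 halves it; every layer outputs 128 channels."""
    shapes = [(seq_len, channels_in)]
    length = seq_len
    for _filter_len in (12, 5, 5):  # per-layer filter lengths, per the paper
        length //= 2                # max pooling with pool length 2
        shapes.append((length, 128))
    return shapes
```

For example, a week of heart-rate data discretized into 4,096 steps (an illustrative length) would reach the bidirectional LSTM stack as a 512-step sequence of 128-channel features.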