Doctor2Vec: Dynamic Doctor Representation Learning for Clinical Trial Recruitment

Authors: Siddharth Biswal, Cao Xiao, Lucas M. Glass, Elizabeth Milkovits, Jimeng Sun (pp. 557-564)

AAAI 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Validated on large real-world trial and EHR data comprising 2,609 trials, 25K doctors, and 430K patients; Doctor2Vec demonstrated improved performance over the best baseline by up to 8.7% in PR-AUC. |
| Researcher Affiliation | Collaboration | 1. Analytic Center of Excellence, IQVIA, Cambridge, USA; 2. Computational Science and Engineering, Georgia Institute of Technology, Atlanta, USA |
| Pseudocode | Yes | Algorithm 1: Model Training for Doctor2Vec |
| Open Source Code | Yes | Code: https://github.com/sidsearch/Doctor2vec |
| Open Datasets | No | We extracted trial data from IQVIA's real-world patient and clinical trial database, which can be accessed by request. This dataset contains a longitudinal treatment history from 430,239 patients over 7 years. In addition to medical codes for diagnoses, procedures, and medications, it also includes information about doctors such as specialty, education, hospital affiliation, and geographical location. |
| Dataset Splits | Yes | We split our data into train, test, and validation sets with a 70:20:10 ratio. |
| Hardware Specification | Yes | Training was performed on a machine running Ubuntu 16.04 with 128 GB memory and an Nvidia Tesla P100 GPU. |
| Software Dependencies | Yes | We implemented Doctor2Vec with PyTorch 1.0 (Paszke et al. 2017). |
| Experiment Setup | Yes | For training the model, we used the Adam optimizer (Kingma and Ba 2014) with mini-batches of 128 samples, a learning rate of 0.001, and learning rate decay. |
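The reported setup (70:20:10 split, Adam at learning rate 0.001, mini-batches of 128, learning-rate decay) can be sketched in PyTorch as below. This is a minimal illustration, not the authors' code: the model is a stand-in linear layer (the real Doctor2Vec embeds doctors from trial and EHR data), the data are random, and the decay schedule (`StepLR` with `step_size=5`, `gamma=0.9`) is an assumption, since the paper's summary does not specify one.

```python
import torch

torch.manual_seed(0)

# 70:20:10 train/test/validation split, as reported.
n = 1000
n_train, n_test = int(0.7 * n), int(0.2 * n)
perm = torch.randperm(n)
train_idx = perm[:n_train]
test_idx = perm[n_train:n_train + n_test]
val_idx = perm[n_train + n_test:]

# Stand-in data and model; the actual architecture is described in the paper.
X, y = torch.randn(n, 16), torch.randn(n, 1)
model = torch.nn.Linear(16, 1)

# Adam at lr 0.001 with mini-batches of 128 (as reported); the exact
# decay schedule below is a guess for illustration.
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=5, gamma=0.9)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X[train_idx], y[train_idx]),
    batch_size=128, shuffle=True)

for epoch in range(10):
    for xb, yb in loader:
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(xb), yb)
        loss.backward()
        opt.step()
    sched.step()  # apply learning-rate decay once per epoch
```

Swapping in the paper's actual doctor-representation model and dataset would leave the optimizer and split logic unchanged.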