Doctor2Vec: Dynamic Doctor Representation Learning for Clinical Trial Recruitment

Authors: Siddharth Biswal, Cao Xiao, Lucas M. Glass, Elizabeth Milkovits, Jimeng Sun (pp. 557-564)

AAAI 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Validated on large real-world trial and EHR data comprising 2,609 trials, 25K doctors, and 430K patients; Doctor2Vec demonstrated improved performance over the best baseline by up to 8.7% in PR-AUC. |
| Researcher Affiliation | Collaboration | 1. Analytic Center of Excellence, IQVIA, Cambridge, USA; 2. Computational Science and Engineering, Georgia Institute of Technology, Atlanta, USA |
| Pseudocode | Yes | Algorithm 1: Model Training for Doctor2Vec |
| Open Source Code | Yes | Code: https://github.com/sidsearch/Doctor2vec |
| Open Datasets | No | We extracted trial data from IQVIA's real-world patient and clinical trial database, which can be accessed by request. This dataset contains a longitudinal treatment history from 430,239 patients over 7 years. In addition to medical codes for diagnoses, procedures, and medications, it also includes information about doctors such as specialty, education, hospital affiliation, and geographical location. |
| Dataset Splits | Yes | We split our data into train, test, and validation sets with a 70:20:10 ratio. |
| Hardware Specification | Yes | Training was performed on a machine running Ubuntu 16.04 with 128 GB memory and an Nvidia Tesla P100 GPU. |
| Software Dependencies | Yes | We implemented Doctor2Vec with PyTorch 1.0 (Paszke et al. 2017). |
| Experiment Setup | Yes | For training the model, we used the Adam optimizer (Kingma and Ba 2014) with mini-batches of 128 samples, a learning rate of 0.001, and learning rate decay. |
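The reported setup (70:20:10 split, Adam at learning rate 0.001, mini-batches of 128, learning-rate decay) can be sketched in PyTorch as below. This is a minimal illustration, not the authors' code: the model is a stand-in linear layer (the real Doctor2Vec embeds doctors from trial and EHR data), the data are random, and the decay schedule (`StepLR` with `step_size=5`, `gamma=0.9`) is an assumption, since the paper's summary does not specify one.

```python
import torch

torch.manual_seed(0)

# 70:20:10 train/test/validation split, as reported.
n = 1000
n_train, n_test = int(0.7 * n), int(0.2 * n)
perm = torch.randperm(n)
train_idx = perm[:n_train]
test_idx = perm[n_train:n_train + n_test]
val_idx = perm[n_train + n_test:]

# Stand-in data and model; the actual architecture is described in the paper.
X, y = torch.randn(n, 16), torch.randn(n, 1)
model = torch.nn.Linear(16, 1)

# Adam at lr 0.001 with mini-batches of 128 (as reported); the exact
# decay schedule below is a guess for illustration.
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=5, gamma=0.9)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X[train_idx], y[train_idx]),
    batch_size=128, shuffle=True)

for epoch in range(10):
    for xb, yb in loader:
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(xb), yb)
        loss.backward()
        opt.step()
    sched.step()  # apply learning-rate decay once per epoch
```

Swapping in the paper's actual doctor-representation model and dataset would leave the optimizer and split logic unchanged.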