Doctor2Vec: Dynamic Doctor Representation Learning for Clinical Trial Recruitment
Authors: Siddharth Biswal, Cao Xiao, Lucas M. Glass, Elizabeth Milkovits, Jimeng Sun
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Validated on large real-world trials and EHR data including 2,609 trials, 25K doctors and 430K patients, Doctor2Vec demonstrated improved performance over the best baseline by up to 8.7% in PR-AUC. |
| Researcher Affiliation | Collaboration | Analytic Center of Excellence, IQVIA, Cambridge, USA; Computational Science and Engineering, Georgia Institute of Technology, Atlanta, USA |
| Pseudocode | Yes | Algorithm 1: Model Training for Doctor2Vec |
| Open Source Code | Yes | Code: https://github.com/sidsearch/Doctor2vec |
| Open Datasets | No | We extracted trial data from IQVIA's real-world patient and clinical trial database, which can be accessed by request. This dataset contains a longitudinal treatment history from 430,239 patients over 7 years. In addition to medical codes for diagnoses, procedures, and medications, it also includes information about doctors such as specialty, education, hospital affiliation, and geographical location. |
| Dataset Splits | Yes | We split our data into train, test, and validation sets with a 70:20:10 ratio. |
| Hardware Specification | Yes | The training was performed on a machine running Ubuntu 16.04 with 128GB memory and an Nvidia Tesla P100 GPU. |
| Software Dependencies | Yes | We implemented Doctor2Vec with PyTorch 1.0 (Paszke et al. 2017). |
| Experiment Setup | Yes | For training the model, we used the Adam optimizer (Kingma and Ba 2014a) with mini-batches of 128 samples, a learning rate of 0.001, and learning rate decay. |
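The reported setup (Adam, learning rate 0.001 with decay, batch size 128) can be illustrated with a minimal pure-Python sketch. The paper does not specify the decay schedule, so an exponential schedule with a hypothetical factor of 0.999 is assumed here, and the toy scalar objective is purely for illustration; in the paper's actual code this would correspond to PyTorch's `torch.optim.Adam` plus a learning-rate scheduler.

```python
import math

def adam_step(param, grad, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (Kingma and Ba 2014) for a single scalar parameter."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

def train(steps=2000, base_lr=1e-3, decay=0.999):
    # Minimize the toy objective f(w) = (w - 3)^2 with Adam.
    # base_lr matches the paper's 0.001; the decay schedule is an assumption.
    w, m, v = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2 * (w - 3)            # df/dw
        lr = base_lr * decay ** t     # assumed exponential learning-rate decay
        w, m, v = adam_step(w, grad, m, v, t, lr)
    return w
```

Calling `train()` moves `w` steadily from 0 toward the minimum at 3; with the decaying step size it approaches but does not fully reach it in 2000 steps, mirroring how decay trades convergence speed for stability.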