CoDrug: Conformal Drug Property Prediction with Density Estimation under Covariate Shift
Authors: Siddhartha Laghuvarapu, Zhen Lin, Jimeng Sun
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In extensive experiments involving realistic distribution drifts in various small-molecule drug discovery tasks, we demonstrate the ability of CoDrug to provide valid prediction sets and its utility in addressing the distribution shift arising from de novo drug design models. On average, using CoDrug can reduce the coverage gap by over 35% when compared to conformal prediction sets not adjusted for covariate shift. |
| Researcher Affiliation | Academia | Siddhartha Laghuvarapu Department of Computer Science University of Illinois Urbana-Champaign Urbana, IL 61801 sl160@illinois.edu Zhen Lin Department of Computer Science University of Illinois Urbana-Champaign Urbana, IL 61801 zhenlin4@illinois.edu Jimeng Sun Department of Computer Science Carle Illinois College of Medicine University of Illinois Urbana-Champaign Urbana, IL 61801 jimeng@illinois.edu |
| Pseudocode | Yes | Algorithm 1 Procedure for Property Prediction Training: |
| Open Source Code | Yes | The code associated with the paper is available at https://github.com/siddharthal/CoDrug/ |
| Open Datasets | Yes | Datasets: We use four binary classification datasets for toxicity prediction (AMES, Tox21, ClinTox) and activity prediction (HIV activity), obtained from TDC [28]. |
| Dataset Splits | Yes | Splitting Ratio: The datasets are split in the ratio of 70:15:15, for training, calibration and testing the CP model. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or specific cloud computing instances used for running the experiments. |
| Software Dependencies | No | The paper mentions software like PyTorch, PyTorch Lightning, ADAM, DGL-LifeSci, RDKit, and DeepChem, but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | Training hyperparameters: We train the model using the PyTorch Lightning framework. We use the ADAM optimizer [35]. The batch size is set to 64, and the learning rate is set to 0.001. Architecture details: The model architecture consists of a GNN layer (AttentiveFP [1]), a readout layer, 2 hidden FCNN layers, and an output layer. The hidden state size in the GNN is set to 512 dimensions. The hidden FCNN layers have 256 and 8 dimensions, respectively. Energy regularization hyperparameters: The parameters m_in and m_out in Eq. (15) are set to -5 and -35 respectively, and the parameter λ in Eq. (16) is set to 0.01. |
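The 70:15:15 split and the reported hyperparameters can be sketched as follows. This is an illustrative reconstruction, not the paper's actual code: `split_dataset` and the `HPARAMS` names are assumptions made here for clarity, while the numeric values are the ones quoted in the table above.

```python
import random

# Hyperparameters reported in the paper's experiment setup
# (the dict itself and its key names are illustrative, not from the paper's code).
HPARAMS = {
    "batch_size": 64,
    "learning_rate": 1e-3,
    "gnn_hidden_dim": 512,       # AttentiveFP hidden state size
    "fc_hidden_dims": (256, 8),  # two hidden FCNN layers
    "m_in": -5,                  # energy regularization, Eq. (15)
    "m_out": -35,                # energy regularization, Eq. (15)
    "lambda": 0.01,              # regularization weight, Eq. (16)
}

def split_dataset(n, seed=0):
    """Shuffle n example indices and split them 70:15:15 into
    train / calibration / test, as described in the paper."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(0.70 * n)
    n_cal = int(0.15 * n)
    return idx[:n_train], idx[n_train:n_train + n_cal], idx[n_train + n_cal:]

train_idx, cal_idx, test_idx = split_dataset(1000)
print(len(train_idx), len(cal_idx), len(test_idx))  # → 700 150 150
```

The separate calibration split is what the conformal prediction step uses to compute nonconformity-score quantiles, so it must be disjoint from both the training and test sets.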