Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery

Authors: Yulun Wu, Nicholas Choma, Andrew Deru Chen, Mikaela Cashman, Erica Teixeira Prates, Veronica G Melesse Vergara, Manesh B Shah, Austin Clyde, Thomas Brettin, Wibe Albert de Jong, Neeraj Kumar, Martha S Head, Rick L. Stevens, Peter Nugent, Daniel A Jacobson, James B Brown

ICLR 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In experiments, our framework achieved outstanding results compared to state-of-the-art algorithms, while reducing the complexity of paths to chemical synthesis. |
| Researcher Affiliation | Collaboration | University of California, Berkeley; National Virtual Biotechnology Laboratory, US Department of Energy; Lawrence Berkeley National Laboratory; Oak Ridge National Laboratory; University of Tennessee, Knoxville; University of Chicago; Argonne National Laboratory; Pacific Northwest National Laboratory |
| Pseudocode | No | The paper contains mathematical equations and descriptions but does not include structured pseudocode or algorithm blocks with labels like 'Algorithm X' or 'Pseudocode'. |
| Open Source Code | Yes | To see samples of molecules generated by DGAPN in evaluation, visit our repository https://github.com/yulun-rayn/DGAPN. |
| Open Datasets | Yes | Dataset: For the models/settings that do require a dataset, we used a set of SMILES IDs taken from more than six million compounds from the MCULE molecular library, a publicly available dataset of purchasable molecules (Kiss et al., 2012), and their docking scores for the NSP15 target. |
| Dataset Splits | No | The paper mentions 'validation loss' in Figure 3 but does not provide specific percentages, sample counts, or explicit methodology for training/validation/test dataset splits. |
| Hardware Specification | Yes | Structural information about the putative protein-ligand complexes was integrated into this framework with AutoDock-GPU (Santos-Martins et al., 2021), which leverages the GPU resources from leadership-class computing facilities, including the Summit supercomputer, for high-throughput molecular docking (LeGrand et al., 2020). |
| Software Dependencies | No | The paper mentions software such as PyTorch Geometric, AutoDock-GPU, ADADELTA, and RDKit, but does not specify version numbers for them. |
| Experiment Setup | Yes | Based on a parameter sweep, we set number of GNN layers to be 3, MLP layers to be 3, with 3 of the GNN layers and 0 of the MLP layers shared between query and key. Number of layers in RND is set to 1; all numbers of hidden neurons 256; learning rate for actor 2e-3, for critic 1e-4, for RND 2e-3; update time steps (i.e., batch size) 300. Number of epochs per iteration and clipping parameter ϵ for PPO are 30 and 0.1. Output dimensions and clipping parameter η for RND are 8 and 5. In evaluation mode, we use argmax policy instead of sampling policy, expand the number of candidates per step from 15-20 to 128 and expand the maximum time steps per episode from 12 to 20 compared to training. |
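
The Experiment Setup values above can be collected into one configuration object, which makes the difference between training and evaluation explicit. The following is a minimal sketch and not the authors' code: the class name `DGAPNConfig`, every field name, and the helper functions are hypothetical. The learning rates are read here as 2e-3 (actor), 1e-4 (critic), and 2e-3 (RND), which is an assumption about the intended values. The upper end of the 15-20 candidate range is used as the training default. The RND clipping parameter η is only stored, because the source does not say how it is applied. `ppo_clipped_term` is the standard PPO clipped surrogate for a single sample, shown with ϵ = 0.1.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DGAPNConfig:
    # Values quoted from the Experiment Setup row; all field names are hypothetical.
    gnn_layers: int = 3
    mlp_layers: int = 3
    shared_gnn_layers: int = 3    # GNN layers shared between query and key
    shared_mlp_layers: int = 0    # MLP layers shared between query and key
    rnd_layers: int = 1
    hidden_size: int = 256
    actor_lr: float = 2e-3        # assumed reading of the reported value
    critic_lr: float = 1e-4       # assumed reading of the reported value
    rnd_lr: float = 2e-3          # assumed reading of the reported value
    update_timesteps: int = 300   # i.e., batch size
    ppo_epochs: int = 30
    ppo_clip_eps: float = 0.1
    rnd_output_dim: int = 8
    rnd_clip_eta: float = 5.0     # stored only; its exact use is not given in the source
    max_candidates: int = 20      # training uses 15-20; upper end assumed here
    max_timesteps: int = 12
    greedy: bool = False          # False: sample from the policy; True: argmax


def eval_config(train: DGAPNConfig) -> DGAPNConfig:
    """Apply the evaluation-mode overrides described in the setup row."""
    return replace(train, max_candidates=128, max_timesteps=20, greedy=True)


def ppo_clipped_term(ratio: float, advantage: float, eps: float) -> float:
    """Standard PPO clipped surrogate for one sample (to be maximized)."""
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


def select_action(probs: list, greedy: bool, rng: random.Random) -> int:
    """Pick a candidate index: argmax in evaluation, sampling in training."""
    indices = range(len(probs))
    if greedy:
        return max(indices, key=probs.__getitem__)
    return rng.choices(indices, weights=probs, k=1)[0]


if __name__ == "__main__":
    train_cfg = DGAPNConfig()
    test_cfg = eval_config(train_cfg)
    print(test_cfg.max_candidates, test_cfg.max_timesteps, test_cfg.greedy)
    print(select_action([0.1, 0.7, 0.2], test_cfg.greedy, random.Random(0)))
```

Keeping the evaluation overrides in a single function means the training defaults are never mutated, so both modes can be compared side by side in one run.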