Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery
Authors: Yulun Wu, Nicholas Choma, Andrew Deru Chen, Mikaela Cashman, Erica Teixeira Prates, Veronica G Melesse Vergara, Manesh B Shah, Austin Clyde, Thomas Brettin, Wibe Albert de Jong, Neeraj Kumar, Martha S Head, Rick L. Stevens, Peter Nugent, Daniel A Jacobson, James B Brown
ICLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, our framework achieved outstanding results compared to state-of-the-art algorithms, while reducing the complexity of paths to chemical synthesis. |
| Researcher Affiliation | Collaboration | University of California, Berkeley, National Virtual Biotechnology Laboratory, US Department of Energy, Lawrence Berkeley National Laboratory, Oak Ridge National Laboratory, University of Tennessee, Knoxville, University of Chicago, Argonne National Laboratory, Pacific Northwest National Laboratory |
| Pseudocode | No | The paper contains mathematical equations and descriptions but does not include structured pseudocode or algorithm blocks with labels like 'Algorithm X' or 'Pseudocode'. |
| Open Source Code | Yes | To see samples of molecules generated by DGAPN in evaluation, visit our repository https://github.com/yulun-rayn/DGAPN. |
| Open Datasets | Yes | Dataset: For the models/settings that do require a dataset, we used a set of SMILES IDs taken from more than six million compounds from the MCULE molecular library, a publicly available dataset of purchasable molecules (Kiss et al., 2012), and their docking scores for the NSP15 target. |
| Dataset Splits | No | The paper mentions 'validation loss' in Figure 3 but does not provide specific percentages, sample counts, or explicit methodology for training/validation/test dataset splits. |
| Hardware Specification | Yes | Structural information about the putative protein-ligand complexes was integrated into this framework with AutoDock-GPU (Santos-Martins et al., 2021), which leverages the GPU resources from leadership-class computing facilities, including the Summit supercomputer, for high-throughput molecular docking (Le Grand et al., 2020). |
| Software Dependencies | No | The paper mentions software like PyTorch Geometric, AutoDock-GPU, ADADELTA, and RDKit, but does not specify version numbers for these software dependencies. |
| Experiment Setup | Yes | Based on a parameter sweep, we set number of GNN layers to be 3, MLP layers to be 3, with 3 of the GNN layers and 0 of the MLP layers shared between query and key. Number of layers in RND is set to 1; all numbers of hidden neurons 256; learning rate for actor 2e-3, for critic 1e-4, for RND 2e-3; update time steps (i.e. batch size) 300. Number of epochs per iteration and clipping parameter ϵ for PPO are 30 and 0.1. Output dimensions and clipping parameter η for RND are 8 and 5. In evaluation mode, we use argmax policy instead of sampling policy, expand the number of candidates per step from 15-20 to 128 and expand the maximum time steps per episode from 12 to 20 compared to training. |
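
The Experiment Setup row packs many hyperparameters into one sentence, so a minimal sketch of the same settings as a configuration object may be easier to scan. The class name `DGAPNSetup`, its field names, and the helper `evaluation_setup` are hypothetical and are not taken from the DGAPN repository. The learning rates are read here as 2e-3, 1e-4, and 2e-3, which is an assumption about the exponents in the quoted text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DGAPNSetup:
    """Hypothetical container for the reported settings (names are illustrative)."""
    gnn_layers: int = 3
    mlp_layers: int = 3
    shared_gnn_layers: int = 3      # GNN layers shared between query and key
    shared_mlp_layers: int = 0      # MLP layers shared between query and key
    rnd_layers: int = 1
    hidden_neurons: int = 256
    actor_lr: float = 2e-3          # assumed reading of the reported value
    critic_lr: float = 1e-4         # assumed reading of the reported value
    rnd_lr: float = 2e-3            # assumed reading of the reported value
    update_timesteps: int = 300     # i.e. batch size
    ppo_epochs: int = 30
    ppo_clip_eps: float = 0.1
    rnd_output_dim: int = 8
    rnd_clip_eta: float = 5.0
    max_candidates: int = 20        # training uses 15-20 candidates per step
    max_timesteps: int = 12         # maximum time steps per episode in training
    greedy: bool = False            # False: sampling policy, True: argmax policy


def evaluation_setup(train: DGAPNSetup) -> DGAPNSetup:
    """Apply the three evaluation-mode changes described in the table row."""
    return replace(train, max_candidates=128, max_timesteps=20, greedy=True)


train_cfg = DGAPNSetup()
eval_cfg = evaluation_setup(train_cfg)
print(eval_cfg)
```

Keeping the evaluation settings as a derived copy of the training settings makes the three reported differences (policy, candidates per step, episode length) explicit while everything else stays shared.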