reproducibilityindex.ai

Reinforced Genetic Algorithm for Structure-based Drug Design

Authors: Tianfan Fu, Wenhao Gao, Connor Coley, Jimeng Sun

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct thorough empirical studies on optimizing binding affinity to various disease targets and show that RGA outperforms the baselines in terms of docking scores and is more robust to random initializations.
Researcher Affiliation	Academia	Tianfan Fu (1), Wenhao Gao (2), Connor W. Coley (2, 3), Jimeng Sun (4, 5). (1) Department of Computational Science and Engineering, Georgia Institute of Technology; (2) Department of Chemical Engineering, Massachusetts Institute of Technology; (3) Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology; (4) Department of Computer Science, University of Illinois at Urbana-Champaign; (5) Carle Illinois College of Medicine, University of Illinois at Urbana-Champaign
Pseudocode	No	The paper describes the RGA process and provides a pipeline illustration in Figure 1, but it does not include pseudocode or a clearly labeled algorithm block.
Open Source Code	Yes	The code is available at https://github.com/futianfan/reinforced-genetic-algorithm.
Open Datasets	Yes	Dataset: we randomly select molecules from the ZINC [56] database (around 250 thousand drug-like molecules) as the 0-th generation of the genetic algorithms (RGA, AutoGrow 4.0, GA+D). ZINC also serves as the training data for pretraining the models in JTVAE, REINVENT, RationaleRL, etc. We adopt the CrossDocked2020 [57] dataset, which contains around 22 million ligand-protein complexes, as the training data for pretraining the policy neural networks, as mentioned in Section 3.3. More descriptions are available in the Appendix.
Dataset Splits	No	The paper mentions using training data for pretraining and notes that "data splits" are covered in the supplementary materials, but it does not provide explicit train/validation/test dataset splits in the main text.
Hardware Specification	No	The paper mentions that the "total amount of compute and the type of resources used" are detailed in the supplementary materials, but no specific hardware specifications (e.g., GPU/CPU models) are provided in the main text.
Software Dependencies	No	The paper mentions using "AutoDock Vina [52]" and refers to a "practical molecular optimization benchmark [15]" with associated software, but it does not provide specific version numbers for any software dependencies.
Experiment Setup	No	The paper states that "implementation details, dataset description & processing, hyperparameter tuning" are included in the Appendix, indicating that these specific experimental setup details are not in the main text.