CORE: Automatic Molecule Optimization Using Copy & Refine Strategy

Authors: Tianfan Fu, Cao Xiao, Jimeng Sun

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We tested CORE and baselines using the ZINC database, and CORE obtained up to 11% and 21% relative improvement over the baselines in success rate on the complete test set and on the subset with infrequent substructures, respectively. (See the relative-improvement arithmetic sketched after the table.)
Researcher Affiliation | Collaboration | Tianfan Fu (1), Cao Xiao (2), Jimeng Sun (1); (1) College of Computing, Georgia Institute of Technology, Atlanta, USA; (2) Analytics Center of Excellence, IQVIA, Cambridge, USA; tfu42@gatech.edu, cao.xiao@iqvia.com, jsun@cc.gatech.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available at https://github.com/futianfan/CORE.
Open Datasets | Yes | First, we introduce the molecule data that we are using: ZINC contains 250K drug molecules extracted from the ZINC database (Sterling and Irwin 2015). (A loading/validation sketch follows the table.)
Dataset Splits | Yes | Table 3: Statistics of the 4 datasets DRD2, QED, LogP04, and LogP06. ... Columns: Dataset | # Training Pairs | # Valid Pairs | # Test. (A pair-loading sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. It only discusses general aspects of the experimental setup.
Software Dependencies | No | The paper mentions software components such as the "Adam optimizer" and a "multi-layer feedforward network" but does not specify any software or library names with version numbers required for replication.
Experiment Setup | Yes | In this section, we provide the implementation details for reproducibility, especially the hyperparameter settings. We follow most of the hyperparameter settings of (Jin et al. 2019). For all baseline methods and datasets, the maximal epoch number is set to 10 and the batch size to 32. In the encoder module, the embedding size is set to 300. The depths of the message passing networks are set to 6 and 3 for the tree and the graph, respectively. The initial learning rate is set to 1e-3 with the Adam optimizer, and the learning rate is annealed by 0.8 every epoch. (A configuration sketch follows the table.)
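
To make the headline numbers concrete, relative improvement in success rate is (CORE - baseline) / baseline. A minimal sketch, using hypothetical success rates purely for illustration; the paper reports only the relative figures quoted above:

    def relative_improvement(core_rate: float, baseline_rate: float) -> float:
        """Relative improvement of CORE over a baseline, as a fraction."""
        return (core_rate - baseline_rate) / baseline_rate

    # Hypothetical numbers for illustration only -- not taken from the paper.
    baseline_rate = 0.50
    core_rate = 0.555
    print(f"{relative_improvement(core_rate, baseline_rate):.1%}")  # -> 11.0%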
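For the Open Datasets row: a minimal sketch of how the 250K ZINC SMILES strings could be loaded and validated with RDKit. The file name zinc250k.smi is an assumption, not a path taken from the paper or the repository:

    from rdkit import Chem

    # Hypothetical file with one SMILES string per line (name assumed).
    with open("zinc250k.smi") as f:
        smiles = [line.strip() for line in f if line.strip()]

    # Keep only strings RDKit can parse into valid molecules.
    valid = [s for s in smiles if Chem.MolFromSmiles(s) is not None]
    print(f"{len(valid)} / {len(smiles)} molecules parsed successfully")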
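For the Dataset Splits row: the training, validation, and test sets consist of (source, target) molecule pairs. A sketch of reading such pairs, assuming one whitespace-separated pair per line; the exact file format and name are assumptions, not verified against the repository:

    def load_pairs(path: str) -> list[tuple[str, str]]:
        """Read (source_smiles, target_smiles) pairs, one per line."""
        pairs = []
        with open(path) as f:
            for line in f:
                parts = line.split()
                if len(parts) == 2:  # skip malformed or empty lines
                    pairs.append((parts[0], parts[1]))
        return pairs

    train_pairs = load_pairs("train_pairs.txt")  # hypothetical file name
    print(f"{len(train_pairs)} training pairs")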
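For the Experiment Setup row: a minimal PyTorch sketch of the reported optimization settings (Adam, initial learning rate 1e-3, annealed by 0.8 per epoch, batch size 32, 10 epochs, embedding size 300, MPN depths 6 and 3). The model is a placeholder; this is an assumed framing, not the authors' actual training script:

    import torch

    # Hyperparameters as reported in the paper's experiment setup.
    EPOCHS = 10
    BATCH_SIZE = 32
    EMBED_SIZE = 300
    TREE_MPN_DEPTH = 6
    GRAPH_MPN_DEPTH = 3
    INIT_LR = 1e-3
    LR_ANNEAL = 0.8  # per-epoch multiplicative decay

    model = torch.nn.Linear(EMBED_SIZE, EMBED_SIZE)  # placeholder, not the CORE model
    optimizer = torch.optim.Adam(model.parameters(), lr=INIT_LR)
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=LR_ANNEAL)

    for epoch in range(EPOCHS):
        # ... one pass over mini-batches of size BATCH_SIZE would go here ...
        scheduler.step()  # anneal the learning rate by 0.8 after every epoch
        print(f"epoch {epoch + 1}: lr = {scheduler.get_last_lr()[0]:.2e}")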