Molecule Generation by Principal Subgraph Mining and Assembling

Authors: Xiangzhe Kong, Wenbing Huang, Zhixing Tan, Yang Liu

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We conduct extensive experiments on the ZINC250K [16] and QM9 [6, 37] datasets. Results demonstrate that our PS-VAE outperforms state-of-the-art models on distribution learning, (constrained) property optimization, as well as GuacaMol goal-directed benchmarks [7]." |
| Researcher Affiliation | Academia | Xiangzhe Kong, Zhixing Tan, Yang Liu: Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, Tsinghua University; Institute for AIR, Tsinghua University; Beijing Academy of Artificial Intelligence. Wenbing Huang: Gaoling School of Artificial Intelligence, Renmin University of China; Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China. Contact: Jackie_KXZ@outlook.com, hwenbing@126.com, {zxtan, liuyang2011}@tsinghua.edu.cn |
| Pseudocode | Yes | "Algorithm 1: Principal Subgraph Extraction" |
| Open Source Code | Yes | "Codes for our PS-VAE are available at https://github.com/THUNLP-MT/PS-VAE." |
| Open Datasets | Yes | "We use the ZINC250K [16] dataset for training, which contains 250,000 drug-like molecules with up to 38 atoms. For the GuacaMol benchmark, we add extra results on the QM9 [6, 37] dataset, which has 133,014 molecules with up to 23 atoms." |
| Dataset Splits | No | The paper states that it uses ZINC250K for training and QM9 for the GuacaMol benchmarks, but it does not explicitly give training/validation/test splits as percentages or sample counts in the main text. |
| Hardware Specification | No | The paper states that hardware specifications are in Appendix F, which is not provided in the given text. |
| Software Dependencies | No | The paper mentions software components such as GNN, MLP, and GRU, but it provides no version numbers for these or other software dependencies in the main text. It refers to Appendix G for details, which is not provided. |
| Experiment Setup | Yes | "PS-VAE is trained for 6 epochs with a batch size of 32 and a learning rate of 0.001. We set α = 0.1 and initialize β = 0. We adopt a warm-up method that increases β by 0.002 every 1000 steps to a maximum of 0.01." |
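The β warm-up quoted in the Experiment Setup row is a simple step schedule. A minimal sketch is given below; the increment (0.002), interval (1000 steps), initial value (0), and cap (0.01) come from the quoted text, while the function name `kl_beta` and the zero-based step-counting convention are assumptions for illustration.

```python
def kl_beta(step, init_beta=0.0, increment=0.002, interval=1000, max_beta=0.01):
    """KL-weight warm-up: raise beta by `increment` once every `interval`
    training steps, never exceeding `max_beta`."""
    return min(init_beta + increment * (step // interval), max_beta)

# Under these settings beta reaches its cap of 0.01 after 5000 steps:
# kl_beta(0) == 0.0, kl_beta(1000) == 0.002, kl_beta(5000) == 0.01
```

With these parameters the schedule saturates early in training (5000 steps at batch size 32 is well within the 6 reported epochs on 250K molecules), keeping the KL term's weight small throughout.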