Molecule Generation by Principal Subgraph Mining and Assembling
Authors: Xiangzhe Kong, Wenbing Huang, Zhixing Tan, Yang Liu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on the ZINC250K [16] and QM9 [6, 37] datasets. Results demonstrate that our PS-VAE outperforms state-of-the-art models on distribution learning, (constrained) property optimization as well as GuacaMol goal-directed benchmarks [7]. |
| Researcher Affiliation | Academia | Xiangzhe Kong1, Wenbing Huang4,5, Zhixing Tan1, Yang Liu1,2,3. 1Dept. of Comp. Sci. & Tech., Institute for AI, BNRist Center, Tsinghua University; 2Institute for AIR, Tsinghua University; 3Beijing Academy of Artificial Intelligence; 4Gaoling School of Artificial Intelligence, Renmin University of China; 5Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China. Jackie_KXZ@outlook.com, hwenbing@126.com, {zxtan, liuyang2011}@tsinghua.edu.cn |
| Pseudocode | Yes | Algorithm 1 Principal Subgraph Extraction (see the Python sketch after this table) |
| Open Source Code | Yes | Codes for our PS-VAE are available at https://github.com/THUNLP-MT/PS-VAE. |
| Open Datasets | Yes | We use the ZINC250K [16] dataset for training, which contains 250,000 drug-like molecules up to 38 atoms. For the GuacaMol benchmark, we add extra results on the QM9 [6, 37] dataset, which has 133,014 molecules up to 23 atoms. |
| Dataset Splits | No | The paper uses ZINC250K for training and adds QM9 for the GuacaMol benchmark, but the main text does not explicitly provide training/validation/test splits as percentages or sample counts. |
| Hardware Specification | No | The paper states that hardware specifications are in Appendix F, which is not provided in the given text. |
| Software Dependencies | No | The paper mentions software components such as GNN, MLP, and GRU but does not provide version numbers for these or any other dependencies in the main text; it refers to Appendix G for details, which is not included in the provided text. |
| Experiment Setup | Yes | PS-VAE is trained for 6 epochs with a batch size of 32 and a learning rate of 0.001. We set α = 0.1 and initialize β = 0. We adopt a warm-up method that increases β by 0.002 every 1000 steps to a maximum of 0.01. (See the warm-up sketch below.) |
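
The pseudocode evidence above refers to Algorithm 1, a BPE-style procedure that grows a subgraph vocabulary by repeatedly merging the most frequent pair of bond-connected fragments across the corpus. Below is a minimal Python sketch of that merge loop, assuming RDKit for fragment canonicalization; the function names (`extract_principal_subgraphs`, `frag_smiles`) and the brute-force pair enumeration are our own illustration, not the authors' released implementation (see the repository linked above for that).

```python
from collections import Counter
from rdkit import Chem

def frag_smiles(mol, atoms):
    # Canonical SMILES label for a connected set of atom indices.
    return Chem.MolFragmentToSmiles(mol, atomsToUse=sorted(atoms))

def _connected(mol, f1, f2):
    # True if at least one bond links the two fragments.
    return any(mol.GetBondBetweenAtoms(a, b) is not None
               for a in f1 for b in f2)

def extract_principal_subgraphs(smiles_list, vocab_size):
    # Each molecule starts as a partition into single-atom fragments.
    state = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        state.append((mol, [{a.GetIdx()} for a in mol.GetAtoms()]))

    # Initial vocabulary: all single atoms seen in the corpus.
    vocab = {frag_smiles(mol, f) for mol, frags in state for f in frags}

    while len(vocab) < vocab_size:
        # Count every candidate formed by merging two bond-connected fragments.
        counts = Counter()
        for mol, frags in state:
            for i in range(len(frags)):
                for j in range(i + 1, len(frags)):
                    if _connected(mol, frags[i], frags[j]):
                        counts[frag_smiles(mol, frags[i] | frags[j])] += 1
        if not counts:
            break  # every molecule collapsed into one fragment
        best = counts.most_common(1)[0][0]
        vocab.add(best)
        # Greedily merge occurrences of the winning fragment in every molecule.
        for mol, frags in state:
            merged = True
            while merged:
                merged = False
                for i in range(len(frags)):
                    for j in range(i + 1, len(frags)):
                        if (_connected(mol, frags[i], frags[j]) and
                                frag_smiles(mol, frags[i] | frags[j]) == best):
                            frags[i] |= frags[j]
                            del frags[j]
                            merged = True
                            break
                    if merged:
                        break
    return vocab
```

For instance, `extract_principal_subgraphs(["CCO", "CCN", "CCC"], vocab_size=8)` would add the two-carbon fragment `CC` first, since it is the most frequent bond-connected pair in that toy corpus, before considering larger merges.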
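Similarly, the β warm-up quoted in the Experiment Setup row is simple enough to pin down as code. A minimal sketch using only the numbers reported above (β starts at 0, increases by 0.002 every 1000 steps, capped at 0.01); the function name `kl_beta` is ours:

```python
def kl_beta(step, init=0.0, increment=0.002, every=1000, beta_max=0.01):
    """KL weight at a given training step: +0.002 every 1000 steps, capped at 0.01."""
    return min(init + (step // every) * increment, beta_max)

# β reaches its 0.01 cap after 5 increments, i.e. at step 5000.
for s in (0, 1000, 3000, 5000, 10_000):
    print(s, round(kl_beta(s), 3))
```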