Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation

Authors: Jiaxuan You, Bowen Liu, Zhitao Ying, Vijay Pande, Jure Leskovec

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that GCPN can achieve 61% improvement on chemical property optimization over state-of-the-art baselines while resembling known molecules, and achieve 184% improvement on the constrained property optimization task.
Researcher Affiliation | Academia | 1 Department of Computer Science, 2 Department of Chemistry, 3 Department of Bioengineering, Stanford University, Stanford, CA 94305
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Link to code and datasets: https://github.com/bowenliu16/rl_graph_generation
Open Datasets | Yes | For the molecule generation experiments, we utilize the ZINC250k molecule dataset [14] that contains 250,000 drug-like commercially available molecules whose maximum atom number is 38. We use the dataset for both expert pretraining and adversarial training.
Dataset Splits | No | The paper mentions using a dataset for training and validation in general terms but does not specify exact train/validation/test splits (percentages or sample counts) for reproducibility.
Hardware Specification | No | We run both deep learning baselines using their released code and allow the baselines to have wall clock running time for roughly 24 hours, while our model can get the results in roughly 8 hours. ... using 32 CPU cores.
Software Dependencies | No | We set up the molecule environment as an OpenAI Gym environment [3] using RDKit [23] and adapt it to the ZINC250k dataset.
Experiment Setup | Yes | We use a 3-layer GCPN as the policy network with 64-dimensional node embedding in all hidden layers, and batch normalization [13] is applied after each layer. Another 3-layer GCN with the same architecture is used for discriminator training. ... we use the PPO algorithm to train the RL objective with the default hyperparameters ... and the learning rate is set as 0.001. The expert pretraining objective is trained with a learning rate of 0.00025, and we do observe that adding this objective contributes to faster convergence and better performance. Both objectives are trained using the Adam optimizer [19] with batch size 32.
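The Open Datasets row quotes the ZINC250k setup: 250,000 drug-like molecules with at most 38 atoms, used for both expert pretraining and adversarial training. Below is a minimal sketch of how such a SMILES dataset could be loaded and filtered with RDKit; the file name `zinc250k.csv` and the `smiles` column name are assumptions for illustration, not details taken from the paper or the released code.

```python
# Hypothetical sketch: load a SMILES file and keep molecules with <= 38 heavy atoms,
# mirroring the ZINC250k description quoted above. File/column names are assumed.
import csv
from rdkit import Chem

MAX_ATOMS = 38  # maximum atom count reported for ZINC250k in the paper

def load_zinc250k(path="zinc250k.csv", smiles_col="smiles"):
    molecules = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            mol = Chem.MolFromSmiles(row[smiles_col])
            if mol is None:
                continue  # skip unparsable SMILES strings
            if mol.GetNumAtoms() <= MAX_ATOMS:  # GetNumAtoms counts heavy atoms by default
                molecules.append(mol)
    return molecules
```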
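The Software Dependencies row notes that the molecule environment is built as an OpenAI Gym environment on top of RDKit. The sketch below shows one plausible shape such an environment could take; the `MoleculeEnv` class name, the simplified "add atom and bond" action encoding, and the QED-based terminal reward are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch of a Gym-style molecule-building environment backed by RDKit.
# The action space, reward, and episode logic are assumptions, not the released code.
import gym
from rdkit import Chem
from rdkit.Chem import QED

class MoleculeEnv(gym.Env):
    def __init__(self, max_atoms=38):
        super().__init__()
        self.max_atoms = max_atoms
        self.mol = None

    def reset(self):
        self.mol = Chem.RWMol()           # editable molecule, start from a single carbon
        self.mol.AddAtom(Chem.Atom("C"))
        return Chem.MolToSmiles(self.mol)

    def step(self, action):
        # action: (atom_symbol, existing_atom_index) -- a simplified "add atom + bond" move
        symbol, bond_to = action
        new_idx = self.mol.AddAtom(Chem.Atom(symbol))
        self.mol.AddBond(int(bond_to), new_idx, Chem.BondType.SINGLE)
        done = self.mol.GetNumAtoms() >= self.max_atoms
        try:
            Chem.SanitizeMol(self.mol)                  # valence/chemistry check
            reward = QED.qed(self.mol) if done else 0.0  # drug-likeness reward at episode end
            obs = Chem.MolToSmiles(self.mol)
        except Exception:
            reward, done, obs = 0.0, True, ""            # invalid molecule ends the episode
        return obs, reward, done, {}
```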
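The Experiment Setup row describes a 3-layer GCPN policy with 64-dimensional node embeddings, batch normalization after each layer, a second 3-layer GCN of the same shape as the discriminator, and Adam optimization with batch size 32 (learning rate 0.001 for the PPO objective, 0.00025 for expert pretraining). The following sketch wires up a generic 3-layer graph convolution stack with those dimensions, written in PyTorch purely for illustration; it uses the standard normalized-adjacency propagation rule and is not the authors' implementation, and the input node feature size is an assumption.

```python
# Illustrative 3-layer graph convolution stack with 64-dim hidden node embeddings and
# batch normalization after each layer, matching the hyperparameters quoted above.
# This is a generic GCN sketch, not the authors' released code.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.bn = nn.BatchNorm1d(out_dim)

    def forward(self, adj, h):
        # adj: (num_nodes, num_nodes) normalized adjacency; h: (num_nodes, in_dim)
        h = self.linear(adj @ h)
        return torch.relu(self.bn(h))

class PolicyGCN(nn.Module):
    def __init__(self, node_feat_dim, hidden_dim=64, num_layers=3):
        super().__init__()
        dims = [node_feat_dim] + [hidden_dim] * num_layers
        self.layers = nn.ModuleList(
            [GCNLayer(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])]
        )

    def forward(self, adj, node_feats):
        h = node_feats
        for layer in self.layers:
            h = layer(adj, h)
        return h  # per-node embeddings consumed by the action heads

# Optimizers with the learning rates quoted above (illustrative wiring only).
policy = PolicyGCN(node_feat_dim=16)  # node feature size is an assumption
rl_optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)         # PPO objective
pretrain_optimizer = torch.optim.Adam(policy.parameters(), lr=2.5e-4)  # expert pretraining
```

In the paper's setup, the discriminator would reuse the same 3-layer, 64-dimensional architecture, followed by a readout and classification head for adversarial training.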