GraphAF: a Flow-based Autoregressive Model for Molecular Graph Generation

Authors: Chence Shi*, Minkai Xu*, Zhaocheng Zhu, Weinan Zhang, Ming Zhang, Jian Tang

ICLR 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental results show that GraphAF is able to generate 68% chemically valid molecules even without chemical knowledge rules and 100% valid molecules with chemical rules. The training process of GraphAF is two times faster than the existing state-of-the-art approach GCPN. After fine-tuning the model for goal-directed property optimization with reinforcement learning, GraphAF achieves state-of-the-art performance on both chemical property optimization and constrained property optimization. |
| Researcher Affiliation | Academia | 1 Department of Computer Science, Peking University, China; 2 Shanghai Jiao Tong University, China; 3 Mila - Québec AI Institute, Canada; 4 Université de Montréal, Canada; 5 HEC Montréal, Canada; 6 CIFAR AI Research Chair |
| Pseudocode | Yes | We summarize the detailed training algorithm into Appendix B (Algorithm 1: Parallel Training Algorithm of GraphAF). |
| Open Source Code | Yes | Code is available at https://github.com/DeepGraphLearning/GraphAF |
| Open Datasets | Yes | We use the ZINC250k molecular dataset (Irwin et al., 2012) for training. |
| Dataset Splits | No | The paper mentions using the ZINC250k dataset for training and evaluates metrics on generated molecules, but it does not specify explicit train/validation/test splits (e.g., percentages or counts per split) or refer to standard predefined splits for reproducibility. A hedged split sketch follows the table. |
| Hardware Specification | Yes | To achieve the results in Table 2, JT-VAE and GCPN take around 24 and 8 hours, respectively, while GraphAF only takes 4 hours on a machine with 1 Tesla V100 GPU and 32 CPU cores. |
| Software Dependencies | No | GraphAF is implemented in PyTorch (Paszke et al., 2017). We use the open-source chemical software RDKit (Landrum, 2016) to preprocess molecules. We use Adam (Kingma & Ba, 2014) to optimize our model. The paper names these frameworks but provides no version numbers for PyTorch or RDKit. A version-logging sketch follows the table. |
| Experiment Setup | Yes | The R-GCN is implemented with 3 layers, and the embedding dimension is set as 128. The max graph size is set as 48 empirically. For density modeling, we train our model for 10 epochs with a batch size of 32 and a learning rate of 0.001, optimized with Adam (Kingma & Ba, 2014). Gamma is set to 0.97 for QED optimization and 0.9 for penalized logP optimization, respectively. We fine-tune the pretrained model for 200 iterations with a fixed batch size of 64, using Adam with a learning rate of 0.0001 and a linear learning-rate warm-up to stabilize training. A hedged configuration sketch follows the table. |