Graph Diffusion Policy Optimization
Authors: Yijing Liu, Chao Du, Tianyu Pang, Chongxuan Li, Min Lin, Wei Chen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that GDPO achieves state-of-the-art performance in various graph generation tasks with complex and diverse objectives. Code is available at https://github.com/sail-sg/GDPO. ... Our paper does not propose any theoretical results. |
| Researcher Affiliation | Collaboration | Yijing Liu¹, Chao Du², Tianyu Pang², Chongxuan Li³, Min Lin², Wei Chen¹. ¹State Key Lab of CAD&CG, Zhejiang University; ²Sea AI Lab, Singapore; ³Renmin University of China |
| Pseudocode | Yes | Algorithm 1: Graph Diffusion Policy Optimization |
| Open Source Code | Yes | Code is available at https://github.com/sail-sg/GDPO. ... In the supplementary materials, we provide the code, dataset, and instructions for reproduction. |
| Open Datasets | Yes | We evaluate GDPO on two benchmark datasets: SBM (200 nodes) and Planar (64 nodes), each consisting of 200 graphs. ... ZINC250k [27] and MOSES [49]. ... In the supplementary materials, we provide the code, dataset, and instructions for reproduction. |
| Dataset Splits | Yes | We set T = 1000, \|T\| = 200, and N = 100. The number of trajectory samples K is 64 for SBM and 256 for Planar. ... During fine-tuning, we keep all layers fixed except for attention, set the learning rate to 0.00001, and utilize gradient clipping to limit the gradient norm to be less than or equal to 1. ... All statistical results are obtained by repeating the experiment five times, and the corresponding standard deviations are provided. |
| Hardware Specification | Yes | We conducted all experiments on a single A100 GPU with 40GB of VRAM and an AMD EPYC 7352 24-core Processor. |
| Software Dependencies | No | QED and SA scores are computed using the RDKit library. However, no specific version number for RDKit or any other software dependency is provided, which is required for reproducibility. |
| Experiment Setup | Yes | We set T = 1000, \|T\| = 200, and N = 100. The number of trajectory samples K is 64 for SBM and 256 for Planar. ... set the learning rate to 0.00001, and utilize gradient clipping to limit the gradient norm to be less than or equal to 1. |
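
The fine-tuning settings quoted in the Experiment Setup row can be collected into a small configuration, together with a sketch of what "limit the gradient norm to be less than or equal to 1" means in practice. This is an illustrative sketch only, not the authors' code: the key names in `CONFIG`, the helper `clip_by_global_norm`, and the `"attn"` naming convention used in `is_trainable` are all assumptions made for this example.

```python
import math

# Values quoted from the table above; the key names are hypothetical.
CONFIG = {
    "T": 1000,                      # "T = 1000" as quoted
    "timestep_subset_size": 200,    # "|T| = 200" as quoted
    "N": 100,                       # "N = 100" as quoted; meaning not stated in the table
    "trajectory_samples_K": {"SBM": 64, "Planar": 256},
    "learning_rate": 1e-5,          # written as 0.00001 in the table
    "max_grad_norm": 1.0,
}


def clip_by_global_norm(grads, max_norm):
    """Rescale gradient vectors so their joint L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total <= max_norm:
        return [list(vec) for vec in grads], total
    scale = max_norm / total
    return [[g * scale for g in vec] for vec in grads], total


def is_trainable(param_name):
    # Assumption: attention parameters carry "attn" in their name.
    # All other layers stay fixed during fine-tuning.
    return "attn" in param_name


if __name__ == "__main__":
    grads = [[3.0], [4.0]]  # joint norm is 5.0, so clipping applies
    clipped, norm_before = clip_by_global_norm(grads, CONFIG["max_grad_norm"])
    print(norm_before, clipped)
```

In a deep-learning framework the same effect is usually obtained with a built-in utility rather than a hand-written helper; the pure-Python version here only makes the rescaling rule explicit.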