Pard: Permutation-Invariant Autoregressive Diffusion for Graph Generation
Authors: Lingxiao Zhao, Xueying Ding, Leman Akoglu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | PARD achieves new SOTA performance on many molecular and non-molecular datasets without any extra features, significantly outperforming DiGress [44]. Thanks to efficient architecture and parallel training, PARD scales to large datasets like MOSES [33] with 1.9M graphs. PARD is open-sourced at https://github.com/LingxiaoShawn/Pard |
| Researcher Affiliation | Academia | Lingxiao Zhao Carnegie Mellon University lingxiaozlx@gmail.com Xueying Ding Carnegie Mellon University xding2@andrew.cmu.edu Leman Akoglu Carnegie Mellon University lakoglu@andrew.cmu.edu |
| Pseudocode | Yes | We provide the training and inference algorithms for PARD in Apdx. A.8. Specifically, Algo. 2 is used to train the next block's size prediction model; Algo. 3 is used to train the shared diffusion for block conditional probabilities; and Algo. 4 presents the generation steps. ... Algorithm 1 Structural Partial Order ϕ (A hedged sketch of this blockwise generation loop appears below the table.) |
| Open Source Code | Yes | PARD is open-sourced at https://github.com/LingxiaoShawn/Pard |
| Open Datasets | Yes | We experiment with three different molecular datasets used across the graph generation literature: (1) QM9 [34] (2) ZINC250K [23], and (3) MOSES [33] that contains more than 1.9 million graphs. ... We use five generic graph datasets with various structures and semantics: (1) COMMUNITY-SMALL [48], (2) CAVEMAN [47], (3) CORA [35], (4) BREAST [15], and (5) GRID [48]. |
| Dataset Splits | Yes | We use an 80%-20% train and test split, and among the train data we split an additional 20% as validation. (See the split sketch below the table.) |
| Hardware Specification | Yes | We use a single RTX-A6000 GPU for all experiments. |
| Software Dependencies | No | We use Pytorch Geometric [14], and we implement our combination of PPGN and Transformer by referencing the code in Maron et al. [30] and Ma et al. [29]. Additionally, we use Pytorch Lightning [12] for training and keeping the code clean. (Software names are mentioned, but specific version numbers are not provided). |
| Experiment Setup | Yes | We use Adam optimizer with cosine decay learning rate scheduler to train. For diffusion and blocksize prediction, we also input the embedding of block id and node degree as additional feature... For each block's diffusion model, we set the maximum time steps to 40 without much tuning. (An optimizer/scheduler sketch appears below the table.) |
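
The pseudocode row describes three algorithms: block-size prediction (Algo. 2), a shared block-conditional diffusion model (Algo. 3), and generation (Algo. 4). The following is a minimal runnable sketch of how such a blockwise autoregressive generation loop could be organized; the `DummySizePredictor`, `DummyBlockDiffusion`, and `generate` names are hypothetical stand-ins, not the authors' implementation.

```python
import torch

T = 40  # maximum diffusion steps per block, the value reported in the setup row


class DummySizePredictor:
    """Hypothetical stand-in for the block-size model trained by Algo. 2."""

    def predict(self, num_nodes_so_far: int, block_id: int) -> int:
        # Toy rule: emit three blocks of two nodes, then signal termination.
        return 2 if block_id < 3 else 0


class DummyBlockDiffusion:
    """Hypothetical stand-in for the shared block-conditional diffusion (Algo. 3)."""

    def sample_prior(self, block_size: int) -> torch.Tensor:
        # Start the new block from random categorical node states.
        return torch.randint(0, 4, (block_size,))

    def denoise(self, block: torch.Tensor, context_size: int, t: int) -> torch.Tensor:
        # A real model would predict the denoising posterior at step t,
        # conditioned on the graph generated so far.
        return block


def generate(size_model, diffusion_model, max_blocks: int = 32) -> list:
    """Blockwise autoregressive generation loosely following Algo. 4."""
    nodes: list = []
    for block_id in range(max_blocks):
        block_size = size_model.predict(len(nodes), block_id)
        if block_size == 0:  # a predicted size of zero ends generation
            break
        block = diffusion_model.sample_prior(block_size)
        for t in reversed(range(1, T + 1)):  # reverse diffusion over T steps
            block = diffusion_model.denoise(block, len(nodes), t)
        nodes.extend(block.tolist())
    return nodes


print(generate(DummySizePredictor(), DummyBlockDiffusion()))
```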
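
The dataset-splits row states an 80%-20% train/test split with a further 20% of the train portion held out for validation. A minimal sketch of that split over graph indices, assuming a simple seeded shuffle (not the authors' code):

```python
import torch


def split_indices(num_graphs: int, seed: int = 0):
    """80%/20% train-test split, then 20% of the train set carved out as validation."""
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(num_graphs, generator=g)
    n_test = int(0.2 * num_graphs)
    test_idx, train_idx = perm[:n_test], perm[n_test:]
    n_val = int(0.2 * len(train_idx))
    val_idx, train_idx = train_idx[:n_val], train_idx[n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 640 160 200
```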
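
The experiment-setup row reports Adam with a cosine-decay learning-rate schedule. A minimal PyTorch sketch of that combination; the model, learning rate, and epoch count are placeholders, not values taken from the paper:

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 16)  # placeholder module standing in for PARD
num_epochs = 100           # placeholder; not reported in this excerpt

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is a placeholder
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)

for epoch in range(num_epochs):
    # ... one training epoch over the graph batches would go here ...
    optimizer.step()   # illustrative; normally called per batch after backward()
    scheduler.step()   # decay the learning rate along a cosine curve each epoch
```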