Efficient and Degree-Guided Graph Generation via Discrete Diffusion Modeling

Authors: Xiaohui Chen, Jiaxing He, Xu Han, Liping Liu

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The empirical study shows that EDGE is much more efficient than competing methods and can generate large graphs with thousands of nodes. It also outperforms baseline models in generation quality: graphs generated by the proposed model have graph statistics more similar to those of training graphs.
Researcher Affiliation | Academia | Department of Computer Science, Tufts University, Medford, MA, USA. Correspondence to: Xiaohui Chen <xiaohui.chen@tufts.edu>, Li-Ping Liu <liping.liu@tufts.edu>.
Pseudocode | Yes | Algorithm 1: Degree-guided graph generation
Open Source Code | Yes | The implementation of our model is available at github.com/tufts-ml/graph-generation-EDGE.
Open Datasets | Yes | Datasets. We conduct experiments on both generic graph datasets and large networks. ... Community and Ego datasets (You et al., 2018) ... Polblogs (Adamic & Glance, 2005), Cora (Sen et al., 2008), Road-Minnesota (Rossi & Ahmed, 2015), and PPI (Stark et al., 2010). ... QM9 dataset (Ramakrishnan et al., 2014).
Dataset Splits | Yes | We follow You et al. (2018) to generate the Community and Ego datasets and use the same data splitting strategy. ... For large network generation, we do not include validation/test sets in this task.
Hardware Specification | Yes | We train our models on Tesla A100, Tesla V100, or NVIDIA QUADRO RTX 6000 GPU and 32 CPU cores for all experiments. ... The sampling speed reported in Figure 3 of all baselines and our approach is tested on Tesla A100 GPU.
Software Dependencies | No | The paper mentions using PyTorch (Paszke et al., 2019) and PyTorch Geometric (Fey & Lenssen, 2019) but does not provide specific version numbers for these software components.
Experiment Setup | Yes | Table 8 provides hyperparameters for Diffusion (diffusion steps T, noise scheduling), Optimization (learning rate, optimizer, weight decay, batch size, number of epochs/iterations), and Architecture (number of MPBs, hidden dimension, activation function, dropout rate). An illustrative configuration sketch follows below the table.
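To make the Experiment Setup row concrete, here is a minimal sketch of how the hyperparameter groups reported in Table 8 could be organized when reproducing the experiments. Only the three category names and their listed fields come from the row above; the class names and all default values below are hypothetical placeholders, not the paper's actual settings or the layout of the released code.

```python
from dataclasses import dataclass, field

# Hypothetical grouping of the Table 8 hyperparameter categories.
# All default values are placeholders for illustration; the real
# per-dataset values are reported in Table 8 of the paper.

@dataclass
class DiffusionConfig:
    num_steps: int = 128            # diffusion steps T
    noise_schedule: str = "linear"  # noise scheduling

@dataclass
class OptimizationConfig:
    learning_rate: float = 1e-4
    optimizer: str = "adam"
    weight_decay: float = 0.0
    batch_size: int = 8
    num_epochs: int = 1000          # number of epochs / iterations

@dataclass
class ArchitectureConfig:
    num_mpbs: int = 6               # number of MPBs (message-passing blocks)
    hidden_dim: int = 256
    activation: str = "relu"
    dropout: float = 0.1

@dataclass
class EDGEConfig:
    diffusion: DiffusionConfig = field(default_factory=DiffusionConfig)
    optimization: OptimizationConfig = field(default_factory=OptimizationConfig)
    architecture: ArchitectureConfig = field(default_factory=ArchitectureConfig)

# Example: override a single field for one dataset while keeping the rest.
qm9_config = EDGEConfig(diffusion=DiffusionConfig(num_steps=64))
```

Grouping the settings by the same three categories as Table 8 keeps per-dataset overrides in one place when replicating the reported experiments.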