Efficient and Scalable Graph Generation through Iterative Local Expansion

Authors: Andreas Bergmeister, Karolis Martinkus, Nathanaël Perraudin, Roger Wattenhofer

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments show that our model achieves state-of-the-art performance on well-established benchmark datasets while successfully scaling to graphs with at least 5000 nodes. Our method is also the first to successfully extrapolate to graphs outside of the training distribution, showcasing a much better generalization capability over existing methods."
Researcher Affiliation | Collaboration | Andreas Bergmeister (ETH Zürich), Karolis Martinkus (Prescient Design), Nathanaël Perraudin (SDSC, ETH Zürich), Roger Wattenhofer (DISCO, ETH Zürich)
Pseudocode | Yes | The paper provides four algorithms: Algorithm 1 (Random Coarsening Sequence Sampling) samples a coarsening sequence for a given graph; Algorithm 2 (Randomized Greedy Min-Cost Partitioning) iteratively selects the cheapest contraction set from the candidate sets; Algorithm 3 (SDE Sampling) generates graphs by simulating the backward process with a given denoising model; Algorithm 4 (Diffusion Loss) computes the loss for given node and edge features v_l and e_l. A sketch of Algorithm 2 is given below.
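Based only on the one-line description of Algorithm 2 above, a minimal Python sketch of a randomized greedy min-cost partitioning loop might look as follows. The function name, the disjointness constraint, and uniform tie-breaking among minimum-cost candidates are assumptions; the paper's Algorithm 2 is the authoritative version.

```python
import random

def randomized_greedy_min_cost_partition(candidate_sets, cost):
    """Greedily build a partition from candidate contraction sets.

    candidate_sets: list of frozensets of node ids (assumed representation).
    cost: function mapping a candidate set to a float.
    """
    remaining = list(candidate_sets)
    partition = []
    while remaining:
        # Greedy step: find the minimum cost among the remaining candidates.
        c_min = min(cost(s) for s in remaining)
        # Randomized step: break ties among the cheapest candidates uniformly.
        chosen = random.choice([s for s in remaining if cost(s) == c_min])
        partition.append(chosen)
        # Keep the selected contraction sets disjoint by dropping overlaps.
        remaining = [s for s in remaining if s.isdisjoint(chosen)]
    return partition

# Hypothetical usage: contract cheap node pairs, cost = set size.
candidates = [frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})]
print(randomized_greedy_min_cost_partition(candidates, cost=len))
```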
Open Source Code | Yes | The implementation is available at https://github.com/AndreasBergmeister/graph-generation.
Open Datasets | Yes | "We utilize the following synthetic and real-world datasets for our experimental evaluation:"
- Planar graphs: 200 planar graphs with 64 nodes each, from Martinkus et al. (2022).
- SBM (Stochastic Block Model) graphs: 200 graphs with 2 to 5 communities, also from Martinkus et al. (2022).
- Tree graphs: 200 random trees with 64 nodes each, generated with NetworkX (Hagberg et al., 2008); a generation sketch follows this list.
- Protein graphs: protein graph representations from Dobson & Doig (2003).
- Point cloud graphs: the point cloud dataset used in Liao et al. (2020), consisting of 41 point clouds of household objects (Neumann et al., 2013).
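The tree dataset is straightforward to regenerate. A minimal sketch, assuming per-graph integer seeds (the paper does not specify a seeding scheme):

```python
import networkx as nx

# 200 uniformly random trees with 64 nodes each, via NetworkX.
# Note: nx.random_tree was removed in NetworkX 3.4 in favor of
# nx.random_labeled_tree, which takes the same (n, seed) arguments.
trees = [nx.random_tree(64, seed=i) for i in range(200)]
assert all(nx.is_tree(t) and t.number_of_nodes() == 64 for t in trees)
```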
Dataset Splits | Yes | "For consistency, we employ the train/test split method proposed by Martinkus et al. (2022)." Specifically, 20% of the graphs are held out for testing, and the remaining data is partitioned into 80% training and 20% validation. Table 6 (dataset statistics) lists the absolute train/validation/test counts for each dataset.
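The described procedure amounts to a 20% test holdout followed by an 80/20 train/validation split of the remainder. A minimal sketch, assuming a shuffled split with a fixed seed (the exact method is that of Martinkus et al. (2022)):

```python
import random

def split_dataset(graphs, seed=0):
    """Hold out 20% for testing, then split the rest 80/20 train/val."""
    graphs = list(graphs)
    random.Random(seed).shuffle(graphs)  # shuffle and seed are assumptions
    n_test = round(0.2 * len(graphs))
    test, rest = graphs[:n_test], graphs[n_test:]
    n_val = round(0.2 * len(rest))
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test

# For a 200-graph dataset this yields 128 train / 32 val / 40 test.
```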
Hardware Specification | Yes | Experiments were conducted on a single NVIDIA RTX A6000 GPU.
Software Dependencies | No | The paper mentions the "PyTorch Geometric framework (Fey & Lenssen, 2019)" but gives no version numbers for PyTorch Geometric, PyTorch, or any other critical dependency, which a reproducible description requires.
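Absent pinned versions, anyone rerunning the released code should record the environment themselves; both core libraries expose a version string:

```python
# Log the versions of the core dependencies actually used at run time.
import torch
import torch_geometric

print("torch:", torch.__version__)
print("torch_geometric:", torch_geometric.__version__)
```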
Experiment Setup | Yes | "We train our model using the Adam optimizer (Kingma & Ba, 2017) and an initial learning rate of 10^-4. A comprehensive summary of the additional hyperparameters employed for our model, as well as the baselines against which we compare, can be found in Table 7."
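A minimal sketch of the stated optimization setup in PyTorch; the Linear module is a placeholder for the actual denoising model, and all other hyperparameters are those listed in the paper's Table 7:

```python
import torch

# Adam with an initial learning rate of 1e-4, as stated above.
model = torch.nn.Linear(16, 16)  # stand-in for the denoising model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```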