Scalable Deep Generative Modeling for Sparse Graphs

Authors: Hanjun Dai, Azade Nazi, Yujia Li, Bo Dai, Dale Schuurmans

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on several benchmarks show that the proposed approach not only scales to orders of magnitude larger graphs than previously possible with deep autoregressive graph generative models, but also yields better graph generation quality. On several benchmark datasets, including synthetic graphs and real-world graphs of proteins, 3D mesh and SAT instances, BiGG is able to achieve comparable or superior sample quality to the previous state-of-the-art, while being orders of magnitude more scalable.
Researcher Affiliation | Industry | Google Research, Brain Team; DeepMind.
Pseudocode | Yes | Algorithm 1: Generating outgoing edges of node u. Algorithm 2: Generating graph using BiGG. (See the illustrative sketch after the table.)
Open Source Code | Yes | Please refer to our released open source code located at https://github.com/google-research/google-research/tree/master/bigg for more implementation and experimental details.
Open Datasets | Yes | This benchmark has four different datasets: (1) Grid, 100 2D grid graphs; (2) Protein, 918 protein graphs (Dobson & Doig, 2003); (3) Point cloud, 3D point clouds of 41 household objects (Neumann et al., 2013); (4) Lobster, 100 random Lobster graphs (Golomb, 1996)... We use the train/test split of SAT instances obtained from the G2SAT website. (See the dataset sketch after the table.)
Dataset Splits | No | The paper mentions splitting data into training and test sets but does not explicitly describe a separate validation split or its size/methodology for all experiments. For example, "We use the same protocol as Liao et al. (2019) that splits the graphs into training and test sets."
Hardware Specification | No | The paper mentions running models "on a single GPU" but does not specify any particular GPU model (e.g., NVIDIA A100), CPU model, or other detailed hardware specifications.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) that are required to reproduce the experiments.
Experiment Setup | Yes | Empirically we use L = 256 in all experiments, which saves 50% of the memory during training without losing any information in representation. Such a model has probability ϵ of sampling from the Bernoulli distribution (as in Eq. (8), (9)) at each step, and 1 − ϵ of picking the best option otherwise. (See the sampling sketch after the table.)
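
The Pseudocode row names Algorithm 1 (generating the outgoing edges of node u) and Algorithm 2 (generating the whole graph with BiGG). As a rough, simplified illustration of that two-level structure, the Python sketch below recursively halves the candidate-neighbor range and makes a Bernoulli decision per interval; the helper has_edge_in and its constant probability are hypothetical stand-ins, not the learned, context-dependent probabilities of the actual model.

```python
import random

# Hypothetical stand-in: in BiGG this probability comes from the learned model
# conditioned on the generation history, not a constant.
def has_edge_in(lo, hi):
    return random.random() < 0.3

def gen_outgoing_edges(u, lo, hi, edges):
    # Simplified analogue of Algorithm 1: recursively decide which candidate
    # neighbors in the half-open range [lo, hi) receive an edge from node u.
    if lo >= hi or not has_edge_in(lo, hi):
        return
    if hi - lo == 1:
        edges.append((u, lo))
        return
    mid = (lo + hi) // 2
    gen_outgoing_edges(u, lo, mid, edges)   # left half of the range
    gen_outgoing_edges(u, mid, hi, edges)   # right half of the range

def gen_graph(num_nodes):
    # Simplified analogue of Algorithm 2: generate each node's edges in turn,
    # connecting only to lower-indexed nodes under a fixed node ordering.
    edges = []
    for u in range(1, num_nodes):
        gen_outgoing_edges(u, 0, u, edges)
    return edges

print(gen_graph(8))
```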
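
The Open Datasets row lists, among others, 100 2D grid graphs and 100 random Lobster graphs. A minimal sketch of how such synthetic benchmarks could be regenerated with networkx is shown below; the size ranges and parameters are illustrative assumptions, not the exact settings used in the paper.

```python
import networkx as nx

# Illustrative parameters only; the paper's exact graph sizes may differ.
grids = [nx.grid_2d_graph(m, n)                      # 10x10 ... 19x19 grids -> 100 graphs
         for m in range(10, 20) for n in range(10, 20)]

lobsters = [nx.random_lobster(80, 0.7, 0.7, seed=i)  # ~80 backbone nodes per graph
            for i in range(100)]                     # 100 random Lobster graphs

print(len(grids), len(lobsters))
```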
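
The Experiment Setup row describes a sampling scheme that, at each step, draws from the Bernoulli distribution with probability ϵ and otherwise greedily picks the most likely option. A minimal PyTorch sketch of one such step is given below; the function name and the scalar-logit interface are assumptions for illustration, not the released implementation.

```python
import torch

def sample_edge_decision(logit: torch.Tensor, eps: float) -> torch.Tensor:
    # Bernoulli parameter implied by the model's logit for this decision.
    p = torch.sigmoid(logit)
    if torch.rand(()) < eps:
        # With probability eps: sample stochastically from the Bernoulli.
        return torch.bernoulli(p)
    # With probability 1 - eps: greedily pick the more likely option.
    return (p > 0.5).float()

decision = sample_edge_decision(torch.tensor(0.2), eps=0.1)
print(decision)
```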