Scalable Deep Generative Modeling for Sparse Graphs
Authors: Hanjun Dai, Azade Nazi, Yujia Li, Bo Dai, Dale Schuurmans
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on several benchmarks show that the proposed approach not only scales to orders of magnitude larger graphs than previously possible with deep autoregressive graph generative models, but also yields better graph generation quality. On several benchmark datasets, including synthetic graphs and real-world graphs of proteins, 3D mesh and SAT instances, BiGG is able to achieve comparable or superior sample quality than the previous state-of-the-art, while being orders of magnitude more scalable. |
| Researcher Affiliation | Industry | 1 Google Research, Brain Team; 2 DeepMind. |
| Pseudocode | Yes | Algorithm 1 Generating outgoing edges of node u. Algorithm 2 Generating graph using BiGG. (A hedged sketch of this recursive generation is given after the table.) |
| Open Source Code | Yes | Please refer to our released open source code located at https://github.com/google-research/google-research/tree/master/bigg for more implementation and experimental details. |
| Open Datasets | Yes | This benchmark has four different datasets: (1) Grid, 100 2D grid graphs; (2) Protein, 918 protein graphs (Dobson & Doig, 2003); (3) Point cloud, 3D point clouds of 41 household objects (Neumann et al., 2013); (4) Lobster, 100 random Lobster graphs (Golomb, 1996)... We use the train/test split of SAT instances obtained from G2SAT website. (See the dataset-construction sketch after this table.) |
| Dataset Splits | No | The paper mentions splitting data into training and test sets but does not explicitly describe a separate validation split or its size/methodology for all experiments. For example, "We use the same protocol as Liao et al. (2019) that splits the graphs into training and test sets." |
| Hardware Specification | No | The paper mentions running models "on a single GPU" but does not specify any particular GPU model (e.g., NVIDIA A100), CPU model, or other detailed hardware specifications. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) that are required to reproduce the experiments. |
| Experiment Setup | Yes | Empirically we use L = 256 in all experiments, which saves 50% of the memory during training without losing any information in representation. Such model has ϵ probability to sample from Bernoulli distribution (as in Eqs. (8) and (9)) each step, and 1 − ϵ to pick the best option otherwise. (See the decoding sketch after this table.) |
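The Pseudocode row refers to Algorithm 1 (generating the outgoing edges of a node u) and Algorithm 2 (generating a whole graph with BiGG). Below is a minimal Python sketch of the underlying idea: recursively making binary "does this interval contain an edge?" decisions so that each edge costs O(log n) steps. The helper `has_edge_prob` and the independence of decisions across intervals are simplifying assumptions of this sketch; the released BiGG code parameterizes these probabilities with tree-structured recurrent states rather than a fixed function.

```python
import random

def gen_edges(u, lo, hi, has_edge_prob):
    """Recursively decide which nodes in [lo, hi) receive an edge from u.

    has_edge_prob(u, lo, hi) is a hypothetical stand-in for the model's
    conditional probability that the interval contains at least one edge.
    """
    if lo >= hi:
        return []
    if random.random() >= has_edge_prob(u, lo, hi):
        return []                       # interval contains no edges
    if hi - lo == 1:
        return [(u, lo)]                # single candidate node -> emit edge
    mid = (lo + hi) // 2
    # Recurse into both halves; each emitted edge costs O(log n) decisions.
    return (gen_edges(u, lo, mid, has_edge_prob)
            + gen_edges(u, mid, hi, has_edge_prob))

def gen_graph(n, has_edge_prob):
    """Generate a lower-triangular edge list node by node (Algorithm 2 flavor)."""
    edges = []
    for u in range(1, n):
        edges.extend(gen_edges(u, 0, u, has_edge_prob))
    return edges

# Toy usage with a constant edge probability (purely illustrative):
# edges = gen_graph(8, lambda u, lo, hi: 0.4)
```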
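The Open Datasets and Dataset Splits rows quote the benchmark description; the sketch below shows how comparable synthetic graphs (2D grids and random Lobster graphs) could be produced with networkx and split into train/test sets. The grid side lengths, Lobster parameters, and the 20% test fraction are illustrative guesses, not values stated in the paper, whose split protocol follows Liao et al. (2019).

```python
import random
import networkx as nx

def make_grid_graphs(num_graphs=100, seed=0):
    """Illustrative 2D grid benchmark; side lengths here are arbitrary choices."""
    rng = random.Random(seed)
    return [nx.grid_2d_graph(rng.randint(10, 20), rng.randint(10, 20))
            for _ in range(num_graphs)]

def make_lobster_graphs(num_graphs=100, seed=0):
    """Illustrative random Lobster benchmark via networkx's built-in generator."""
    return [nx.random_lobster(80, 0.7, 0.7, seed=seed + i)
            for i in range(num_graphs)]

def train_test_split(graphs, test_frac=0.2, seed=0):
    """Simple shuffled split; the paper's exact protocol follows Liao et al. (2019)."""
    graphs = list(graphs)
    random.Random(seed).shuffle(graphs)
    n_test = int(len(graphs) * test_frac)
    return graphs[n_test:], graphs[:n_test]
```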
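The Experiment Setup row describes an ϵ-mixed decoding rule: with probability ϵ each binary decision is sampled from its Bernoulli distribution (Eqs. (8) and (9)), and with probability 1 − ϵ the most likely option is taken. The hypothetical helper below illustrates that rule; the name `p_edge` and the greedy 0.5 threshold are assumptions of this sketch, not details taken from the released code.

```python
import random

def epsilon_mixed_decision(p_edge, eps):
    """With probability eps, sample the binary decision from Bernoulli(p_edge);
    otherwise pick the most likely option (greedy).

    p_edge is the model's predicted probability for the current binary decision;
    eps trades off sample diversity against likelihood.
    """
    if random.random() < eps:
        return random.random() < p_edge   # stochastic: Bernoulli sample
    return p_edge >= 0.5                  # greedy: pick the best option
```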