Graph Generative Model for Benchmarking Graph Neural Networks
Authors: Minji Yoon, Yue Wu, John Palowitch, Bryan Perozzi, Russ Salakhutdinov
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on real-world graphs with a diverse set of GNN models demonstrate CGT provides significant improvement over existing generative models in terms of benchmark effectiveness (up to 1.03 higher Spearman correlations, up to 33% lower MSE between original and reproduced GNN accuracies), scalability (up to 35k nodes and 8k node attributes), and privacy guarantees (k-anonymity and differential privacy for node attributes). |
| Researcher Affiliation | Collaboration | 1 Carnegie Mellon University, 2 Google Research. |
| Pseudocode | No | The paper describes its methods through text and diagrams (e.g., Figure 2 and Figure 3) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Our code is publicly available. |
| Open Datasets | Yes | We evaluate on seven public datasets: three citation networks (Cora, Citeseer, and Pubmed) (Sen et al., 2008), two co-purchase graphs (Amazon Computer and Amazon Photo) (Shchur et al., 2018), and two co-authorship graphs (MS CS and MS Physic) (Shchur et al., 2018). |
| Dataset Splits | Yes | For GNN training, we split 50%/10%/40% of each dataset into the training/validation/test sets, respectively. |
| Hardware Specification | Yes | All experiments were conducted on the same p3.2xlarge Amazon EC2 instance. We run CGT on 4 NVIDIA TITAN X GPUs with 12 GB memory size with sampling number 5 and K = 30 for K-anonymity. |
| Software Dependencies | No | The paper mentions using 'Google's differential privacy libraries' and 'Opacus' for DP K-means and DP-SGD, but it does not provide specific version numbers for these libraries or other software dependencies. |
| Experiment Setup | Yes | For our Computation Graph Transformer model, we use 3-layered transformers for Cora, Citeseer, Pubmed, and Amazon Computer, 4-layered transformers for Amazon Photo and MS CS, and 5-layered transformers for MS Physic, considering each graph size. For all experiments to examine the benchmark effectiveness of our model in Section 5.4, we sample s = 5 neighbors per node. For graph statistics shown in Section 5.3, we sample s = 20 neighbors per node. |
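As an illustration of the dataset-split and neighbor-sampling settings quoted in the table above (50%/10%/40% train/validation/test split; s = 5 neighbors per node for benchmarking, s = 20 for graph statistics), the sketch below shows one plausible way to reproduce them. This is not the authors' released code; the function names `make_split` and `sample_neighbors` are hypothetical.

```python
# Minimal sketch, assuming a node-level split and uniform neighbor sampling.
import numpy as np

def make_split(num_nodes, train_frac=0.5, val_frac=0.1, seed=0):
    """Randomly assign nodes to train/val/test sets (50%/10%/40% by default)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]

def sample_neighbors(adj_list, node, s=5, seed=0):
    """Sample up to s neighbors of `node` when building computation graphs
    (s = 5 in the benchmark-effectiveness experiments, s = 20 for graph statistics)."""
    rng = np.random.default_rng(seed)
    neighbors = adj_list[node]
    if len(neighbors) <= s:
        return list(neighbors)
    return list(rng.choice(neighbors, size=s, replace=False))
```

The fixed fractions and sampling sizes above come directly from the paper's quoted setup; the random seeding and uniform sampling strategy are assumptions made for the sake of a self-contained example.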