Graph Generative Model for Benchmarking Graph Neural Networks

Authors: Minji Yoon, Yue Wu, John Palowitch, Bryan Perozzi, Russ Salakhutdinov

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on real-world graphs with a diverse set of GNN models demonstrate CGT provides significant improvement over existing generative models in terms of benchmark effectiveness (up to 1.03 higher Spearman correlations, up to 33% lower MSE between original and reproduced GNN accuracies), scalability (up to 35k nodes and 8k node attributes), and privacy guarantees (k-anonymity and differential privacy for node attributes).
Researcher Affiliation | Collaboration | ¹Carnegie Mellon University, ²Google Research.
Pseudocode | No | The paper describes its methods through text and diagrams (e.g., Figure 2 and Figure 3) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Our code is publicly available¹.
Open Datasets | Yes | We evaluate on seven public datasets: three citation networks (Cora, Citeseer, and Pubmed) (Sen et al., 2008), two co-purchase graphs (Amazon Computer and Amazon Photo) (Shchur et al., 2018), and two co-authorship graphs (MS CS and MS Physics) (Shchur et al., 2018).
Dataset Splits | Yes | For GNN training, we split 50%/10%/40% of each dataset into the training/validation/test sets, respectively.
Hardware Specification | Yes | All experiments were conducted on the same p3.2xlarge Amazon EC2 instance. We run CGT on 4 NVIDIA TITAN X GPUs, each with 12 GB of memory, with sampling number 5 and K = 30 for k-anonymity.
Software Dependencies | No | The paper mentions using 'Google's differential privacy libraries' and 'Opacus' for DP K-means and DP-SGD, but it does not provide specific version numbers for these libraries or other software dependencies.
Experiment Setup | Yes | For our Computation Graph Transformer model, we use 3-layered transformers for Cora, Citeseer, Pubmed, and Amazon Computer, 4-layered transformers for Amazon Photo and MS CS, and 5-layered transformers for MS Physics, considering each graph size. For all experiments to examine the benchmark effectiveness of our model in Section 5.4, we sample s = 5 neighbors per node. For graph statistics shown in Section 5.3, we sample s = 20 neighbors per node.
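
The hedged sketches below make some of the quoted details concrete. First, the benchmark-effectiveness numbers quoted under Research Type compare GNN accuracies measured on the original graph against accuracies reproduced on the generated graph, via Spearman correlation and MSE. A minimal sketch of those two metrics, with made-up accuracy values (not the paper's results), might look like:

```python
# Hedged sketch: the two benchmark-effectiveness metrics quoted above.
# Accuracy values are fabricated placeholders, not results from the paper.
import numpy as np
from scipy.stats import spearmanr

# Hypothetical test accuracies of the same set of GNN models, measured on
# the original graph and on a graph sampled from the generative model.
acc_original = np.array([0.81, 0.79, 0.84, 0.77, 0.80])
acc_reproduced = np.array([0.80, 0.78, 0.83, 0.75, 0.79])

# Spearman rank correlation: do the two benchmarks rank models the same way?
rho, _ = spearmanr(acc_original, acc_reproduced)

# MSE: how close are the absolute accuracy values?
mse = np.mean((acc_original - acc_reproduced) ** 2)

print(f"Spearman rho = {rho:.3f}, MSE = {mse:.5f}")
```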
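
The seven datasets quoted under Open Datasets are all publicly available through standard loaders. The paper does not state which loader it used; this sketch assumes PyTorch Geometric's Planetoid, Amazon, and Coauthor classes, with arbitrary root paths:

```python
# Hedged sketch: loading the seven public datasets with PyTorch Geometric.
# The loader choice and root paths are assumptions, not the paper's code.
from torch_geometric.datasets import Planetoid, Amazon, Coauthor

datasets = {
    # Citation networks (Sen et al., 2008)
    "Cora": Planetoid(root="data/Planetoid", name="Cora"),
    "Citeseer": Planetoid(root="data/Planetoid", name="Citeseer"),
    "Pubmed": Planetoid(root="data/Planetoid", name="Pubmed"),
    # Co-purchase graphs (Shchur et al., 2018)
    "Amazon Computer": Amazon(root="data/Amazon", name="Computers"),
    "Amazon Photo": Amazon(root="data/Amazon", name="Photo"),
    # Co-authorship graphs (Shchur et al., 2018)
    "MS CS": Coauthor(root="data/Coauthor", name="CS"),
    "MS Physics": Coauthor(root="data/Coauthor", name="Physics"),
}

for name, ds in datasets.items():
    data = ds[0]
    print(name, data.num_nodes, data.num_edges)
```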
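
The 50%/10%/40% split quoted under Dataset Splits can be reproduced with a simple shuffled node split. The paper does not give its splitting code; the function name and seed below are assumptions for illustration:

```python
# Hedged sketch: a random 50%/10%/40% train/val/test node split, assuming
# a uniform shuffle with a fixed seed (the paper's exact procedure is
# unspecified).
import numpy as np

def split_nodes(num_nodes, train_frac=0.5, val_frac=0.1, seed=0):
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    train = perm[:n_train]
    val = perm[n_train:n_train + n_val]
    test = perm[n_train + n_val:]  # remaining ~40%
    return train, val, test

train_idx, val_idx, test_idx = split_nodes(2708)  # e.g. Cora's 2,708 nodes
print(len(train_idx), len(val_idx), len(test_idx))
```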
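
For the DP-SGD dependency noted under Software Dependencies, the quoted Opacus library wraps a model, optimizer, and data loader with a PrivacyEngine. Since the paper gives no versions, this sketch assumes the Opacus >= 1.0 API; the tiny model, data, and privacy parameters are placeholders, not the paper's settings:

```python
# Hedged sketch: attaching Opacus DP-SGD to a training loop. The model,
# data, noise_multiplier, and max_grad_norm are illustrative assumptions.
import torch
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

model = torch.nn.Linear(16, 2)  # placeholder model, not CGT
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
data = TensorDataset(torch.randn(128, 16), torch.randint(0, 2, (128,)))
loader = DataLoader(data, batch_size=32)

privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.0,  # assumed value; the paper's setting is not quoted
    max_grad_norm=1.0,     # per-sample gradient clipping bound
)

criterion = torch.nn.CrossEntropyLoss()
for x, y in loader:
    optimizer.zero_grad()
    criterion(model(x), y).backward()
    optimizer.step()  # noisy, clipped DP-SGD update
```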
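
Finally, the per-dataset transformer depths and neighbor-sampling fan-outs quoted under Experiment Setup amount to a small configuration table; the dictionary and key names below are illustrative assumptions, with values taken directly from the quote:

```python
# Hedged sketch: the quoted experiment configuration as plain dictionaries.
# Names (CGT_LAYERS, SAMPLE_SIZE) are assumptions; values come from the quote.

# Dataset -> number of transformer layers in the Computation Graph Transformer
CGT_LAYERS = {
    "Cora": 3, "Citeseer": 3, "Pubmed": 3, "Amazon Computer": 3,
    "Amazon Photo": 4, "MS CS": 4,
    "MS Physics": 5,
}

# Neighbors sampled per node (s) when building computation graphs
SAMPLE_SIZE = {
    "benchmark_effectiveness": 5,   # Section 5.4 experiments
    "graph_statistics": 20,         # Section 5.3 experiments
}
```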