reproducibilityindex.ai

Graph Generative Model for Benchmarking Graph Neural Networks

Authors: Minji Yoon, Yue Wu, John Palowitch, Bryan Perozzi, Russ Salakhutdinov

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on real-world graphs with a diverse set of GNN models demonstrate CGT provides significant improvement over existing generative models in terms of benchmark effectiveness (up to 1.03× higher Spearman correlations, up to 33% lower MSE between original and reproduced GNN accuracies), scalability (up to 35k nodes and 8k node attributes), and privacy guarantees (k-anonymity and differential privacy for node attributes).
Researcher Affiliation	Collaboration	(1) Carnegie Mellon University, (2) Google Research.
Pseudocode	No	The paper describes its methods through text and diagrams (e.g., Figure 2 and Figure 3) but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code	Yes	Our code is publicly available (footnote 1).
Open Datasets	Yes	We evaluate on seven public datasets: three citation networks (Cora, Citeseer, and Pubmed) (Sen et al., 2008), two co-purchase graphs (Amazon Computer and Amazon Photo) (Shchur et al., 2018), and two co-authorship graphs (MS CS and MS Physic) (Shchur et al., 2018).
Dataset Splits	Yes	For GNN training, we split 50%/10%/40% of each dataset into the training/validation/test sets, respectively.
Hardware Specification	Yes	All experiments were conducted on the same p3.2xlarge Amazon EC2 instance. We run CGT on 4 NVIDIA TITAN X GPUs with 12 GB memory size with sampling number 5 and K = 30 for K-anonymity.
Software Dependencies	No	The paper mentions using 'Google's differential privacy libraries' and 'Opacus' for DP K-means and DP-SGD, but it does not provide specific version numbers for these libraries or other software dependencies.
Experiment Setup	Yes	For our Computation Graph Transformer model, we use 3-layered transformers for Cora, Citeseer, Pubmed, and Amazon Computer, 4-layered transformers for Amazon Photo and MS CS, and 5-layered transformers for MS Physic, considering each graph size. For all experiments to examine the benchmark effectiveness of our model in Section 5.4, we sample s = 5 neighbors per node. For graph statistics shown in Section 5.3, we sample s = 20 neighbors per node.
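
The Dataset Splits and Experiment Setup rows above can be captured in a small configuration sketch. This is not the authors' code: the names `CGT_LAYERS`, `SAMPLES_BENCHMARK`, `SAMPLES_STATISTICS`, and `split_nodes` are hypothetical, and the sketch assumes the 50%/10%/40% split is a uniform random partition of node ids with a fixed seed, which the quoted text does not state.

```python
import random

# Transformer depth per dataset, as quoted in the Experiment Setup row.
CGT_LAYERS = {
    "Cora": 3,
    "Citeseer": 3,
    "Pubmed": 3,
    "Amazon Computer": 3,
    "Amazon Photo": 4,
    "MS CS": 4,
    "MS Physic": 5,
}

# Neighbors sampled per node (s), as quoted in the Experiment Setup row.
SAMPLES_BENCHMARK = 5    # benchmark-effectiveness experiments (Section 5.4)
SAMPLES_STATISTICS = 20  # graph statistics (Section 5.3)


def split_nodes(num_nodes, seed=0, train_frac=0.5, val_frac=0.1):
    """Partition node ids into train/val/test lists (default 50%/10%/40%).

    Assumption: a uniform random split; the seed is illustrative only.
    """
    ids = list(range(num_nodes))
    random.Random(seed).shuffle(ids)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Usage with a hypothetical graph of 1000 nodes.
train, val, test = split_nodes(1000)
print(len(train), len(val), len(test))  # 500 100 400
```

Because the test set takes whatever remains after the training and validation slices, every node lands in exactly one split even when the fractions do not divide the node count evenly.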