Graph Condensation for Graph Neural Networks

Authors: Wei Jin, Lingxiao Zhao, Shichang Zhang, Yozen Liu, Jiliang Tang, Neil Shah

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments have demonstrated the effectiveness of the proposed framework in condensing different graph datasets into informative smaller graphs.
Researcher Affiliation | Collaboration | Wei Jin (Michigan State University, jinwei2@msu.edu); Lingxiao Zhao (Carnegie Mellon University, lingxiao@cmu.edu); Shichang Zhang (UCLA, shichang@cs.ucla.edu); Yozen Liu (Snap Inc., yliu2@snap.com); Jiliang Tang (Michigan State University, tangjili@msu.edu); Neil Shah (Snap Inc., nshah@snap.com)
Pseudocode | Yes | The detailed algorithm can be found in Algorithm 1 in Appendix B.
Open Source Code | Yes | Code is released at https://github.com/ChandlerBang/GCond.
Open Datasets | Yes | We evaluate the condensation performance of the proposed framework on three transductive datasets, i.e., Cora, Citeseer (Kipf & Welling, 2017) and Ogbn-arxiv (Hu et al., 2020), and two inductive datasets, i.e., Flickr (Zeng et al., 2020) and Reddit (Hamilton et al., 2017). Since all the datasets have public splits, we download them from PyTorch Geometric (Fey & Lenssen, 2019) and use those splits throughout the experiments.
Dataset Splits | Yes | Dataset statistics are shown in Table 6. The first three are transductive datasets and the last two are inductive datasets. ... Cora: 2,708 nodes, training/validation/test split 140/500/1,000 ... Citeseer: 3,327 nodes, training/validation/test split 120/500/1,000 ...
Hardware Specification | Yes | The running time of 50 epochs on a single A100-SXM4 GPU is reported in Table 10.
Software Dependencies | No | The paper mentions using 'PyTorch Geometric' for datasets but does not specify its version or any other software dependencies with their version numbers.
Experiment Setup | Yes | Hyperparameter settings. As our goal is to generate highly informative synthetic graphs which can benefit GNNs, we choose one representative model, GCN (Kipf & Welling, 2017), for performance evaluation. For the GNN used in condensation, i.e., GNN_θ(·) in Eq. (8), we adopt SGC (Wu et al., 2019a), which decouples the propagation and transformation processes but still shares similar graph filtering behavior with GCN. Unless otherwise stated, we use 2-layer models with 256 hidden units. The weight decay and dropout for the models are set to 0 in the condensation process. More details for hyperparameter tuning can be found in Appendix A.
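For the Open Datasets and Dataset Splits rows, a minimal loading sketch is shown below. It assumes the standard PyTorch Geometric and OGB loaders with their public splits; the root directories are placeholders, and the authors' exact preprocessing may differ.

```python
from torch_geometric.datasets import Planetoid, Flickr, Reddit
from ogb.nodeproppred import PygNodePropPredDataset

# Transductive datasets; public splits are exposed as boolean node masks.
cora = Planetoid(root="data/Planetoid", name="Cora")[0]
citeseer = Planetoid(root="data/Planetoid", name="CiteSeer")[0]

# ogbn-arxiv ships its own index-based split via get_idx_split().
arxiv_dataset = PygNodePropPredDataset(name="ogbn-arxiv", root="data/OGB")
arxiv = arxiv_dataset[0]
split_idx = arxiv_dataset.get_idx_split()
train_idx, valid_idx, test_idx = split_idx["train"], split_idx["valid"], split_idx["test"]

# Inductive datasets, also shipped with public train/val/test masks.
flickr = Flickr(root="data/Flickr")[0]
reddit = Reddit(root="data/Reddit")[0]

# Example: Cora's public split sizes quoted above (140/500/1000 train/val/test).
print(cora.train_mask.sum().item(), cora.val_mask.sum().item(), cora.test_mask.sum().item())
```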
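For the Experiment Setup row, the sketch below shows a generic 2-layer, 256-hidden-unit GCN of the kind used for performance evaluation, with dropout and weight decay set to 0 as in the condensation process. This is a plain PyTorch Geometric model written for illustration, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    """2-layer GCN with 256 hidden units, mirroring the evaluation setting above."""
    def __init__(self, in_dim, num_classes, hidden=256, dropout=0.0):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, num_classes)
        self.dropout = dropout  # set to 0 during condensation

    def forward(self, x, edge_index, edge_weight=None):
        x = F.relu(self.conv1(x, edge_index, edge_weight))
        x = F.dropout(x, p=self.dropout, training=self.training)
        return self.conv2(x, edge_index, edge_weight)

# Optimizer with weight decay 0, matching the condensation setting.
# model = GCN(in_dim=num_features, num_classes=num_classes)
# opt = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=0.0)
```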
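For the Pseudocode row, the paper's Algorithm 1 (Appendix B) is not reproduced here; the sketch below only illustrates the general gradient-matching idea behind graph condensation, i.e., updating a small synthetic graph so that the gradients of GNN_θ's loss on it match the gradients computed on the original graph. The SGC-style dense-propagation model, the cosine-based matching distance, and treating the synthetic adjacency as a free variable (rather than parameterizing it from synthetic features as the paper does) are simplifying assumptions for illustration.

```python
import torch
import torch.nn.functional as F

class DenseSGC(torch.nn.Module):
    """SGC-style model on a dense adjacency: propagate features k times, then a linear layer."""
    def __init__(self, in_dim, num_classes, k=2):
        super().__init__()
        self.lin = torch.nn.Linear(in_dim, num_classes, bias=False)
        self.k = k

    def forward(self, x, adj):
        # adj: dense normalized adjacency (n x n); x: node features (n x d)
        for _ in range(self.k):
            x = adj @ x
        return self.lin(x)

def grad_match_distance(grads_syn, grads_real):
    # Illustrative cosine-based distance between gradient lists;
    # the paper defines its own matching distance.
    d = 0.0
    for gs, gr in zip(grads_syn, grads_real):
        d = d + (1 - F.cosine_similarity(gs.flatten(), gr.flatten(), dim=0))
    return d

def condensation_step(model, x_real, adj_real, y_real, x_syn, adj_syn, y_syn, opt_syn):
    """One illustrative outer step: update the synthetic graph (x_syn, adj_syn, optimized
    by opt_syn) so that gradients on it match gradients on the original graph."""
    params = list(model.parameters())

    loss_real = F.cross_entropy(model(x_real, adj_real), y_real)
    grads_real = [g.detach() for g in torch.autograd.grad(loss_real, params)]

    loss_syn = F.cross_entropy(model(x_syn, adj_syn), y_syn)
    grads_syn = torch.autograd.grad(loss_syn, params, create_graph=True)

    opt_syn.zero_grad()
    grad_match_distance(grads_syn, grads_real).backward()
    opt_syn.step()
```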