Graph Condensation for Graph Neural Networks
Authors: Wei Jin, Lingxiao Zhao, Shichang Zhang, Yozen Liu, Jiliang Tang, Neil Shah
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments have demonstrated the effectiveness of the proposed framework in condensing different graph datasets into informative smaller graphs. |
| Researcher Affiliation | Collaboration | Wei Jin, Michigan State University, jinwei2@msu.edu; Lingxiao Zhao, Carnegie Mellon University, lingxiao@cmu.edu; Shichang Zhang, UCLA, shichang@cs.ucla.edu; Yozen Liu, Snap Inc., yliu2@snap.com; Jiliang Tang, Michigan State University, tangjili@msu.edu; Neil Shah, Snap Inc., nshah@snap.com |
| Pseudocode | Yes | The detailed algorithm can be found in Algorithm 1 in Appendix B. |
| Open Source Code | Yes | Code is released at https://github.com/ChandlerBang/GCond. |
| Open Datasets | Yes | We evaluate the condensation performance of the proposed framework on three transductive datasets, i.e., Cora, Citeseer (Kipf & Welling, 2017) and Ogbn-arxiv (Hu et al., 2020), and two inductive datasets, i.e., Flickr (Zeng et al., 2020) and Reddit (Hamilton et al., 2017). Since all the datasets have public splits, we download them from PyTorch Geometric (Fey & Lenssen, 2019) and use those splits throughout the experiments. |
| Dataset Splits | Yes | Dataset statistics are shown in Table 6. The first three are transductive datasets and the last two are inductive datasets. ... Cora: 2,708 nodes, Training/Validation/Test 140/500/1000 ... Citeseer: 3,327 nodes, Training/Validation/Test 120/500/1000 ... |
| Hardware Specification | Yes | The running time of 50 epochs on one single A100-SXM4 GPU is reported in Table 10. |
| Software Dependencies | No | The paper mentions using 'PyTorch Geometric' for datasets but does not specify its version or any other software dependencies with their version numbers. |
| Experiment Setup | Yes | Hyperparameter settings. As our goal is to generate highly informative synthetic graphs which can benefit GNNs, we choose one representative model, GCN (Kipf & Welling, 2017), for performance evaluation. For the GNN used in condensation, i.e., the GNN_θ(·) in Eq. (8), we adopt SGC (Wu et al., 2019a), which decouples the propagation and transformation processes but still shares similar graph filtering behavior with GCN. Unless otherwise stated, we use 2-layer models with 256 hidden units. The weight decay and dropout for the models are set to 0 in the condensation process. More details on hyperparameter tuning can be found in Appendix A. |
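
The "Open Datasets" and "Dataset Splits" rows state that all datasets are downloaded with their public splits from PyTorch Geometric. The snippet below is a minimal sketch (not the authors' release code) of how those transductive datasets and splits could be fetched and sanity-checked; the `root` path is an illustrative assumption.

```python
# Sketch: load Cora/Citeseer with their public splits via PyTorch Geometric
# and ogbn-arxiv via OGB, then verify the split sizes quoted in Table 6.
from torch_geometric.datasets import Planetoid
from ogb.nodeproppred import PygNodePropPredDataset

for name in ["Cora", "Citeseer"]:
    dataset = Planetoid(root="data", name=name)  # public split by default
    data = dataset[0]
    print(name, "nodes:", data.num_nodes,
          "train/val/test:",
          int(data.train_mask.sum()),
          int(data.val_mask.sum()),
          int(data.test_mask.sum()))
# Expected (Table 6): Cora 2,708 nodes, 140/500/1000;
#                     Citeseer 3,327 nodes, 120/500/1000.

arxiv = PygNodePropPredDataset(name="ogbn-arxiv", root="data")
split_idx = arxiv.get_idx_split()  # public train/valid/test index split
print("ogbn-arxiv train/val/test:",
      len(split_idx["train"]), len(split_idx["valid"]), len(split_idx["test"]))
```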
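The "Experiment Setup" row specifies a 2-layer GCN with 256 hidden units for evaluation and an SGC-style model during condensation, with weight decay and dropout set to 0 in the condensation process. The sketch below illustrates that configuration under those stated settings; it is not the authors' implementation, and values not quoted above (the evaluation dropout rate, the learning rate, and the Cora feature/class sizes used in the example) are assumptions.

```python
# Sketch of the evaluation and condensation models described in the paper,
# assuming PyTorch Geometric.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, SGConv

class GCN(torch.nn.Module):
    """2-layer GCN with 256 hidden units, used to evaluate condensed graphs."""
    def __init__(self, in_dim, num_classes, hidden=256, dropout=0.5):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, num_classes)
        self.dropout = dropout

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=self.dropout, training=self.training)
        return self.conv2(x, edge_index)

# SGC decouples K-hop propagation from a single linear transformation,
# which is the GNN_θ(·) used during condensation in the paper.
def make_condensation_model(in_dim, num_classes, k=2):
    return SGConv(in_dim, num_classes, K=k)

# During condensation, weight decay (and dropout) are set to 0 per the paper;
# feature/class sizes below correspond to Cora and are for illustration only.
model = make_condensation_model(in_dim=1433, num_classes=7)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=0)
```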