Graph Distillation with Eigenbasis Matching
Authors: Yang Liu, Deyu Bo, Chuan Shi
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that GDEM outperforms state-of-the-art GD methods with powerful cross-architecture generalization ability and significant distillation efficiency. Our code is available at https://github.com/liuyang-tian/GDEM. From Section 6 (Experiments): In this section, we conduct experiments on a variety of graph datasets to validate the effectiveness, generalization, and efficiency of the proposed GDEM. |
| Researcher Affiliation | Academia | Department of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China. |
| Pseudocode | Yes | Algorithm 1 GDEM for Graph Distillation |
| Open Source Code | Yes | Our code is available at https://github.com/liuyang-tian/GDEM. |
| Open Datasets | Yes | Datasets. To evaluate the effectiveness of our GDEM, we select seven representative graph datasets, including five homophilic graphs, i.e., Citeseer, Pubmed (Kipf & Welling, 2017), Ogbn-arxiv (Hu et al., 2020), Flickr (Zeng et al., 2020), and Reddit (Hamilton et al., 2017), and two heterophilic graphs, i.e., Squirrel (Rozemberczki et al., 2021) and Gamers (Lim et al., 2021). ... Resources. The addresses and licenses of all datasets are as follows: Citeseer: https://github.com/kimiyoung/planetoid (MIT License); Pubmed: https://github.com/kimiyoung/planetoid (MIT License); Ogbn-arxiv: https://github.com/snap-stanford/ogb (MIT License); Flickr: https://github.com/GraphSAINT/GraphSAINT (MIT License); Reddit: https://github.com/williamleif/GraphSAGE (MIT License); Squirrel: https://github.com/benedekrozemberczki/MUSAE (GPL-3.0 license); Gamers: https://github.com/benedekrozemberczki/datasets (MIT License) |
| Dataset Splits | Yes | Evaluation Protocol. To fairly evaluate the quality of synthetic graphs, we perform the following two steps for all methods: (1) Distillation step, where we apply the distillation methods on the training set of the real graphs. (2) Evaluation step, where we train GNNs on the synthetic graph from scratch and then evaluate their performance on the test set of the real graphs. ... Appendix A.5 (Statistics of Datasets) reports the Training/Validation/Test splits. A sketch of this two-step protocol appears after the table. |
| Hardware Specification | Yes | Appendix (General Settings): CPU information: Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz; GPU information: NVIDIA A800 80GB PCIe |
| Software Dependencies | No | The paper mentions the operating system version: "Linux version: 5.15.0-91-generic" and "Operating system: Ubuntu 22.04.3 LTS". However, it does not specify software dependencies like programming languages (e.g., Python), machine learning frameworks (e.g., PyTorch, TensorFlow), or other libraries with their specific version numbers. |
| Experiment Setup | Yes | Settings and Hyperparameters. To eliminate randomness, in the distillation step we run the distillation methods 10 times and yield 10 synthetic graphs. Moreover, we set K1 + K2 = N. To reduce the tuning complexity, we treat r_k ∈ {0.8, 0.85, 0.9, 0.95, 1.0} as a hyperparameter and set K1 = r_k·N, K2 = (1 − r_k)·N for eigenbasis matching. In the evaluation step, spatial GNNs have two aggregation layers and the polynomial order of spectral GNNs is set to 10. For more details, see Appendix A.8 (Hyperparameters), whose Table 11 (Hyper-parameters of GDEM) lists, per dataset: Ratio, epochs, K1, K2, τ1, τ2, α, β, γ, lr_feat, lr_eigenvecs. A sketch of the K1/K2 split appears after the table. |
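
The two-step evaluation protocol quoted under Dataset Splits maps directly onto a small driver loop. Below is a minimal sketch, not GDEM's actual implementation: the callables `distill_fn`, `train_fn`, and `eval_fn` are hypothetical placeholders for any distillation method, GNN trainer, and test-set evaluator; only the protocol structure (distill on the real training set, train from scratch on the synthetic graph, test on the real test set, repeat 10 times) comes from the paper.

```python
from typing import Any, Callable
import statistics

def run_protocol(
    distill_fn: Callable[[Any, Any, int], Any],  # (real_graph, train_mask, seed) -> synthetic graph
    train_fn: Callable[[Any, int], Any],         # (synthetic_graph, seed) -> trained GNN
    eval_fn: Callable[[Any, Any, Any], float],   # (gnn, real_graph, test_mask) -> test accuracy
    real_graph: Any,
    train_mask: Any,
    test_mask: Any,
    n_runs: int = 10,                            # the paper runs distillation 10 times
) -> tuple[float, float]:
    """Distill on the real training set, train a GNN from scratch on the
    synthetic graph, and evaluate on the real test set; report mean/std."""
    accs = []
    for seed in range(n_runs):
        synthetic = distill_fn(real_graph, train_mask, seed)  # (1) distillation step
        gnn = train_fn(synthetic, seed)                       # (2) evaluation step
        accs.append(eval_fn(gnn, real_graph, test_mask))
    return statistics.mean(accs), statistics.stdev(accs)
```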
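The K1/K2 budget under Experiment Setup is a one-line computation. The following sketch assumes (as the eigenbasis-matching framing suggests, though the quoted excerpt does not state it explicitly) that K1 indexes the low-frequency, smallest-eigenvalue end of a sorted Laplacian eigenbasis and K2 the high-frequency end; `split_eigenbasis` is a hypothetical helper, not part of the GDEM codebase.

```python
import numpy as np

def split_eigenbasis(eigvals: np.ndarray, eigvecs: np.ndarray, r_k: float):
    """Split an eigenbasis so that K1 = r_k * N and K2 = (1 - r_k) * N,
    i.e. K1 + K2 = N, with r_k in {0.8, 0.85, 0.9, 0.95, 1.0} as in the paper.

    eigvals: (N,) eigenvalues sorted in ascending order.
    eigvecs: (N, N) matrix whose i-th column pairs with eigvals[i].
    """
    n = eigvals.shape[0]
    k1 = int(round(r_k * n))    # low-frequency budget, K1 = r_k * N
    k2 = n - k1                 # high-frequency budget, K2 = (1 - r_k) * N
    low = eigvecs[:, :k1]       # K1 smallest-eigenvalue eigenvectors
    high = eigvecs[:, n - k2:]  # K2 largest-eigenvalue eigenvectors (empty if r_k = 1.0)
    return low, high

# Example: N = 100 nodes, r_k = 0.9 gives K1 = 90 and K2 = 10.
vals = np.sort(np.random.rand(100))
vecs = np.linalg.qr(np.random.randn(100, 100))[0]
low, high = split_eigenbasis(vals, vecs, r_k=0.9)
assert low.shape[1] + high.shape[1] == 100
```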