Graph Distillation with Eigenbasis Matching

Authors: Yang Liu, Deyu Bo, Chuan Shi

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that GDEM outperforms state-of-the-art GD methods with powerful cross-architecture generalization ability and significant distillation efficiency. Our code is available at https://github.com/liuyang-tian/GDEM. ... 6. Experiments: In this section, we conduct experiments on a variety of graph datasets to validate the effectiveness, generalization, and efficiency of the proposed GDEM.
Researcher Affiliation | Academia | Department of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China.
Pseudocode | Yes | Algorithm 1: GDEM for Graph Distillation
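For orientation, the sketch below shows what an eigenbasis-matching distillation loop can look like in PyTorch. The shapes, the orthonormality regularizer, and the trade-off weight are assumptions made for illustration; they are not the authors' implementation (see Algorithm 1 in the paper and the official repository for that).

    import torch

    def distill_eigenbasis(U_real, X_real, n_syn, epochs=500, lr_feat=0.01, lr_vec=0.01):
        """Learn synthetic features and eigenvectors by matching how node
        features project onto each graph's eigenbasis (illustrative only)."""
        K1 = U_real.shape[1]  # number of leading eigenvectors of the real graph
        d = X_real.shape[1]   # feature dimension
        X_syn = torch.randn(n_syn, d, requires_grad=True)   # synthetic node features
        U_syn = torch.randn(n_syn, K1, requires_grad=True)  # synthetic eigenvectors
        opt = torch.optim.Adam([
            {"params": [X_syn], "lr": lr_feat},
            {"params": [U_syn], "lr": lr_vec},
        ])
        for _ in range(epochs):
            # Match the feature projections onto the two eigenbases.
            loss_match = ((U_real.T @ X_real) - (U_syn.T @ X_syn)).pow(2).sum()
            # Assumed regularizer: keep synthetic eigenvectors near-orthonormal.
            loss_orth = (U_syn.T @ U_syn - torch.eye(K1)).pow(2).sum()
            loss = loss_match + 0.01 * loss_orth  # assumed trade-off weight
            opt.zero_grad()
            loss.backward()
            opt.step()
        return X_syn.detach(), U_syn.detach()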
Open Source Code | Yes | Our code is available at https://github.com/liuyang-tian/GDEM.
Open Datasets | Yes | Datasets. To evaluate the effectiveness of our GDEM, we select seven representative graph datasets, including five homophilic graphs, i.e., Citeseer, Pubmed (Kipf & Welling, 2017), Ogbn-arxiv (Hu et al., 2020), Flickr (Zeng et al., 2020), and Reddit (Hamilton et al., 2017), and two heterophilic graphs, i.e., Squirrel (Rozemberczki et al., 2021) and Gamers (Lim et al., 2021). ... Resources. The addresses and licenses of all datasets are as follows:
- Citeseer: https://github.com/kimiyoung/planetoid (MIT License)
- Pubmed: https://github.com/kimiyoung/planetoid (MIT License)
- Ogbn-arxiv: https://github.com/snap-stanford/ogb (MIT License)
- Flickr: https://github.com/GraphSAINT/GraphSAINT (MIT License)
- Reddit: https://github.com/williamleif/GraphSAGE (MIT License)
- Squirrel: https://github.com/benedekrozemberczki/MUSAE (GPL-3.0 License)
- Gamers: https://github.com/benedekrozemberczki/datasets (MIT License)
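The Planetoid data linked above (Citeseer, Pubmed) is also packaged by common graph libraries. A minimal loading sketch, assuming PyTorch Geometric (an assumption; the paper does not state which loader it uses):

    from torch_geometric.datasets import Planetoid

    # Downloads the same Planetoid data (Citeseer) as the linked repository.
    dataset = Planetoid(root="data/Planetoid", name="CiteSeer")
    data = dataset[0]  # a single graph with train/val/test node masks
    print(data.num_nodes, data.num_edges, dataset.num_classes)
    print(int(data.train_mask.sum()), "training nodes")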
Dataset Splits | Yes | Evaluation Protocol. To fairly evaluate the quality of synthetic graphs, we perform the following two steps for all methods: (1) Distillation step, where we apply the distillation methods to the training set of the real graphs. (2) Evaluation step, where we train GNNs on the synthetic graph from scratch and then evaluate their performance on the test set of the real graphs. ... A.5. Statistics of Datasets ... Training/Validation/Test
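A minimal sketch of this two-step protocol, with `distill_fn`, `make_gnn`, and the graph/model interfaces as hypothetical placeholders rather than the paper's actual API:

    def evaluate_synthetic_quality(real_graph, distill_fn, make_gnn, n_runs=10):
        """Two-step protocol: distill from the real training set, then train a
        GNN from scratch on the synthetic graph and test on the real graph."""
        accs = []
        for _ in range(n_runs):
            # (1) Distillation step: condense the training portion of the real graph.
            syn_graph = distill_fn(real_graph)
            # (2) Evaluation step: train from scratch on the synthetic graph,
            # then measure accuracy on the *real* test split.
            gnn = make_gnn()
            gnn.fit(syn_graph)                          # hypothetical interface
            accs.append(gnn.score(real_graph, "test"))  # hypothetical interface
        return sum(accs) / len(accs)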
Hardware Specification | Yes | General Settings (Appendix A) ... CPU information: Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz; GPU information: NVIDIA A800 80GB PCIe
Software Dependencies | No | The paper reports the operating system ("Linux version: 5.15.0-91-generic", "Operating system: Ubuntu 22.04.3 LTS") but does not specify software dependencies such as the programming language (e.g., Python), machine learning frameworks (e.g., PyTorch, TensorFlow), or other libraries with their version numbers.
Experiment Setup | Yes | Settings and Hyperparameters. To eliminate randomness, in the distillation step we run the distillation methods 10 times, yielding 10 synthetic graphs. Moreover, we set K1 + K2 = N. To reduce tuning complexity, we treat rk ∈ {0.8, 0.85, 0.9, 0.95, 1.0} as a hyperparameter and set K1 = rk·N, K2 = (1 - rk)·N for eigenbasis matching. In the evaluation step, spatial GNNs have two aggregation layers and the polynomial order of spectral GNNs is set to 10. For more details, see Appendix A.8. ... A.8. Hyperparameters ... Table 11. Hyperparameters of GDEM, with columns: Dataset, Ratio, epochs, K1, K2, τ1, τ2, α, β, γ, lr_feat, lr_eigenvecs.
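The K1/K2 split follows directly from the quoted constraint K1 + K2 = N. A one-function illustration (the rounding choice is an assumption):

    def split_eigenbasis(N, rk=0.9):
        """K1 + K2 = N, with K1 = rk * N low-frequency eigenvectors and
        K2 = (1 - rk) * N high-frequency ones; rk is tuned over
        {0.8, 0.85, 0.9, 0.95, 1.0} per the paper."""
        K1 = round(rk * N)  # rounding choice is an assumption
        K2 = N - K1
        return K1, K2

    print(split_eigenbasis(1000, rk=0.9))  # -> (900, 100)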