Boosting Graph Neural Networks via Adaptive Knowledge Distillation

Authors: Zhichun Guo, Chunhui Zhang, Yujie Fan, Yijun Tian, Chuxu Zhang, Nitesh V. Chawla

AAAI 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments have demonstrated the effectiveness of BGNN. In particular, we achieve up to 3.05% improvement for node classification and 6.35% improvement for graph classification over vanilla GNNs. We conduct extensive experimental studies on both tasks, and the results demonstrate the superior performance of BGNN compared with a set of baseline methods." |
| Researcher Affiliation | Academia | University of Notre Dame, Notre Dame, IN 46556; Brandeis University, Waltham, MA 02453; Case Western Reserve University, Cleveland, OH 44106 |
| Pseudocode | No | The paper describes the model and its components in prose and mathematical formulas but includes no pseudocode or algorithm blocks. |
| Open Source Code | No | The paper makes no explicit statement about releasing source code and provides no link to a code repository for the described methodology. |
| Open Datasets | Yes | "We use seven datasets to conduct graph classification and node classification experiments. We follow the data split as in original papers (Sen et al. 2008; Namata et al. 2012) for Cora, Citeseer and Pubmed while the remaining datasets are randomly split using an empirical ratio." Further details are given in Section B of the Appendix. |
| Dataset Splits | Yes | "We follow the data split as in original papers (Sen et al. 2008; Namata et al. 2012) for Cora, Citeseer and Pubmed while the remaining datasets are randomly split using an empirical ratio." Further details are given in Section B of the Appendix; a split sketch appears below the table. |
| Hardware Specification | No | The paper states that "implementation and experiment details are shown in Section C of Appendix" but gives no specific hardware details in the main text. |
| Software Dependencies | No | The paper states that "implementation and experiment details are shown in Section C of Appendix" but lists no software dependencies with version numbers in the main text. |
| Experiment Setup | Yes | "For all the GNN backbones, we use two-layer models. For the single teacher setting, we use one GNN as the teacher and a different GNN as the student. For the multiple teacher setting, we permute the three GNNs in six orders, where the first two serve as teachers and the third one is the student. During the training, we clamp the adaptive temperature in the range from 1 to 4, which is a commonly used temperature range in KD." Sketches of the distillation loss and the teacher permutations appear below the table. |
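
To make the Dataset Splits row concrete, the sketch below loads Cora, Citeseer, and Pubmed with the standard public splits (Sen et al. 2008; Namata et al. 2012) via PyTorch Geometric, and then defines a random split helper for the remaining datasets. The library choice, the `random_split` helper, and the 80/10/10 ratio are assumptions for illustration; the paper only says the other datasets are "randomly split using an empirical ratio", with details deferred to its Appendix.

```python
import torch
from torch_geometric.datasets import Planetoid

# Standard public splits (Sen et al. 2008; Namata et al. 2012) for the three
# citation datasets, as shipped with PyTorch Geometric.
for name in ["Cora", "CiteSeer", "PubMed"]:
    dataset = Planetoid(root=f"data/{name}", name=name, split="public")
    data = dataset[0]
    print(name, int(data.train_mask.sum()), int(data.val_mask.sum()),
          int(data.test_mask.sum()))


def random_split(num_items, train_frac=0.8, val_frac=0.1, seed=0):
    """Hypothetical random split for the remaining datasets.

    The paper only states that these datasets are "randomly split using an
    empirical ratio"; the 80/10/10 ratio here is an assumption, not the
    paper's setting.
    """
    gen = torch.Generator().manual_seed(seed)
    perm = torch.randperm(num_items, generator=gen)
    n_train = int(train_frac * num_items)
    n_val = int(val_frac * num_items)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]
```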
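The adaptive temperature quoted in the Experiment Setup row also lends itself to a short sketch. The code below is a minimal, hypothetical rendering: it assumes a learnable per-node temperature head over node embeddings and the standard Hinton-style KD objective, with only the clamping range [1, 4] taken from the paper. BGNN's actual temperature module may be parameterized differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveTemperatureKD(nn.Module):
    """Minimal sketch of knowledge distillation with an adaptive temperature.

    The per-node temperature head is a hypothetical parameterization; only
    the clamping range [1, 4] comes from the paper.
    """

    def __init__(self, hidden_dim: int):
        super().__init__()
        # Hypothetical temperature head: one scalar temperature per node.
        self.temp_head = nn.Linear(hidden_dim, 1)

    def forward(self, student_logits, teacher_logits, node_emb):
        # Per-node temperature, clamped to the paper's stated range [1, 4].
        temp = self.temp_head(node_emb).clamp(1.0, 4.0)           # [N, 1]
        # Teacher soft targets and student log-probabilities at temperature T.
        p_teacher = F.softmax(teacher_logits / temp, dim=-1)      # [N, C]
        log_p_student = F.log_softmax(student_logits / temp, dim=-1)
        # Per-node KL divergence, rescaled by T^2 as in standard KD.
        kl = F.kl_div(log_p_student, p_teacher, reduction="none").sum(-1)
        return (kl * temp.squeeze(-1) ** 2).mean()
```

In the single-teacher setting this term would typically be added to the ordinary cross-entropy loss on labeled nodes; the weighting between the two, like the rest of the training configuration, is deferred to Section C of the paper's Appendix.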
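Finally, the multiple-teacher protocol (three backbones permuted in six orders, the first two serving as teachers and the third as student) is straightforward to enumerate. The backbone names below are placeholders, since this excerpt does not name the three GNNs used.

```python
from itertools import permutations

# Placeholder backbone names; the excerpt does not name the three GNNs.
backbones = ["GNN_A", "GNN_B", "GNN_C"]

# Six orderings: the first two act as teachers, the third is the student.
for teacher_1, teacher_2, student in permutations(backbones):
    print(f"teachers: {teacher_1}, {teacher_2} -> student: {student}")
```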