Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Aggregation Mechanism Based Graph Heterogeneous Networks Distillation
Authors: Xiaobin Hong, Mingkai Lin, Xiangkai Ma, Wenzhong Li, Sanglu Lu
IJCAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on 8 standard and 4 large-scale datasets demonstrate that AMEND consistently outperforms state-of-the-art distillation methods. To fully evaluate the proposed method, we conduct extensive experiments on 8 regular graph datasets and 4 large-scale graph datasets to compare with state-of-the-art methods. |
| Researcher Affiliation | Academia | Xiaobin Hong, Mingkai Lin, Xiangkai Ma, Wenzhong Li, Sanglu Lu. State Key Laboratory for Novel Software Technology, Nanjing University. |
| Pseudocode | Yes | Algorithm 1: AMEND Algorithm. Input: graph $G = \{V, E\}$, node feature matrix $X$, and precomputed position encoding $X_{pe}$. Output: optimized parameters of the student MLP $S$, predicted node labels $\hat{Y}$. 1: Model initialization and dataset partitioning. 2: Pretrain the teacher model $T$ with cross-entropy loss. 3: # Student MLP training 4: for Epochs do 5: # Aggregation context preservation 6: $Z_T = T(X, E, X_{pe})$, 7: $Z_S = S(X, X_{pe})$; 8: # Aggregation-enhanced CKA 9: $L_{ACKA} \leftarrow \mathrm{ACKA}(Z_T, Z_S)$ in Eq. 7; 10: # Shared manifold mixup 11: $Z_T^{mix} = \lambda Z_T + (1 - \lambda) Z'_T$; 12: $Z_S^{mix} = \lambda Z_S + (1 - \lambda) Z'_S$; 13: $\hat{Y}_T, \hat{Y}_S \leftarrow g_T(Z_T), g_S(Z_S)$; 14: $\hat{Y}_T^{mix}, \hat{Y}_S^{mix} \leftarrow g_T(Z_T^{mix}), g_S(Z_S^{mix})$; 15: # Logit distillation 16: $L_{logit} = L_{mix} + L_{pred}$ in Eq. 11; 17: # Overall loss computation 18: $L_S = L_{task} + \beta L_{ACKA} + \gamma L_{logit}$ in Eq. 12; 19: Gradient backward and model optimization. 20: end for 21: return $S$, $\hat{Y}$ |
| Open Source Code | No | The paper does not explicitly state that source code is provided or give a link to a repository. Phrases like "we release our code" or similar are not present. |
| Open Datasets | Yes | Datasets. To fully evaluate our proposed method, we use 8 public regular graph benchmarks [Yang et al., 2021], i.e. Cora, Citeseer, Pubmed, Computer, Photo, Corafull, Coauthor-CS, Coauthor-Physics, and 4 large-scale graphs [Hu et al., 2020], i.e., Ogbn-Arxiv, Aminer, Reddit, and Ogbn-Products. |
| Dataset Splits | Yes | For each dataset, we follow the dataset protocol in [Chen et al., 2023], where 6/2/2 of the nodes are used as training/validation/test sets, respectively. For the first two datasets, we randomly selected two non-overlapping 10% nodes as the validation and test sets, respectively, and doubled 1% for the last two datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | In Figure 5, we explore the sensitivity of the hyperparameters β and γ in the overall objective function (Eq. 12) on three citation graphs. β and γ represent the contributions of the ACKA and manifold mixup logit distillation, respectively. The results indicate that the optimal performance is achieved with β = 10 and γ = 0.1. According to the definition of $L_{ACKA}$, its value range is [0, 1]. We monitored the values of each component of the loss function during training and found that, with β = 10 and γ = 0.1, the scales of $L_{ACKA}$ and $L_{logit}$ were comparable to the task loss component $L_{task}$, leading to optimal model convergence. |
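
The quoted split protocol, the mixup step of Algorithm 1, and the weighted objective of Eq. 12 can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation (no code was released). The function names `split_indices`, `mixup`, and `overall_loss` are hypothetical, "6/2/2" is assumed to mean a 60%/20%/20% random split, and the mixup partner is assumed to be a second batch of embeddings such as a shuffled copy. The loss terms are passed in as plain numbers because Eq. 7 and Eq. 11 are not reproduced on this page.

```python
import random


def split_indices(num_nodes, ratios=(0.6, 0.2, 0.2), seed=0):
    """Randomly partition node ids into train/val/test lists.

    Assumption: "6/2/2" in the quoted protocol means 60%/20%/20%.
    """
    ids = list(range(num_nodes))
    random.Random(seed).shuffle(ids)
    n_train = int(ratios[0] * num_nodes)
    n_val = int(ratios[1] * num_nodes)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def mixup(z, z_partner, lam):
    """Row-wise convex combination: lam * z + (1 - lam) * z_partner.

    Assumption: z_partner is a second batch of embeddings with the
    same shape as z (for example, a shuffled copy of z).
    """
    return [
        [lam * a + (1.0 - lam) * b for a, b in zip(row, row_partner)]
        for row, row_partner in zip(z, z_partner)
    ]


def overall_loss(l_task, l_acka, l_logit, beta=10.0, gamma=0.1):
    """Weighted sum L_S = L_task + beta * L_ACKA + gamma * L_logit.

    Defaults follow the reported best setting (beta = 10, gamma = 0.1).
    """
    return l_task + beta * l_acka + gamma * l_logit


if __name__ == "__main__":
    train, val, test = split_indices(10)
    print(len(train), len(val), len(test))
    print(mixup([[1.0, 2.0]], [[3.0, 4.0]], lam=0.5))
    print(overall_loss(l_task=1.0, l_acka=0.05, l_logit=2.0))
```

With β = 10, an ACKA term bounded in [0, 1] is scaled up to a range comparable to a typical cross-entropy value, which matches the quoted observation that the loss components end up on similar scales.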