Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

AGMixup: Adaptive Graph Mixup for Semi-supervised Node Classification

Authors: Weigang Lu, Ziyu Guan, Wei Zhao, Yaming Yang, Yibing Zhan, Yiheng Lu, Dapeng Tao

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments across seven datasets on semi-supervised node classification benchmarks demonstrate AGMixup's superiority over state-of-the-art graph mixup methods. The semi-supervised node classification accuracy results presented in Table 1 are obtained from ten different runs, ensuring reliable and consistent measurements. We provide comprehensive empirical analysis to understand the behavior of AGMixup.
Researcher Affiliation	Collaboration	1School of Computer Science and Technology, Xidian University, Xi'an, China 2JD Explore Academy, Xidian University, Beijing, China 3School of Information Science and Engineering, Yunnan University, Kunming, China {wglu@.stu., zyguan@, ywzhao@mail., yym@,}xidan.edu.cn, EMAIL, EMAIL
Pseudocode	No	The paper describes the methodology, including subgraph-centric mixup, contextual similarity-aware λ initialization, and uncertainty-aware λ adjustment using equations, but it does not present these steps in a structured pseudocode or algorithm block.
Open Source Code	Yes	We compare our AGMixup2 with three state-of-the-art graph mixup methods targeting at the semi-supervised node classification problem... 2https://github.com/WeigangLu/AGMixup
Open Datasets	Yes	As for the datasets, we choose seven graphs, i.e., Cora, Citeseer, Pubmed (Yang, Cohen, and Salakhutdinov 2016), Coauthor CS, and Coauthor Physics (Shchur et al. 2018), and two large-scale graphs, i.e., ogbn-arxiv and ogbn-products (Hu et al. 2020).
Dataset Splits	No	The paper mentions using common datasets for semi-supervised node classification, but it does not explicitly state the specific train/test/validation split percentages, sample counts, or the exact splitting methodology used for these datasets within the main text.
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies	No	The paper does not provide specific version numbers for any software dependencies or libraries used in the implementation of AGMixup or the experiments.
Experiment Setup	Yes	In this section, we explore the impact of key hyperparameters on the performance of AGMixup, i.e., subgraph size (controlled by r) and sensitivity to similarity and uncertainty (controlled by γ and β, respectively). We conduct experiments on Cora and Pubmed using GCN as the backbone model. For the Cora dataset, AGMixup exhibits optimal performance at r = 3... For the Pubmed dataset, AGMixup shows improved performance as r is increased up to a moderate level (r = 5)... we recommend setting γ and β at the range of 0.5 to 2.
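
The hyperparameter study quoted in the Experiment Setup row can be pictured as a small search grid over r, γ, and β. The Python sketch below is illustrative only and is not taken from the paper or its repository: the function name build_grid and the specific candidate values are assumptions. Only r = 3 for Cora, r = 5 for Pubmed, and the 0.5 to 2 range for γ and β come from the quoted text.

```python
from itertools import product

# Hypothetical candidate values. Only r = 3 (Cora), r = 5 (Pubmed), and the
# 0.5 to 2 range for gamma and beta are stated in the quoted text.
R_VALUES = [1, 2, 3, 4, 5]           # assumed subgraph-size candidates
GAMMA_VALUES = [0.5, 1.0, 1.5, 2.0]  # assumed points inside the recommended range
BETA_VALUES = [0.5, 1.0, 1.5, 2.0]   # assumed points inside the recommended range


def build_grid(r_values, gamma_values, beta_values):
    """Return every (r, gamma, beta) combination as a list of dicts."""
    return [
        {"r": r, "gamma": g, "beta": b}
        for r, g, b in product(r_values, gamma_values, beta_values)
    ]


grid = build_grid(R_VALUES, GAMMA_VALUES, BETA_VALUES)
print(len(grid))  # number of configurations to evaluate
```

Each dictionary in grid would then be passed to whatever training routine is in use, ideally repeated over several random seeds, since the paper reports results from ten runs.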