Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

iGraphMix: Input Graph Mixup Method for Node Classification

Authors: Jongwon Jeong, Hoyeop Lee, Hyui Geon Yoon, Beomyoung Lee, Junhee Heo, Geonsoo Kim, Jin Seon Kim

ICLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We mathematically prove that training GNNs with iGraphMix leads to better generalization performance compared to that without augmentation, and our experiments support the theoretical findings. and 6 EXPERIMENTS We compared the iGraphMix with five graph data augmentation methods: (1) None that trains GNNs with the graph which is not applied any augmentation methods; (2) DropEdge (Rong et al., 2020) that trains GNNs with the graph whose edges are randomly removed at each training epoch; (3) DropNode (Feng et al., 2020) that trains GNNs with the graph whose nodes are randomly masked at each training epoch; (4) DropMessage (Fang et al., 2023) that trains GNNs with perturbing propagated messages at each training epoch; (5) M-Mixup (Wang et al., 2021) that trains GNNs by interpolating nodes hidden representations and corresponding labels.
Researcher Affiliation | Industry | Jongwon Jeong¹, Hoyeop Lee², Hyui Geon Yoon², Beomyoung Lee², Junhee Heo², Geonsoo Kim², Jin Seon Kim²; ¹KRAFTON, ²NCSOFT Co.
Pseudocode | Yes | A IMPLEMENTATION DETAILS OF IGRAPHMIX We provide the PyTorch-like style implementation of iGraphMix for node classification in Algorithm 1. Algorithm 1 iGraphMix: Pytorch-like Implementation with Torch Geometric.
Open Source Code | No | Refer to the appendices for further reproducibility details, such as code, hyper-parameters, and so on.
Open Datasets | Yes | We considered five datasets: CiteSeer, CORA, PubMed (Sen et al., 2008), ogbn-arxiv (Hu et al., 2020), and Flickr (McAuley & Leskovec, 2012).
Dataset Splits | Yes | We followed the labeled node per class and the train/test dataset split settings for Table 1 used in Yang et al. (Yang et al., 2016). and Table 3: Datasets statistics for the transductive setting. ... # Valid Nodes
Hardware Specification | Yes | We conducted our experiments on the V-100 with CUDA version 11.3.
Software Dependencies | Yes | Our method is built on Pytorch 1.12.1 (Paszke et al., 2019) and Pytorch Geometric 2.1.0 (Fey & Lenssen, 2019).
Experiment Setup | Yes | For CiteSeer, CORA, and Pubmed, we used Adam Optimizer (Kingma & Ba, 2015) with 0.01 learning rate and 5e-4 weight decaying, dropout with 0.5 probability, and 16 hidden units for GCN. Also, we used Adam Optimizer with a learning rate of 0.005 and weight decaying of 5e-4, dropout of 0.5 probability, 16 hidden units, and 1 head for GAT and GATv2 (Zhao et al., 2021; Verma et al., 2021). We trained the above models by 2000 epochs and reported the test scores when the validation scores were the maximum.
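
The Experiment Setup excerpt above can be read as a small configuration plus a model-selection rule. The following is a minimal sketch in plain Python, not the authors' code: the names `TrainConfig`, `GCN_CONFIG`, `GAT_CONFIG`, and `select_test_score` are hypothetical, the numeric values are the ones quoted in the excerpt, and breaking ties toward the earliest epoch is an assumption the excerpt does not state.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup excerpt (field names are illustrative)."""
    model: str
    learning_rate: float
    weight_decay: float
    dropout: float
    hidden_units: int
    heads: Optional[int] = None  # only given for GAT / GATv2 in the excerpt
    epochs: int = 2000


# Values as quoted for CiteSeer, CORA, and Pubmed.
GCN_CONFIG = TrainConfig("GCN", learning_rate=0.01, weight_decay=5e-4,
                         dropout=0.5, hidden_units=16)
GAT_CONFIG = TrainConfig("GAT", learning_rate=0.005, weight_decay=5e-4,
                         dropout=0.5, hidden_units=16, heads=1)


def select_test_score(val_scores: Sequence[float],
                      test_scores: Sequence[float]) -> Tuple[int, float]:
    """Return (epoch index, test score) at the epoch with the highest validation score.

    Assumption: ties are resolved toward the earliest epoch.
    """
    if not val_scores or len(val_scores) != len(test_scores):
        raise ValueError("score lists must be non-empty and of equal length")
    # max() returns the first maximal element, so ties pick the earliest epoch.
    best_epoch = max(range(len(val_scores)), key=lambda i: val_scores[i])
    return best_epoch, test_scores[best_epoch]
```

In a real training loop, `val_scores` and `test_scores` would be filled once per epoch for `epochs` iterations, and the reported number would be the second element of the returned pair.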