Graph Invariant Learning with Subgraph Co-mixup for Out-of-Distribution Generalization

Authors: Tianrui Jia, Haoyang Li, Cheng Yang, Tao Tao, Chuan Shi

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on both synthetic and real-world datasets demonstrate that our method significantly outperforms state-of-the-art under various distribution shifts. We conducted extensive experiments on three artificially synthesized datasets and nine real-world datasets to verify the effectiveness of our proposed method for various types of distribution shifts.
Researcher Affiliation | Collaboration | Tianrui Jia (1), Haoyang Li (2), Cheng Yang (1), Tao Tao (3), Chuan Shi (1, *). Affiliations: (1) Beijing University of Posts and Telecommunications; (2) Tsinghua University; (3) China Mobile Information Technology Co. Ltd. Emails: {jiatianrui, yangcheng, shichuan}@bupt.edu.cn, lihy18@mails.tsinghua.edu.cn, taotao@chinamobile.com
Pseudocode | No | The paper describes the proposed method using text and mathematical equations, and provides a figure showing the overall framework, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code available at https://github.com/BUPT-GAMMA/IGM
Open Datasets | Yes | For synthetic datasets, following DIR (Wu et al. 2021), we use the SPMotif dataset to evaluate our method on structure and degree shift. For real-world datasets, we examine degree shift, size shift, and other distribution shifts. For the degree shift, we employ the Graph-SST5 and Graph-Twitter datasets (Chen et al. 2022a; Yuan et al. 2022; Dong et al. 2014; Socher et al. 2013). To evaluate size shift, we utilize the PROTEINS and DD datasets from the TU benchmarks (Morris et al. 2020), adhering to the data split suggested by previous research (Chen et al. 2022a). We also consider DrugOOD (Ji et al. 2022) and the Open Graph Benchmark (OGB) (Hu et al. 2020b) for structural distribution shifts.
Dataset Splits | Yes | To evaluate size shift, we utilize the PROTEINS and DD datasets from the TU benchmarks (Morris et al. 2020), adhering to the data split suggested by previous research (Chen et al. 2022a). The paper also mentions "training and test data", implying standard splits.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | The implementation details are given in the Appendix. We conduct experiments on DrugOOD to examine our model's sensitivity to hyper-parameters. We select three critical parameters of the model: the IRM weight γ, the V-REx weight µ, and the invariant Mixup weight δ. We vary γ, µ, and δ in {0.1, 0.5, 1, 2, 4}.
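The hyper-parameter study quoted above (varying the IRM weight γ, the V-REx weight µ, and the invariant Mixup weight δ over {0.1, 0.5, 1, 2, 4}) can be sketched as a one-at-a-time sensitivity sweep. This is an illustrative reconstruction, not the authors' code: the `train_and_evaluate` stub, the default weight of 1, and the one-at-a-time sweep style are all assumptions.

```python
# Hypothetical sketch of the sensitivity analysis described in the paper:
# each loss weight is varied over the reported grid while the others are
# held at an assumed default of 1. Replace train_and_evaluate with a real
# IGM training run to use this.
VALUES = [0.1, 0.5, 1, 2, 4]                      # grid reported in the paper
DEFAULTS = {"gamma": 1, "mu": 1, "delta": 1}      # assumed defaults

def train_and_evaluate(gamma, mu, delta):
    """Placeholder: train the model with these loss weights, return a test score."""
    return 0.0  # dummy score; a real implementation would train and evaluate

def sensitivity_sweep():
    """Vary one hyper-parameter at a time; return {(name, value): score}."""
    results = {}
    for name in DEFAULTS:
        for value in VALUES:
            params = {**DEFAULTS, name: value}    # override one weight
            results[(name, value)] = train_and_evaluate(**params)
    return results
```

With three parameters and five values each, the sweep runs 15 configurations, far fewer than the 125 a full grid search would require.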