Disentangled Graph Self-supervised Learning for Out-of-Distribution Generalization

Authors: Haoyang Li, Xin Wang, Zeyang Zhang, Haibo Chen, Ziwei Zhang, Wenwu Zhu

ICML 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments on real-world datasets demonstrate the superiority of our model against state-of-the-art baselines under distribution shifts for graph classification tasks. |
| Researcher Affiliation | Academia | Department of Computer Science and Technology, BNRIST, Tsinghua University, Beijing, China. |
| Pseudocode | No | The paper does not contain a pseudocode block or a clearly labeled algorithm. |
| Open Source Code | No | The paper does not include any statement or link indicating the availability of its source code. |
| Open Datasets | Yes | Datasets. We adopt real-world benchmark datasets for the graph classification task, including the datasets from the graph OOD generalization benchmark GOOD (Gui et al., 2022) and the datasets from the Open Graph Benchmark (Hu et al., 2020). (See the loading sketch after the table.) |
| Dataset Splits | Yes | Note that all of the datasets consist of two data split strategies to create different distribution shifts except for CMNIST, following the well-established settings (Gui et al., 2022; Sui et al., 2023). |
| Hardware Specification | No | The paper discusses time complexity but does not specify any hardware details, such as GPU/CPU models or memory, used for the experiments. |
| Software Dependencies | No | The paper mentions "The Adam optimizer (Kingma & Ba, 2014)" but does not provide version numbers for any software or libraries. |
| Experiment Setup | Yes | The number of epochs for pretraining and finetuning is chosen from {50, 100, 200}. The Adam optimizer (Kingma & Ba, 2014) is adopted for gradient descent. The evaluation metric is accuracy for the Motif and CMNIST datasets and ROC-AUC for the Molbbbp and Molhiv datasets. The dimensionality of the representations d is chosen from {128, 256, 512}. The invariance regularizer coefficient λ is chosen from {10⁻⁴, 10⁻², 10⁰}. The number of disentangled channels K is chosen from {2, 3, 4, 5}. We report mean results and standard deviations of ten runs. (See the search-space sketch below.) |
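Because the datasets come from public benchmarks, the data-loading side of the setup can be reproduced even without the authors' code. Below is a minimal sketch for one of the named OGB datasets, assuming the `ogb` and `torch_geometric` packages; the paper's distribution shifts follow the GOOD benchmark's split strategies, which this sketch does not reproduce.

```python
# Hedged sketch: loads ogbg-molhiv through OGB's standard split interface.
# The paper's OOD splits follow GOOD (Gui et al., 2022) instead; this only
# illustrates the public-dataset side of the setup.
from ogb.graphproppred import PygGraphPropPredDataset
from torch_geometric.loader import DataLoader

dataset = PygGraphPropPredDataset(name="ogbg-molhiv")  # evaluated with ROC-AUC
split_idx = dataset.get_idx_split()  # dict with "train" / "valid" / "test" indices

train_loader = DataLoader(dataset[split_idx["train"]], batch_size=32, shuffle=True)
valid_loader = DataLoader(dataset[split_idx["valid"]], batch_size=32)
test_loader = DataLoader(dataset[split_idx["test"]], batch_size=32)
```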
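The Experiment Setup row also pins down the hyperparameter search space, even though the training code is unreleased. The sketch below enumerates that grid; the dictionary keys and the enumeration itself are illustrative assumptions, and only the candidate value sets come from the paper.

```python
# Hypothetical enumeration of the hyperparameter grid reported in the paper.
# Key names are illustrative; only the candidate values come from the text.
from itertools import product

search_space = {
    "epochs": [50, 100, 200],          # pretraining / finetuning epochs
    "hidden_dim": [128, 256, 512],     # representation dimensionality d
    "lambda_inv": [1e-4, 1e-2, 1e0],   # invariance regularizer coefficient λ
    "num_channels": [2, 3, 4, 5],      # number of disentangled channels K
}

configs = [dict(zip(search_space, values))
           for values in product(*search_space.values())]
print(f"{len(configs)} candidate configurations")  # 3 * 3 * 3 * 4 = 108

# Each selected configuration would be trained with Adam (Kingma & Ba, 2014);
# the paper reports the mean and standard deviation over ten runs.
```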