Fused Gromov-Wasserstein Graph Mixup for Graph-level Classifications

Authors: Xinyu Ma, Xu Chu, Yasha Wang, Yang Lin, Junfeng Zhao, Liantao Ma, Wenwu Zhu

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted on five datasets using both classic (MPNNs) and advanced (Graphormers) GNN backbones demonstrate that FGWMixup effectively improves the generalizability and robustness of GNNs.
Researcher Affiliation | Academia | ¹School of Computer Science, Peking University; ²Department of Computer Science and Technology, Tsinghua University; ³National Research and Engineering Center of Software Engineering, Peking University. maxinyu@pku.edu.cn, chu_xu@mail.tsinghua.edu.cn
Pseudocode | Yes | Algorithm 1 FGWMixup: Solving Eq. 2 with BCD Algorithm
Open Source Code | Yes | Codes are available at https://github.com/ArthurLeoM/FGWMixup.
Open Datasets | Yes | We evaluate our methods with five widely-used graph classification tasks from the graph benchmark dataset collection TUDataset [35]: NCI1 and NCI109 [36, 37] for small molecule classification, PROTEINS [38] for protein categorization, and IMDB-B and IMDB-M [5] for social network classification.
Dataset Splits | Yes | For all datasets, we randomly hold out a test set comprising 10% of the entire dataset and employ 10-fold cross-validation on the remaining data. We report the average and standard deviation of the test-set accuracy over the best models selected from the 10 folds. This setting is more realistic than reporting results from the validation sets of a simple 10-fold CV and allows a better understanding of generalizability [43].
Hardware Specification | Yes | The experiments in this work are conducted on two machines: one with 8 Nvidia RTX3090 GPUs and Intel Xeon E5-2680 CPUs, and one with 2 Nvidia RTX8000 GPUs and Intel Xeon Gold 6230 CPUs.
Software Dependencies | Yes | Our experiments are implemented with Python 3.9, PyTorch 1.11.0, Deep Graph Library (DGL) [44] 1.0.2, and Python Optimal Transport (POT) [45] 0.8.2.
Experiment Setup | Yes | We sample the mixup weight λ from the distribution Beta(0.2, 0.2), and the trade-off coefficient α between the structure and signal costs is tuned in {0.05, 0.5, 0.95}. ... The maximum number of iterations of FGWMixup is set to 200 for the outer loop optimizing X and A, and 300 for the inner loop optimizing the couplings π. The stopping criterion of our algorithm is the relative update of the optimization objective falling below a threshold, which is set to 5e-4. For FGWMixup, we select the step size γ of MD from {0.1, 1, 10}. The mixup ratio (i.e., the proportion of mixup samples to the original training samples) is set to 0.25. For the training of GNNs, MPNNs are trained for 400 epochs and Graphormers are trained for 300 epochs, both using the AdamW optimizer with a weight decay rate of 5e-4. The batch size of GNNs is chosen from {32, 128}, and the learning rate is chosen from {1e-3, 5e-4, 1e-4}. Dropout is employed with a fixed dropout rate of 0.5 to prevent overfitting.
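
To make the Pseudocode and Experiment Setup rows above more concrete: in the paper's formulation, the mixed graph of Eq. 2 is the minimizer of a λ-weighted sum of FGW distances to the two source graphs, i.e., essentially a two-graph FGW barycenter. The following is a minimal sketch of that computation, not the authors' released BCD solver: it delegates to POT's generic ot.gromov.fgw_barycenters routine, the function fgw_mixup_sketch and all variable names outside the POT API are hypothetical, and POT's alpha weights the structural term, which may not match the paper's convention.

    import numpy as np
    import ot  # Python Optimal Transport (POT) 0.8.2

    def fgw_mixup_sketch(A1, X1, A2, X2, lam, alpha=0.95, n_nodes=None):
        """Illustrative two-graph FGW-barycenter mixup (NOT the released BCD solver).

        A1, A2: (n_i, n_i) structure (adjacency) matrices
        X1, X2: (n_i, d) node feature matrices
        lam:    mixup weight, e.g. sampled from Beta(0.2, 0.2)
        alpha:  structure/feature trade-off (POT convention: weights the GW term)
        """
        if n_nodes is None:
            # choose the synthetic graph size, e.g. a lam-weighted average of the inputs
            n_nodes = int(round(lam * A1.shape[0] + (1 - lam) * A2.shape[0]))
        # uniform node weights on both source graphs
        p1, p2 = ot.unif(A1.shape[0]), ot.unif(A2.shape[0])
        X, C = ot.gromov.fgw_barycenters(
            N=n_nodes,
            Ys=[X1, X2], Cs=[A1, A2], ps=[p1, p2],
            lambdas=[lam, 1.0 - lam],   # mixup weights of the two source graphs
            alpha=alpha,
            max_iter=200,               # outer-loop budget quoted in the setup
            tol=5e-4,                   # stopping threshold quoted in the setup
        )
        return C, X  # mixed structure and node features of the synthetic graph

As in standard mixup, the label of the synthetic graph would be interpolated with the same weight, y = λ·y_i + (1 − λ)·y_j.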
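As a companion to the Open Datasets row, here is a hedged sketch of loading the five TUDataset benchmarks through DGL 1.0.2 (one of the listed dependencies). This loader is not taken from the released code; note that IMDB-B and IMDB-M are registered in the collection as IMDB-BINARY and IMDB-MULTI.

    from dgl.data import TUDataset

    # TUDataset names corresponding to the five benchmarks used in the paper.
    DATASET_NAMES = ["NCI1", "NCI109", "PROTEINS", "IMDB-BINARY", "IMDB-MULTI"]

    for name in DATASET_NAMES:
        dataset = TUDataset(name)   # downloads and caches the benchmark
        graph, label = dataset[0]   # each item is a (DGLGraph, label) pair
        print(f"{name}: {len(dataset)} graphs, "
              f"example graph with {graph.num_nodes()} nodes")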
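The Dataset Splits protocol (a fixed 10% held-out test set, then 10-fold cross-validation on the remaining 90%) could be reproduced along the following lines. This is an assumed scikit-learn implementation rather than the authors' code, and stratifying the splits by label is my assumption, not something stated in the text.

    import numpy as np
    from sklearn.model_selection import StratifiedKFold, train_test_split

    def make_splits(labels, seed=0):
        """Hold out 10% of all graphs as a test set, then 10-fold CV on the rest."""
        labels = np.asarray(labels)
        idx = np.arange(len(labels))
        trainval_idx, test_idx = train_test_split(
            idx, test_size=0.1, stratify=labels, random_state=seed)
        skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
        folds = [(trainval_idx[tr], trainval_idx[va])
                 for tr, va in skf.split(trainval_idx, labels[trainval_idx])]
        return folds, test_idx  # 10 (train, val) index pairs plus fixed test indices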
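Finally, the hyper-parameters quoted in the Experiment Setup row translate roughly into the PyTorch configuration below. The model is only a placeholder; the pieces taken from the text are the Beta(0.2, 0.2) mixup weight, the 0.25 mixup ratio, the α and γ search grids, the AdamW weight decay of 5e-4, the learning-rate and batch-size grids, and the fixed dropout rate of 0.5.

    import numpy as np
    import torch

    lam = np.random.beta(0.2, 0.2)     # mixup weight λ ~ Beta(0.2, 0.2)
    mixup_ratio = 0.25                 # mixup samples / original training samples
    alpha_grid = [0.05, 0.5, 0.95]     # structure/signal trade-off α
    gamma_grid = [0.1, 1, 10]          # step size γ of MD, per the quoted setup

    model = torch.nn.Linear(16, 2)     # placeholder for an MPNN / Graphormer backbone
    optimizer = torch.optim.AdamW(
        model.parameters(),
        lr=1e-3,                       # tuned over {1e-3, 5e-4, 1e-4}
        weight_decay=5e-4)
    dropout = torch.nn.Dropout(p=0.5)  # fixed dropout rate of 0.5
    # MPNNs: 400 epochs; Graphormers: 300 epochs; batch size chosen from {32, 128}.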