Federated Self-Explaining GNNs with Anti-shortcut Augmentations

Authors: Linan Yue, Qi Liu, Weibo Gao, Ye Liu, Kai Zhang, Yichao Du, Li Wang, Fangzhou Yao

ICML 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Experiments on real-world benchmarks and synthetic datasets validate the effectiveness of FedGR under the FL scenarios." |
| Researcher Affiliation | Collaboration | (1) State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, China; (2) Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China; (3) ByteDance, Hangzhou, China. |
| Pseudocode | Yes | "The overall training algorithm of FedGR with anti-shortcut augmentations is presented in Algorithm 1." |
| Open Source Code | Yes | "Code is available at https://github.com/yuelinan/Codes-of-FedGR." |
| Open Datasets | Yes | "For real-world datasets, we utilize the Open Graph Benchmark (OGB) (Hu et al., 2020) as datasets, including MolHIV, MolToxCast, MolBBBP, and MolSIDER." |
| Dataset Splits | Yes | "To ensure a fair evaluation, we first adopt the default scaffold splitting method in OGB to partition the datasets into training, validation, and test sets. Then, we employ the LDA to further distribute the training set to 4 clients with γ = 4, where all clients share the same test set." A partition sketch follows the table. |
| Hardware Specification | Yes | "All methods, including the FedGR approach and other baselines, are trained on a single A100 GPU with 5 different random seeds." |
| Software Dependencies | No | "During the training process, we employ the Adam optimizer (Kingma & Ba, 2014)..." The paper names the Adam optimizer but does not specify software dependencies with version numbers (e.g., Python or PyTorch versions). |
| Experiment Setup | Yes | "In all experimental settings, the values of the hyperparameters λsp, λc, λe and λd are uniformly set to 0.01, 1.0, 1.0 and 1.0, respectively. The hidden dimensionality d is 32 for the Spurious-Motif dataset, and 128 for the OGB dataset. The original node feature dimensionality dg is 4 for the Spurious-Motif dataset, and 9 for the OGB dataset. During the training process, we employ the Adam optimizer (Kingma & Ba, 2014) with a learning rate initialized as 1e-2 for the Spurious-Motif, and 1e-3 for the OGB dataset. We set the predefined sparsity α as 0.1 for MolHIV, 0.5 for MolSIDER, MolToxCast and MolBBBP, and 0.4 for other datasets. The communication round T is 20 and the epoch in each communication is 10, for a total of 200 iterations." These values are consolidated in the config sketch below. |
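
For concreteness, the client partition described under Dataset Splits can be reproduced along these lines. This is a minimal sketch, assuming the paper's "LDA" split is the label-wise Dirichlet partition commonly used in federated learning with concentration γ = 4; the function name `dirichlet_partition` and the toy labels are our own assumptions, not the paper's code.

```python
import numpy as np

def dirichlet_partition(labels, n_clients=4, gamma=4.0, seed=0):
    """Split training indices across clients with a label-wise
    Dirichlet distribution (concentration gamma). Larger gamma
    gives a more uniform (closer to IID) split; smaller gamma
    gives stronger label skew across clients."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        # Proportion of class-c samples assigned to each client.
        props = rng.dirichlet(np.full(n_clients, gamma))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return [np.array(ci) for ci in client_indices]

# Example: 4 clients over a toy binary-labelled training set.
# Per the paper, only the training set is partitioned; all clients
# share the same test set.
train_labels = np.random.default_rng(1).integers(0, 2, size=1000)
splits = dirichlet_partition(train_labels, n_clients=4, gamma=4.0)
print([len(s) for s in splits])
```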
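The Experiment Setup row consolidates naturally into a single configuration. In the sketch below, only the numeric values come from the paper; the dictionary layout, key names, `make_optimizer` helper, and seed values are illustrative assumptions (the paper reports 5 seeds but not their values).

```python
import torch

# Hyperparameters as reported in the paper's experiment setup.
CONFIG = {
    "lambda_sp": 0.01, "lambda_c": 1.0, "lambda_e": 1.0, "lambda_d": 1.0,
    "hidden_dim":    {"spurious_motif": 32,   "ogb": 128},
    "node_feat_dim": {"spurious_motif": 4,    "ogb": 9},
    "lr":            {"spurious_motif": 1e-2, "ogb": 1e-3},
    # Predefined sparsity alpha per dataset.
    "alpha": {"molhiv": 0.1, "molsider": 0.5, "moltoxcast": 0.5,
              "molbbbp": 0.5, "default": 0.4},
    "rounds": 20,        # communication rounds T
    "local_epochs": 10,  # epochs per round -> 200 iterations total
    "n_clients": 4,
    "seeds": [0, 1, 2, 3, 4],  # 5 seeds; values are placeholders
}

def make_optimizer(model, dataset_family):
    # Adam, with the learning rate chosen by dataset family
    # ("spurious_motif" or "ogb").
    return torch.optim.Adam(model.parameters(),
                            lr=CONFIG["lr"][dataset_family])
```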