Federated Self-Explaining GNNs with Anti-shortcut Augmentations
Authors: Linan Yue, Qi Liu, Weibo Gao, Ye Liu, Kai Zhang, Yichao Du, Li Wang, Fangzhou Yao
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on real-world benchmarks and synthetic datasets validate the effectiveness of FedGR under the FL scenarios. |
| Researcher Affiliation | Collaboration | 1State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China, Hefei, China. 2Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China. 3ByteDance, Hangzhou, China. |
| Pseudocode | Yes | The overall training algorithm of FedGR with anti-shortcut augmentations is presented in Algorithm 1. |
| Open Source Code | Yes | Code is available at https://github.com/yuelinan/Codes-of-FedGR. |
| Open Datasets | Yes | For real-world datasets, we utilize the Open Graph Benchmark (OGB) (Hu et al., 2020) as datasets, including MolHIV, MolToxCast, MolBBBP, and MolSIDER. |
| Dataset Splits | Yes | To ensure a fair evaluation, we first adopt the default scaffold splitting method in OGB to partition the datasets into training, validation, and test sets. Then, we employ LDA to further distribute the training set to 4 clients with γ = 4, where all clients share the same test set. (A partition sketch is provided after the table.) |
| Hardware Specification | Yes | All methods, including the FedGR approach and other baselines, are trained on a single A100 GPU with 5 different random seeds. |
| Software Dependencies | No | During the training process, we employ the Adam optimizer (Kingma & Ba, 2014)... The paper mentions using the Adam optimizer but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or other library versions). |
| Experiment Setup | Yes | In all experimental settings, the values of the hyperparameters λsp, λc, λe and λd are uniformly set to 0.01, 1.0, 1.0 and 1.0, respectively. The hidden dimensionality d is 32 for the Spurious-Motif dataset, and 128 for the OGB dataset. The original node feature dimensionality dg is 4 for the Spurious-Motif dataset, and 9 for the OGB dataset. During the training process, we employ the Adam optimizer (Kingma & Ba, 2014) with a learning rate initialized as 1e-2 for the Spurious-Motif, and 1e-3 for the OGB dataset. We set the predefined sparsity α as 0.1 for MolHIV, 0.5 for MolSIDER, MolToxCast and MolBBBP, and 0.4 for other datasets. The communication round T is 20 and the epoch in each communication is 10, for a total of 200 iterations. (A training-schedule sketch is provided after the table.) |
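
The LDA split quoted in the Dataset Splits row is a Dirichlet-based label-skew partition of the training set across 4 clients with concentration parameter γ = 4, with all clients sharing the same test set. The sketch below illustrates one common way to realize such a partition; the function name `dirichlet_partition`, the NumPy implementation, and the per-class splitting strategy are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=4, gamma=4.0, seed=0):
    """Split sample indices into `num_clients` subsets whose per-class
    proportions are drawn from a Dirichlet(gamma) distribution, mimicking
    the LDA-style non-IID client split described in the paper."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        # Per-class proportions over clients; larger gamma -> closer to uniform.
        proportions = rng.dirichlet(np.full(num_clients, gamma))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, shard in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(shard.tolist())
    return [np.array(ci) for ci in client_indices]

# Example: distribute a toy binary-labelled training set across 4 clients.
toy_labels = np.random.default_rng(0).integers(0, 2, size=1000)
clients = dirichlet_partition(toy_labels, num_clients=4, gamma=4.0)
print([len(c) for c in clients])
```

The validation and test sets are not partitioned: only the training set is distributed, and every client evaluates on the shared test set.
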
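The Experiment Setup row fixes a schedule of T = 20 communication rounds with 10 local epochs each (200 iterations in total), Adam with a learning rate of 1e-3 on OGB (1e-2 on Spurious-Motif), and 4 clients. The skeleton below shows how that schedule composes with plain FedAvg-style parameter averaging; the helper names (`fedavg`, `run_federated_training`, `local_train`) and the uniform averaging rule are assumptions for illustration, and the FedGR-specific losses (weighted by λsp, λc, λe, λd) are only indicated by a comment.

```python
import copy
import torch

# Hyperparameters from the Experiment Setup row (OGB settings unless noted).
CONFIG = {
    "lambda_sp": 0.01, "lambda_c": 1.0, "lambda_e": 1.0, "lambda_d": 1.0,
    "hidden_dim": 128,        # 32 for Spurious-Motif
    "lr": 1e-3,               # 1e-2 for Spurious-Motif
    "sparsity_alpha": 0.5,    # 0.1 for MolHIV, 0.4 for the other datasets
    "rounds": 20,             # communication rounds T
    "local_epochs": 10,       # local epochs per round (20 * 10 = 200 iterations)
    "num_clients": 4,
}

def fedavg(client_states):
    """Uniformly average client parameters (plain FedAvg aggregation)."""
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        stacked = torch.stack([s[key].float() for s in client_states])
        avg[key] = stacked.mean(0).to(avg[key].dtype)
    return avg

def run_federated_training(make_model, client_loaders, local_train):
    """Skeleton of the communication schedule: T rounds of local training
    on each client followed by server-side parameter averaging."""
    global_model = make_model()
    for _ in range(CONFIG["rounds"]):
        states = []
        for loader in client_loaders:
            local = copy.deepcopy(global_model)
            opt = torch.optim.Adam(local.parameters(), lr=CONFIG["lr"])
            for _ in range(CONFIG["local_epochs"]):
                local_train(local, loader, opt)  # FedGR losses would be applied here
            states.append(local.state_dict())
        global_model.load_state_dict(fedavg(states))
    return global_model
```

Deep-copying the global model before each client's local epochs keeps the clients' updates independent within a round, which matches the reported schedule of local training followed by a single aggregation per communication round.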