Out-of-Distribution Generalization of Federated Learning via Implicit Invariant Relationships

Authors: Yaming Guo, Kai Guo, Xiaofeng Cao, Tieru Wu, Yi Chang

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that FEDIIR significantly outperforms relevant baselines in terms of out-of-distribution generalization of federated learning. We validate the effectiveness of the proposed method using two scenarios: a small number of clients and a large number of clients (limited communication).
Researcher Affiliation | Academia | School of Artificial Intelligence, Jilin University, Changchun, China. Correspondence to: Xiaofeng Cao <xiaofengcao@jlu.edu.cn>, Tieru Wu <wutr@jlu.edu.cn>.
Pseudocode | Yes | A standard algorithm for solving (ERM) is FEDAVG (McMahan et al., 2017), whose pseudo-code is presented in Algorithm 1. (A minimal FedAvg sketch appears after the table.)
Open Source Code | Yes | Our code will be released at https://github.com/YamingGuo98/FedIIR.
Open Datasets | Yes | We conduct extensive experiments on four widely used datasets, including Rotated MNIST (Ghifary et al., 2015), VLCS (Fang et al., 2013), PACS (Li et al., 2017), and OfficeHome (Venkateswara et al., 2017).
Dataset Splits | Yes | Per common practice, we allocate 90% of the available data for training and 10% for validation.
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using "stochastic gradient descent (SGD)" but does not specify any software libraries, frameworks, or their version numbers.
Experiment Setup | Yes | For each dataset, we tune hyperparameters via grid search only in the scenario with a small number of clients and do not modify them for the scenario with a larger number of clients (see Appendix F.3). In all experiments, we train the global model with global step-size ηg = 1 for 100 communication rounds, and the local model on each client is trained with stochastic gradient descent (SGD) for one epoch. Table 3 in Appendix F.3 specifies: local step-size ηl (e.g., 1e-2), batch size (e.g., 64), regularization strength γ (e.g., 1e-2), number of rounds T (100), global step-size ηg (1), EMA coefficient υ (0.95), and seeds (0, 1, 2). (An illustrative configuration sketch appears after the table.)
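
The Pseudocode row points to Algorithm 1, the FEDAVG baseline. As a rough illustration of that baseline only, here is a minimal NumPy sketch of one FedAvg communication round; the parameter representation and the `local_update` callback are assumptions made for readability and do not reflect the authors' actual implementation.

```python
# Minimal FedAvg-style round, assuming model parameters are plain NumPy arrays and
# `local_update` is a caller-supplied function (e.g., one epoch of SGD on a client's data).
# All names are illustrative; none are taken from the FedIIR repository.
import numpy as np

def fedavg_round(global_params, client_datasets, local_update, global_lr=1.0):
    """One communication round: broadcast, local training, averaged server update."""
    client_updates = []
    for data in client_datasets:
        # Each client starts from a copy of the current global model.
        local_params = [p.copy() for p in global_params]
        local_params = local_update(local_params, data)
        # Record the client's parameter change relative to the global model.
        client_updates.append([lp - gp for lp, gp in zip(local_params, global_params)])
    # Server step: apply the average client delta, scaled by the global step-size eta_g.
    new_params = []
    for i, gp in enumerate(global_params):
        avg_delta = np.mean([delta[i] for delta in client_updates], axis=0)
        new_params.append(gp + global_lr * avg_delta)
    return new_params
```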
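
To make the Experiment Setup row concrete, the sketch below collects the hyperparameters quoted from Appendix F.3 into a single configuration. The values are those reported above; the dictionary keys are assumed names for illustration and are not the argument names used in the authors' code.

```python
# Hypothetical configuration mirroring the hyperparameters quoted from Appendix F.3.
config = {
    "rounds": 100,        # communication rounds T
    "global_lr": 1.0,     # global step-size eta_g
    "local_lr": 1e-2,     # local step-size eta_l (tuned per dataset via grid search)
    "batch_size": 64,
    "gamma": 1e-2,        # regularization strength gamma
    "ema": 0.95,          # exponential moving average coefficient upsilon
    "local_epochs": 1,    # one epoch of local SGD per client per round
    "seeds": (0, 1, 2),   # repeated runs
}
```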