Out-of-Distribution Generalization of Federated Learning via Implicit Invariant Relationships
Authors: Yaming Guo, Kai Guo, Xiaofeng Cao, Tieru Wu, Yi Chang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that FEDIIR significantly outperforms relevant baselines in terms of out-of-distribution generalization of federated learning. We validate the effectiveness of the proposed method using two scenarios: a small number of clients and a large number of clients (limited communication). |
| Researcher Affiliation | Academia | School of Artificial Intelligence, Jilin University, Changchun, China. Correspondence to: Xiaofeng Cao <xiaofengcao@jlu.edu.cn>, Tieru Wu <wutr@jlu.edu.cn>. |
| Pseudocode | Yes | A standard algorithm for solving (ERM) is FEDAVG (McMahan et al., 2017), whose pseudocode is presented in Algorithm 1. (A minimal FedAvg sketch is given after this table.) |
| Open Source Code | Yes | Our code will be released at https://github.com/YamingGuo98/FedIIR. |
| Open Datasets | Yes | We conduct extensive experiments on four widely used datasets, including Rotated MNIST (Ghifary et al., 2015), VLCS (Fang et al., 2013), PACS (Li et al., 2017), and OfficeHome (Venkateswara et al., 2017). |
| Dataset Splits | Yes | Per common practice, we allocate 90% of the available data for training and 10% for validation. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using "stochastic gradient descent (SGD)" but does not specify any software libraries, frameworks, or their version numbers. |
| Experiment Setup | Yes | For each dataset, we only tune hyperparameters via grid search in the scenario with a small number of clients and do not modify them for the scenarios with a larger number of clients (see Appendix F.3). In all experiments, we train the global model using the global step-size ηg = 1 for 100 communication rounds, where the local model on each client is trained with stochastic gradient descent (SGD) for one epoch. Table 3 in Appendix F.3 specifies: local step-size ηl (e.g., 1e-2), batch size (e.g., 64), regularization strength γ (e.g., 1e-2), number of rounds T (100), global step-size ηg (1), EMA υ (0.95), and seeds (0, 1, 2). These values are collected in the configuration sketch below. |
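The pseudocode row refers to FedAvg (Algorithm 1 in the paper). The snippet below is a minimal, self-contained sketch of the FedAvg communication loop for orientation only; it is not the authors' released code, and the linear-model local update, function names, and toy data are illustrative assumptions.

```python
# Minimal FedAvg sketch (illustrative only, not the authors' released code).
# Each client runs local SGD on its own data; the server averages the
# resulting local models once per communication round.
import numpy as np

def local_sgd(w, X, y, lr=1e-2, epochs=1, batch_size=64):
    """One client's local update: plain SGD on a squared loss (assumed model)."""
    w = w.copy()
    n = len(y)
    for _ in range(epochs):
        idx = np.random.permutation(n)
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]
            grad = X[b].T @ (X[b] @ w - y[b]) / len(b)
            w -= lr * grad
    return w

def fedavg(clients, w0, rounds=100, global_lr=1.0):
    """Server loop: broadcast the global model, collect local models, average."""
    w = w0.copy()
    for _ in range(rounds):
        local_models = [local_sgd(w, X, y) for X, y in clients]
        avg = np.mean(local_models, axis=0)
        # Global step with step-size eta_g (eta_g = 1 recovers plain averaging).
        w = w + global_lr * (avg - w)
    return w

# Toy usage: 4 clients, each with its own synthetic regression data.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(200, 5)), rng.normal(size=200)) for _ in range(4)]
w_global = fedavg(clients, w0=np.zeros(5), rounds=10)
```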
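The Experiment Setup row lists the hyperparameters reported in Appendix F.3 (Table 3). The dictionary below simply collects those reported values in one place; the key names and layout are assumptions made for readability, not part of the paper or its released code.

```python
# Reported hyperparameters from the Experiment Setup row (values quoted from
# Table 3, Appendix F.3); the dictionary structure itself is an assumption.
CONFIG = {
    "local_step_size": 1e-2,   # eta_l, tuned per dataset via grid search
    "batch_size": 64,
    "gamma": 1e-2,             # regularization strength for the method
    "rounds": 100,             # communication rounds T
    "global_step_size": 1.0,   # eta_g
    "ema": 0.95,               # exponential moving average coefficient
    "seeds": [0, 1, 2],
    "local_epochs": 1,         # one local SGD epoch per round
}
```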