Learning Personalized Causally Invariant Representations for Heterogeneous Federated Clients

Authors: Xueyang Tang, Song Guo, Jie Zhang, Jingcai Guo

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results on diverse datasets validate the superiority of FedSDR over the state-of-the-art PFL methods on OOD generalization performance.
Researcher Affiliation | Academia | Xueyang Tang (1), Song Guo (2), Jie Zhang (1) & Jingcai Guo (1); (1) The Hong Kong Polytechnic University, (2) The Hong Kong University of Science and Technology.
Pseudocode | Yes | Algorithm 1 FedSDR: Federated Learning with Shortcut Discovery and Removal.
Open Source Code | Yes | Code is available at https://github.com/Tangx-yy/FedSDR.
Open Datasets | Yes | Colored-MNIST (CMNIST) (Arjovsky et al., 2019) is constructed based on MNIST (LeCun et al., 1998); Colored Fashion-MNIST (CFMNIST) (Ahuja et al., 2020); WaterBird (Sagawa et al., 2019); PACS (Li et al., 2017). A sketch of the standard CMNIST construction follows the table.
Dataset Splits | Yes | The hyper-parameters of the competitors and our algorithm are tuned to make the accuracy on the validation environment (i.e., p_e^val = 0.10) as high as possible. Also: We adopt the leave-one-domain-out strategy to evaluate the OOD generalization performance. A sketch of this split follows the table.
Hardware Specification | Yes | We simulate a set of clients and a centralized server on one deep learning workstation (Intel(R) Core(TM) i9-12900K CPU @ 3.20GHz with one NVIDIA GeForce RTX 3090 GPU).
Software Dependencies | No | The paper mentions 'PyTorch' as the implementation framework but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | The hyper-parameters of the competitors and our algorithm are tuned to make the accuracy on the validation environment (i.e., p_e^val = 0.10) as high as possible. The main hyper-parameters used in the evaluation are: global communication rounds T = 600; local iterations R = 10; personalized epochs to update the personalized invariant predictors K = 10; local batch size B = 50; global learning rate β = 0.0001; personalized learning rate η = 0.0001; discrepancy threshold α = 1.0; balancing weights λ = 0.5 and γ = 1.4; optimizer: Adam. These values are collected into a config sketch after the table.
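
The Open Datasets row says CMNIST is constructed from MNIST following Arjovsky et al. (2019). Below is a minimal sketch of that standard recipe, assuming torchvision's MNIST loader; the function name make_cmnist_env, the two-channel color encoding, and the 25% label-noise rate come from the original IRM construction, not from details confirmed in this paper. Per the Dataset Splits row, the validation environment here would correspond to p_e = 0.10.

import torch
from torchvision import datasets

def make_cmnist_env(root, p_e, label_noise=0.25, train=True):
    """Colored-MNIST environment in the style of Arjovsky et al. (2019).

    p_e is the probability that the color DISAGREES with the (noisy) binary
    label, so a small p_e yields a strong spurious color-label correlation.
    """
    mnist = datasets.MNIST(root, train=train, download=True)
    images = mnist.data.float() / 255.0           # (N, 28, 28) grayscale digits
    labels = (mnist.targets >= 5).float()         # binarize: 0-4 -> 0, 5-9 -> 1

    # Flip a fraction of labels so that color can become more predictive
    # of the (noisy) label than digit shape is.
    noise = torch.bernoulli(torch.full_like(labels, label_noise))
    labels = (labels - noise).abs()               # XOR on {0, 1}

    # The color bit agrees with the noisy label except with probability p_e.
    color = (labels - torch.bernoulli(torch.full_like(labels, p_e))).abs()

    # Build a two-channel image and zero out the channel the color bit rejects.
    x = torch.stack([images, images], dim=1)      # (N, 2, 28, 28)
    x[torch.arange(len(x)), (1 - color).long()] = 0.0
    return x, labels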
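
The Dataset Splits row also mentions the leave-one-domain-out strategy, the standard protocol for PACS: each domain serves once as the held-out OOD test set while the model is trained on the remaining domains, and results are averaged over the held-out runs. A minimal sketch of the split logic; the domain names are the standard PACS domains, and the loop body is a placeholder rather than code from the FedSDR repository.

# The four PACS domains (Li et al., 2017).
DOMAINS = ["photo", "art_painting", "cartoon", "sketch"]

def leave_one_domain_out(domains):
    """Yield (train_domains, held_out) pairs, holding out each domain once."""
    for held_out in domains:
        yield [d for d in domains if d != held_out], held_out

for train_domains, held_out in leave_one_domain_out(DOMAINS):
    # Train on the three remaining domains, then report OOD accuracy on
    # the held-out domain; the final metric averages over the four runs.
    print(f"train on {train_domains}, evaluate OOD on {held_out!r}")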
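
Finally, the hyper-parameters quoted in the Experiment Setup row, gathered into one configuration block for convenience. The values are exactly those listed above; the key names are illustrative and may not match those used in the FedSDR code base.

# Values taken verbatim from the Experiment Setup row; key names are
# illustrative, not necessarily those used in the FedSDR repository.
FEDSDR_HYPERPARAMS = {
    "global_rounds": 600,          # T: global communication rounds
    "local_iterations": 10,        # R: local iterations per round
    "personalized_epochs": 10,     # K: epochs for personalized invariant predictors
    "local_batch_size": 50,        # B
    "global_lr": 1e-4,             # beta: global learning rate
    "personalized_lr": 1e-4,       # eta: personalized learning rate
    "discrepancy_threshold": 1.0,  # alpha
    "balance_lambda": 0.5,         # lambda: balancing weight
    "balance_gamma": 1.4,          # gamma: balancing weight
    "optimizer": "Adam",
}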