Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Learning Personalized Causally Invariant Representations for Heterogeneous Federated Clients

Authors: Xueyang Tang, Song Guo, Jie Zhang, Jingcai Guo

ICLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results on diverse datasets validate the superiority of FedSDR over the state-of-the-art PFL methods on OOD generalization performance.
Researcher Affiliation | Academia | Xueyang Tang1, Song Guo2, Jie Zhang1 & Jingcai Guo1; 1The Hong Kong Polytechnic University, 2The Hong Kong University of Science and Technology
Pseudocode | Yes | Algorithm 1 FedSDR: Federated Learning with Shortcut Discovery and Removal
Open Source Code | Yes | Code is available at https://github.com/Tangx-yy/FedSDR.
Open Datasets | Yes | Colored-MNIST (CMNIST) (Arjovsky et al., 2019), constructed from MNIST (LeCun et al., 1998); Colored Fashion-MNIST (CFMNIST) (Ahuja et al., 2020); WaterBird (Sagawa et al., 2019); PACS (Li et al., 2017).
Dataset Splits | Yes | The hyper-parameters of the competitors and our algorithm are tuned to make the accuracy on the validation environment (i.e., p^{e_val} = 0.10) as high as possible. We adopt the leave-one-domain-out strategy to evaluate the OOD generalization performance.
Hardware Specification | Yes | We simulate a set of clients and a centralized server on one deep learning workstation (Intel(R) Core(TM) i9-12900K CPU @ 3.20GHz with one NVIDIA GeForce RTX 3090 GPU).
Software Dependencies | No | The paper names PyTorch as the implementation framework but does not provide version numbers for PyTorch or any other software dependency.
Experiment Setup | Yes | The hyper-parameters of the competitors and our algorithm are tuned to make the accuracy on the validation environment (i.e., p^{e_val} = 0.10) as high as possible. The main hyper-parameters used in the evaluation are: global communication rounds T = 600; local iterations R = 10; personalized epochs for updating the personalized invariant predictors K = 10; local batch size B = 50; global learning rate β = 0.0001; personalized learning rate η = 0.0001; discrepancy threshold α = 1.0; balancing weights λ = 0.5 and γ = 1.4; optimizer: Adam.
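For convenience, the hyper-parameters quoted in the Experiment Setup row can be collected into a single configuration object. This is only an illustrative rendering of the reported values; the key names are shorthand invented here, not the authors' variable names.

```python
# Hyper-parameters as reported for the FedSDR evaluation.
# Key names are illustrative shorthand, not the authors' identifiers.
FEDSDR_CONFIG = {
    "global_rounds_T": 600,       # global communication rounds
    "local_iterations_R": 10,     # local iterations per round
    "personalized_epochs_K": 10,  # epochs for the personalized invariant predictors
    "local_batch_size_B": 50,
    "global_lr_beta": 1e-4,
    "personalized_lr_eta": 1e-4,
    "discrepancy_threshold_alpha": 1.0,
    "balancing_weight_lambda": 0.5,
    "balancing_weight_gamma": 1.4,
    "optimizer": "Adam",
}
```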
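The leave-one-domain-out strategy mentioned under Dataset Splits can be sketched as follows. This is a minimal illustration of the general protocol, assuming PACS-style named domains (the domain strings here are placeholders, not taken from the paper's code): each domain is held out once as the OOD test environment while the remaining domains are used for training.

```python
# Minimal sketch of leave-one-domain-out evaluation splits.
def leave_one_domain_out(domains):
    """Yield (train_domains, held_out_domain) pairs, holding out
    each domain once as the out-of-distribution test environment."""
    for held_out in domains:
        train = [d for d in domains if d != held_out]
        yield train, held_out

# Illustrative domain names in the style of PACS.
PACS_DOMAINS = ["photo", "art_painting", "cartoon", "sketch"]
splits = list(leave_one_domain_out(PACS_DOMAINS))
# Each of the 4 domains appears exactly once as the held-out test domain.
```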