Data-Free One-Shot Federated Learning Under Very High Statistical Heterogeneity

Authors: Clare Elizabeth Heinbaugh, Emilio Luz-Ricca, Huajie Shao

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Thorough experiments on multiple benchmark datasets (MNIST, Fashion MNIST, SVHN) demonstrate the superiority of FEDCVAE-ENS and FEDCVAE-KD over other relevant one-shot FL methods in the high statistical heterogeneity setting.
Researcher Affiliation | Academia | Clare Heinbaugh, Emilio Luz-Ricca, Huajie Shao, Department of Computer Science, William & Mary, VA, USA. {ceheinbaugh, eluzricca, hshao}@wm.edu
Pseudocode | Yes | Algorithm 1: FEDCVAE-KD in the one-shot FL setting. T_L represents the number of local training epochs. The server decoder parameters are θ_S, with KD training epochs T_KD, number of KD training samples n_D, KD loss ℓ_KD(·), and KD learning rate η_KD. The server classifier parameters are w_C^S, with training epochs T_C, number of training samples n_C, classification loss ℓ_C(·), and learning rate η_C. C is the set of clients. (A hedged sketch of this server-side step appears after the table.)
Open Source Code | Yes | The code used to implement our proposed methods and carry out all experiments is included in the following public repository: https://github.com/ceh-2000/fed_cvae.
Open Datasets | Yes | To validate FEDCVAE-ENS and FEDCVAE-KD, we conduct experiments on three image datasets that are standard in the FL literature: MNIST (Lecun et al., 1998), Fashion MNIST (Xiao et al., 2017), and SVHN (Netzer et al., 2011).
Dataset Splits | No | The paper specifies the number of train and test samples for each dataset (e.g., "MNIST and Fashion MNIST contain... 60,000 train samples and 10,000 test samples"), but it does not explicitly define or quantify a separate validation split.
Hardware Specification | No | The paper describes the experimental setup and training process but does not provide specific details about the hardware used (e.g., GPU/CPU models, memory, or cloud instances).
Software Dependencies | No | The paper mentions using Adam as the optimizer but does not specify any programming languages or software libraries with version numbers (e.g., Python, PyTorch, or TensorFlow versions) used for the implementation.
Experiment Setup | Yes | Unless otherwise stated, we use m = 10 clients, α = 0.01 (very heterogeneous), and report average test accuracy across 5 seeded parameter initializations ± one standard deviation. [...] Full hyperparameter settings can be found in Table 3 and Table 4 (Appendix C). Table 3 lists the shared hyperparameters: local learning rate (η) 0.001, classifier optimizer Adam, batch size (all) 32. Table 4 provides dataset-specific hyperparameters for local epochs, CVAE latent dimension, server classifier epochs, server decoder epochs, and truncated normal width. (A sketch of Dirichlet label partitioning follows the table.)
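
The Pseudocode row above names the quantities in Algorithm 1 (FEDCVAE-KD). As a reading aid, below is a minimal PyTorch sketch of the server-side step those symbols describe: distill the uploaded client decoders into a server decoder (parameters θ_S), then train the server classifier (parameters w_C^S) on samples generated by the distilled decoder. The network architectures (SimpleDecoder, SimpleClassifier), the standard-normal latent prior, the MSE choice for ℓ_KD, and the unweighted label-distribution aggregation are illustrative assumptions, not the authors' implementation; the actual code is in the linked repository.

```python
# Hedged sketch of the FEDCVAE-KD server step; architectures, latent prior, and
# loss choices are assumptions (see https://github.com/ceh-2000/fed_cvae for the
# authors' implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, NUM_CLASSES, IMG_DIM = 10, 10, 28 * 28  # assumed MNIST-like sizes

class SimpleDecoder(nn.Module):
    """Hypothetical conditional decoder p(x | z, y) used by clients and the server."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM + NUM_CLASSES, 256), nn.ReLU(),
            nn.Linear(256, IMG_DIM), nn.Sigmoid(),
        )
    def forward(self, z, y_onehot):
        return self.net(torch.cat([z, y_onehot], dim=1))

class SimpleClassifier(nn.Module):
    """Hypothetical server classifier trained on synthetic samples."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(IMG_DIM, 256), nn.ReLU(),
                                 nn.Linear(256, NUM_CLASSES))
    def forward(self, x):
        return self.net(x)

def server_update(client_decoders, client_label_dists,
                  T_KD=10, n_D=5000, eta_KD=1e-3,
                  T_C=10, n_C=5000, eta_C=1e-3, batch_size=32):
    """One-shot server step: distill client decoders into a server decoder (theta_S),
    then train the server classifier (w_C^S) on samples from the distilled decoder."""
    server_decoder = SimpleDecoder()
    opt_D = torch.optim.Adam(server_decoder.parameters(), lr=eta_KD)
    m = len(client_decoders)

    # Knowledge distillation: match a teacher (client) decoder's output on
    # latent/label samples drawn from that client's label distribution.
    for _ in range(T_KD):
        for _ in range(n_D // batch_size):
            k = torch.randint(m, (1,)).item()
            y = torch.multinomial(client_label_dists[k], batch_size, replacement=True)
            y_onehot = F.one_hot(y, NUM_CLASSES).float()
            z = torch.randn(batch_size, LATENT_DIM)        # assumed latent prior
            with torch.no_grad():
                target = client_decoders[k](z, y_onehot)   # teacher generation
            loss_KD = F.mse_loss(server_decoder(z, y_onehot), target)  # assumed l_KD
            opt_D.zero_grad(); loss_KD.backward(); opt_D.step()

    # Train the server classifier on data generated by the distilled decoder.
    classifier = SimpleClassifier()
    opt_C = torch.optim.Adam(classifier.parameters(), lr=eta_C)
    # Aggregated label distribution (simple average here; the paper may weight
    # by client dataset sizes).
    global_dist = torch.stack(client_label_dists).mean(dim=0)
    for _ in range(T_C):
        for _ in range(n_C // batch_size):
            y = torch.multinomial(global_dist, batch_size, replacement=True)
            y_onehot = F.one_hot(y, NUM_CLASSES).float()
            z = torch.randn(batch_size, LATENT_DIM)
            with torch.no_grad():
                x_syn = server_decoder(z, y_onehot)
            loss_C = F.cross_entropy(classifier(x_syn), y)  # l_C
            opt_C.zero_grad(); loss_C.backward(); opt_C.step()
    return server_decoder, classifier
```

Roughly speaking, the paper's ensemble variant (FEDCVAE-ENS) skips the distillation loop and instead generates the classifier's training data directly from the uploaded client decoders.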
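
The Experiment Setup row reports α = 0.01 (very heterogeneous), which in this literature typically parameterizes a Dirichlet partition of labels across clients. The sketch below illustrates such a partition under assumed details (independent per-class Dirichlet draws, no rebalancing of nearly empty clients); the exact procedure used by the paper is in the authors' repository.

```python
# Minimal sketch of Dirichlet-based label partitioning for simulating statistical
# heterogeneity; smaller alpha gives each client a more skewed label mix.
import numpy as np

def dirichlet_partition(labels, num_clients=10, alpha=0.01, seed=0):
    """Split sample indices across clients; returns one index list per client."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    num_classes = labels.max() + 1
    client_indices = [[] for _ in range(num_clients)]
    for c in range(num_classes):
        idx_c = rng.permutation(np.where(labels == c)[0])
        # Per-client proportions for this class, drawn from Dirichlet(alpha).
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Convert proportions to split points over this class's samples.
        cuts = (np.cumsum(proportions)[:-1] * len(idx_c)).astype(int)
        for client_id, shard in enumerate(np.split(idx_c, cuts)):
            client_indices[client_id].extend(shard.tolist())
    return client_indices

# Example with random labels standing in for a real dataset's label array:
fake_labels = np.random.randint(0, 10, size=60000)
parts = dirichlet_partition(fake_labels, num_clients=10, alpha=0.01)
print([len(p) for p in parts])  # at alpha = 0.01, most clients hold only a class or two
```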