Enhancing One-Shot Federated Learning Through Data and Ensemble Co-Boosting
Authors: Rong Dai, Yonggang Zhang, Ang Li, Tongliang Liu, Xun Yang, Bo Han
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that Co-Boosting can substantially outperform existing baselines under various settings. Moreover, Co-Boosting eliminates the need for adjustments to the client's local training, requires no additional data or model transmission, and allows client models to have heterogeneous architectures. |
| Researcher Affiliation | Collaboration | Rong Dai 1,2, Yonggang Zhang 2, Ang Li 3, Tongliang Liu 4, Xun Yang 1, Bo Han 2. 1 University of Science and Technology of China; 2 TMLR Group, Hong Kong Baptist University; 3 ECE Department, University of Maryland College Park; 4 Sydney AI Centre, The University of Sydney |
| Pseudocode | Yes | Algorithm 1 Co-Boosting |
| Open Source Code | Yes | Code is available at https://github.com/rong-dai/Co-Boosting |
| Open Datasets | Yes | We conduct experiments on five real-world image datasets that are standard in the FL literature: MNIST (LeCun et al., 1998), FMNIST (Xiao et al., 2017), SVHN (Netzer et al., 2011), CIFAR10, and CIFAR100 (Krizhevsky et al., 2009). |
| Dataset Splits | Yes | For the FedDF method, we use 20% of the training set as a validation set for distillation. (See the hold-out split sketch below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper mentions model architectures (a CNN following the PyTorch tutorial (Paszke et al., 2019) and LeNet-5) and the SGD and Adam optimizers, but does not provide version numbers for the software dependencies or libraries, which would be necessary for full reproducibility. |
| Experiment Setup | Yes | For each client's local training, we use the SGD optimizer with momentum=0.9 and learning rate=0.01. We set the batch size to 128 and the number of local epochs to 300. The generator we use is the same as in Zhang et al. (2022a); Chen et al. (2019), and it is trained by the Adam optimizer with a learning rate ηg = 1e-3 over TG = 30 rounds. The distillation temperature used in the knowledge-distillation stage for the server model is set to 4, while the temperature used in the KL loss of the generator loss is set to 1. The perturbation strength is set to ϵ = 8/255 and the step size µ is set to 0.1/n. For the training of the server model f_S(·), we use the SGD optimizer with learning rate ηS = 0.01 and momentum=0.9. The number of total epochs T is set to 500. (See the configuration sketch below the table.) |
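
The FedDF baseline's 20% validation hold-out quoted in the Dataset Splits row can be reproduced along these lines. This is a minimal sketch, assuming CIFAR-10 and a fixed seed; the paper does not specify the splitting code itself.

```python
# Hypothetical hold-out split: 20% of the training set reserved as a
# distillation validation set for the FedDF baseline. The 20% figure comes
# from the paper; the dataset choice and seed are assumptions.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())

n_val = int(0.2 * len(train_set))    # 20% for distillation validation
n_train = len(train_set) - n_val     # remaining 80% for client training
train_subset, val_subset = random_split(
    train_set, [n_train, n_val],
    generator=torch.Generator().manual_seed(0),  # seed is an assumption
)
```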
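The hyperparameters quoted in the Experiment Setup row translate into optimizer settings roughly as follows. This is a sketch under stated assumptions: the model and generator definitions are placeholders rather than the authors' architectures, and the number of perturbation steps `n` is not given in the quoted text, so it is set to an illustrative value.

```python
# Hyperparameter wiring for the reported setup; all network definitions below
# are placeholders, not the authors' models.
import torch
from torch import nn

client_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder
generator    = nn.Sequential(nn.Linear(100, 3 * 32 * 32))               # placeholder
server_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder

# Client local training: SGD with lr=0.01, momentum=0.9; batch size 128; 300 local epochs.
client_opt = torch.optim.SGD(client_model.parameters(), lr=0.01, momentum=0.9)
BATCH_SIZE, LOCAL_EPOCHS = 128, 300

# Generator training: Adam with lr=1e-3 over T_G=30 rounds; KL temperature 1 in the generator loss.
gen_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
T_G, GEN_KL_TEMPERATURE = 30, 1.0

# Hard-sample perturbation: strength eps=8/255, step size mu=0.1/n.
EPSILON = 8 / 255
N_STEPS = 8                      # assumption: n is not specified in the quoted text
MU = 0.1 / N_STEPS

# Server model f_S: SGD with lr=0.01, momentum=0.9; T=500 epochs; distillation temperature 4.
server_opt = torch.optim.SGD(server_model.parameters(), lr=0.01, momentum=0.9)
TOTAL_EPOCHS, DISTILL_TEMPERATURE = 500, 4.0
```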