HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients

Authors: Enmao Diao, Jie Ding, Vahid Tarokh

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We trained over 600 individual models for exploring and demonstrating the effectiveness of our method. We experimented with MNIST and CIFAR10 image classification tasks and the WikiText2 language modeling task (LeCun et al., 1998; Krizhevsky et al., 2009; Merity et al., 2016; Devlin et al., 2018).
Researcher Affiliation | Academia | Enmao Diao, Department of Electrical and Computer Engineering, Duke University, Durham, NC 27705, USA, enmao.diao@duke.edu; Jie Ding, School of Statistics, University of Minnesota-Twin Cities, Minneapolis, MN 55455, USA, dingj@umn.edu; Vahid Tarokh, Department of Electrical and Computer Engineering, Duke University, Durham, NC 27705, USA, vahid.tarokh@duke.edu
Pseudocode | Yes | We propose the complete pseudo-code for our HeteroFL framework in Algorithm 1. (An illustrative sketch of the aggregation idea appears after this table.)
Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We experimented with MNIST and CIFAR10 image classification tasks and the WikiText2 language modeling task (LeCun et al., 1998; Krizhevsky et al., 2009; Merity et al., 2016; Devlin et al., 2018).
Dataset Splits | No | The paper discusses training and testing data but does not explicitly describe how the datasets are split into training, validation, and test sets, nor does it mention a validation set.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU/GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions deep learning models and techniques but does not specify any software libraries or dependencies with version numbers.
Experiment Setup | Yes | The details regarding hyperparameters and model architecture can be found in Table 6 of the Appendix, which lists Local Epoch E, Local Batch Size B, Optimizer (SGD), Momentum, Weight decay, Learning rate η, Communication rounds, Decay schedule, Embedding Size, Number of heads, Dropout, and Sequence length. (A placeholder configuration skeleton mirroring these fields appears after this table.)
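
The Pseudocode row above points to Algorithm 1 of the paper, in which clients of different capacities train nested sub-models of different widths and the server aggregates them into one global model. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' released code: the helper names are invented, and it assumes each client trains the top-left slice of every global weight matrix determined by its capacity ratio, while the server averages each entry over the clients whose slices cover it.

```python
# Hypothetical sketch of HeteroFL-style aggregation of width-sliced
# sub-models (illustrative only, not the paper's Algorithm 1 verbatim).
import numpy as np

def client_slice(global_w: np.ndarray, ratio: float) -> np.ndarray:
    """Extract the nested sub-matrix a client with this capacity ratio trains."""
    rows = max(1, int(global_w.shape[0] * ratio))
    cols = max(1, int(global_w.shape[1] * ratio))
    return global_w[:rows, :cols].copy()

def aggregate(global_w: np.ndarray, updates: list) -> np.ndarray:
    """Average overlapping entries of heterogeneous-width client updates."""
    acc = np.zeros_like(global_w)
    cnt = np.zeros_like(global_w)
    for ratio, w_local in updates:  # updates: list of (ratio, trained slice)
        r, c = w_local.shape
        acc[:r, :c] += w_local
        cnt[:r, :c] += 1.0
    new_w = global_w.copy()
    covered = cnt > 0
    new_w[covered] = acc[covered] / cnt[covered]  # untrained entries keep old values
    return new_w

# Toy round with one full-width client and one half-width client.
W = np.random.randn(4, 4)
updates = [(1.0, client_slice(W, 1.0) + 0.1), (0.5, client_slice(W, 0.5) - 0.1)]
W = aggregate(W, updates)
```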
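
The Experiment Setup row cites the hyperparameter fields reported in Table 6 of the paper. The skeleton below only mirrors those field names to show how such a configuration might be recorded; every value is a placeholder (None), not a number taken from the paper.

```python
# Hypothetical configuration skeleton mirroring the fields of Table 6;
# all values are placeholders, not reported numbers.
config = {
    "local_epochs": None,         # E
    "local_batch_size": None,     # B
    "optimizer": "SGD",
    "momentum": None,
    "weight_decay": None,
    "learning_rate": None,        # eta
    "communication_rounds": None,
    "decay_schedule": None,
    # Fields relevant to the WikiText2 language modeling task
    "embedding_size": None,
    "num_heads": None,
    "dropout": None,
    "sequence_length": None,
}
```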