Faster Adaptive Federated Learning
Authors: Xidong Wu, Feihu Huang, Zhengmian Hu, Heng Huang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results on the language modeling task and image classification task with heterogeneous data demonstrate the efficiency of our algorithms. ... In this section, we evaluate our algorithms on a language modeling task and image classification tasks. |
| Researcher Affiliation | Academia | 1 Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, United States 2 College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China |
| Pseudocode | Yes | Algorithm 1: Naive adaptive FedAvg Algorithm ... Algorithm 2: FAFED Algorithm. A hedged sketch of an adaptive federated round follows the table. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | The WikiText-2 dataset is used in the experiment... We conduct image classification tasks on the Fashion-MNIST dataset as in (Nouiehed et al. 2019), MNIST dataset and CIFAR-10 dataset with 20 worker nodes in the network. |
| Dataset Splits | No | The Fashion-MNIST and MNIST datasets each include 60,000 training images and 10,000 testing images classified into 10 classes. Each image in both datasets is a 28 × 28 array of grayscale pixels. The CIFAR-10 dataset includes 50,000 training images and 10,000 testing images. The paper specifies training and testing image counts but does not explicitly mention or quantify a validation dataset split. A hedged data-partition sketch follows the table. |
| Hardware Specification | Yes | Experiments are implemented using PyTorch, and we run all experiments on CPU machines with 2.3 GHz Intel Core i9 as well as NVIDIA Tesla P40 GPU. |
| Software Dependencies | No | Experiments are implemented using PyTorch. The paper mentions PyTorch but does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | We used a batch size of 20 and an inner loop number q of 10 in the experiment, and set the dropout rate as 0.5. To avoid the case of an exploding gradient in LSTM, we also clip the gradients by norm 0.25. ... In FedAvg, SCAFFOLD, FedAdam, and FedAMS, we set the learning rate as 10. The global learning rate in SCAFFOLD is 1. ... In the STEM algorithm, we set κ as 20, w = σ = 1, and the step size is diminished in each epoch... In the FAFED algorithm, we set ρ_h as 1 and w = 1 and decrease the step size as in (5). In the FedAdam, FedAMS, STEM, and FAFED algorithms, the momentum parameters, such as α_t, β, β1, and β2, are chosen from the set {0.1, 0.9}. Their adaptive parameters τ or ρ are chosen as 0.01. ... We run a grid search for the step size, choosing it from the set {0.001, 0.01, 0.02, 0.05, 0.1}. ... The batch size b is in {5, 50, 100} and the inner loop number q in {5, 10, 20}. A hedged configuration summary follows the table. |
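
The pseudocode row names Algorithm 1 (naive adaptive FedAvg) and Algorithm 2 (FAFED) without reproducing them. Below is a minimal sketch of what a generic adaptive federated round can look like, in the FedAdam style that also appears among the paper's baselines: each worker runs a few local SGD steps, and the server treats the averaged model change as a pseudo-gradient for an Adam-like update. This is not the paper's Algorithm 1 or 2; the function names, default values, and the exact update form are assumptions for orientation only.

```python
# Hedged sketch of an adaptive federated round (FedAdam-style), not the paper's algorithm.
import copy
import torch
import torch.nn.functional as F


def local_sgd(global_model, loader, lr, steps):
    """One worker: copy the global model, run a few local SGD steps, return its weights."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    batches = iter(loader)
    for _ in range(steps):  # assumes the loader yields at least `steps` batches
        x, y = next(batches)
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
    return dict(model.named_parameters())


def adaptive_server_round(global_model, worker_loaders, local_lr=0.01, local_steps=10,
                          server_lr=0.01, beta1=0.9, beta2=0.99, tau=0.01, state=None):
    """Average worker deltas as a pseudo-gradient and apply an Adam-like server step.
    Defaults are illustrative; only tau = 0.01 is taken from the reported setup."""
    worker_params = [local_sgd(global_model, dl, local_lr, local_steps)
                     for dl in worker_loaders]
    if state is None:
        state = {k: (torch.zeros_like(p), torch.zeros_like(p))
                 for k, p in global_model.named_parameters()}
    with torch.no_grad():
        for k, p in global_model.named_parameters():
            # Mean model change across workers (already points in the descent direction).
            delta = sum(w[k] - p for w in worker_params) / len(worker_params)
            m, v = state[k]
            m = beta1 * m + (1 - beta1) * delta
            v = beta2 * v + (1 - beta2) * delta ** 2
            p.add_(server_lr * m / (v.sqrt() + tau))
            state[k] = (m, v)
    return state
```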
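
The dataset rows report MNIST, Fashion-MNIST, and CIFAR-10 with 20 worker nodes and heterogeneous data, but the partitioning scheme is not stated. The sketch below assumes torchvision downloads and a hypothetical label-sorted shard split, a common non-IID recipe; the shard count and random seed are illustrative, not from the paper.

```python
# Hedged sketch of the image-classification data setup: standard torchvision
# training sets split across 20 workers via a label-sorted shard partition.
import numpy as np
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

NUM_WORKERS = 20  # worker nodes reported in the experiments


def load_train_set(name, root="./data"):
    tfm = transforms.ToTensor()
    if name == "mnist":
        return datasets.MNIST(root, train=True, download=True, transform=tfm)        # 60,000 images
    if name == "fashion-mnist":
        return datasets.FashionMNIST(root, train=True, download=True, transform=tfm)  # 60,000 images
    if name == "cifar10":
        return datasets.CIFAR10(root, train=True, download=True, transform=tfm)       # 50,000 images
    raise ValueError(name)


def non_iid_loaders(dataset, batch_size=20, shards_per_worker=2, seed=0):
    """Sort indices by label, cut them into shards, and give each worker a few shards."""
    labels = np.asarray(dataset.targets)
    order = np.argsort(labels)
    shards = np.array_split(order, NUM_WORKERS * shards_per_worker)
    perm = np.random.default_rng(seed).permutation(len(shards))
    loaders = []
    for w in range(NUM_WORKERS):
        picked = perm[w * shards_per_worker:(w + 1) * shards_per_worker]
        idx = np.concatenate([shards[j] for j in picked]).tolist()
        loaders.append(DataLoader(Subset(dataset, idx), batch_size=batch_size, shuffle=True))
    return loaders
```

For example, `non_iid_loaders(load_train_set("cifar10"))` would yield 20 loaders over the 50,000 CIFAR-10 training images, each worker seeing only a few classes.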
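
The experiment-setup row can also be read as a single configuration sketch. The values below are transcribed from the quoted text; the key names and nesting are illustrative and not taken from the paper or any released code.

```python
# Hedged summary of the reported hyperparameters as a plain config dict.
EXPERIMENT_CONFIG = {
    "lstm_wikitext2": {
        "batch_size": 20,
        "inner_loop_q": 10,
        "dropout": 0.5,
        "grad_clip_norm": 0.25,
    },
    "baselines": {
        # FedAvg, SCAFFOLD, FedAdam, FedAMS learning rate as quoted above;
        # SCAFFOLD additionally uses a global learning rate of 1.
        "scaffold_global_lr": 1.0,
    },
    "stem": {"kappa": 20, "w": 1, "sigma": 1, "step_size": "diminished each epoch"},
    "fafed": {"rho_h": 1, "w": 1, "step_size": "decreased as in Eq. (5)"},
    "momentum_params_set": [0.1, 0.9],   # alpha_t, beta, beta1, beta2 chosen from this set
    "adaptive_param_tau_or_rho": 0.01,
    "grid_search": {
        "step_size": [0.001, 0.01, 0.02, 0.05, 0.1],
        "batch_size_b": [5, 50, 100],
        "inner_loop_q": [5, 10, 20],
    },
}
```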