Resource-Adaptive Federated Learning with All-In-One Neural Composition

Authors: Yiqun Mei, Pengfei Guo, Mo Zhou, Vishal M. Patel

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiment results on popular FL benchmarks demonstrate the effectiveness of our approach. To validate our approach, we conduct extensive experiments under both statistical data heterogeneity (IID and non-IID distribution) and system heterogeneity (static and dynamic) settings. We evaluate our method on commonly used datasets, i.e., Fashion-MNIST [23], CIFAR10 and CIFAR100 [22] for image classification, as well as Shakespeare [24] for next-character prediction."
Researcher Affiliation | Academia | "Yiqun Mei, Pengfei Guo, Mo Zhou, Vishal M. Patel, Johns Hopkins University, {ymei7,pguo4,mzhou32,vpatel36}@jhu.edu"
Pseudocode | Yes | "Algorithm 1: Federated Learning with All-In-One Neural Composition (FLANC)" (an illustrative composition sketch appears after the table)
Open Source Code | No | "[No] (3a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?"
Open Datasets | Yes | "In this paper, we evaluate our method for image classification tasks on three popular datasets with increasing complexity: Fashion-MNIST [23], CIFAR10 and CIFAR100 [22]. Fashion-MNIST is a relatively simple dataset containing 60,000 examples of 10 classes. CIFAR10 and CIFAR100 are common classification benchmarks with 50,000 training images and 10 and 100 classes, respectively. CIFAR100 is the most challenging dataset, as each class has only 500 images. Our approach is generally applicable not only to vision tasks but also to natural language processing tasks. To show this, we further conduct experiments on Shakespeare, a text dataset built from Shakespeare dialogues [35], where the task is next-character prediction."
Dataset Splits | Yes | "For the IID partition, we uniformly sample the same number of examples for each client. For the non-IID case, we assume label shift and distribute a subset of classes to each client. Specifically, the number of classes per client is set to 3 for Fashion-MNIST and CIFAR10, and 30 for CIFAR100. For Shakespeare, we follow the partition method in [24]. For hyper-parameter tuning, we split off 10% of the training examples as the validation set." (a partition sketch in code follows the table)
Hardware Specification | Yes | "We implement the proposed approach using PyTorch 1.8.2 on Nvidia A5000 GPUs."
Software Dependencies | Yes | "We implement the proposed approach using PyTorch 1.8.2 on Nvidia A5000 GPUs."
Experiment Setup | Yes | "For hyper-parameter tuning, we split off 10% of the training examples as the validation set. After selecting the parameters, the validation data are merged back into the training set, and we retrain the model for the final performance evaluation. For our method, the selection of λ, R1 and R2 depends on the architecture and task. We implement the proposed approach using PyTorch 1.8.2 on Nvidia A5000 GPUs. Detailed descriptions of hyper-parameters and training can be found in the Appendix." (a validation-split sketch follows the table)
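
To make the "Pseudocode" row above more concrete, here is a minimal, illustrative PyTorch sketch of the general idea behind all-in-one neural composition: every client, whatever its capacity, builds its layer weights from one shared basis plus capacity-specific coefficients, so all clients contribute to the same shared parameters. The class name, the weight shapes, and the use of R1/R2 as basis dimensions are assumptions made for illustration; this is not the paper's exact Algorithm 1.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ComposedLinear(nn.Module):
    """Illustrative layer whose weight is composed from a shared basis.

    `shared_basis` (R1 x R2) is meant to be common to every client and
    capacity level; `coeff_out`/`coeff_in` are capacity-specific.
    Hypothetical sketch, not the authors' exact formulation.
    """
    def __init__(self, shared_basis: nn.Parameter, out_features: int, in_features: int):
        super().__init__()
        r1, r2 = shared_basis.shape
        self.shared_basis = shared_basis
        self.coeff_out = nn.Parameter(torch.randn(out_features, r1) * 0.02)
        self.coeff_in = nn.Parameter(torch.randn(r2, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Compose the full weight on the fly: coefficients sandwich the basis.
        weight = self.coeff_out @ self.shared_basis @ self.coeff_in
        return F.linear(x, weight, self.bias)

# One shared basis serves sub-networks of different widths (capacities).
basis = nn.Parameter(torch.randn(16, 16) * 0.02)  # R1 = R2 = 16, chosen arbitrarily
low_capacity_layer = ComposedLinear(basis, out_features=32, in_features=64)
high_capacity_layer = ComposedLinear(basis, out_features=128, in_features=256)
```

In a federated round one would then aggregate the shared basis across all clients, while each capacity level's coefficients are aggregated only among clients of that capacity; that aggregation rule is likewise an assumption here, not a claim about the paper.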
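
The "Dataset Splits" row describes both partitions in enough detail to sketch them. Below is one plausible NumPy implementation of the IID split and the label-shift non-IID split (3 classes per client for Fashion-MNIST/CIFAR10, 30 for CIFAR100, per the quote above); the paper does not specify details such as whether client shards are disjoint, so treat this as an assumption-laden sketch.

```python
import numpy as np

def split_iid(labels: np.ndarray, num_clients: int, seed: int = 0):
    """IID partition: each client gets an equal-size uniform random share."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(labels))
    return np.array_split(idx, num_clients)

def split_label_shift(labels: np.ndarray, num_clients: int,
                      classes_per_client: int, seed: int = 0):
    """Non-IID partition via label shift: each client only sees a subset of classes."""
    rng = np.random.default_rng(seed)
    num_classes = int(labels.max()) + 1
    client_indices = []
    for _ in range(num_clients):
        chosen = rng.choice(num_classes, size=classes_per_client, replace=False)
        pool = np.where(np.isin(labels, chosen))[0]
        # Give every client the same number of examples, drawn from its class pool.
        take = min(len(pool), len(labels) // num_clients)
        client_indices.append(rng.choice(pool, size=take, replace=False))
    return client_indices
```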
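
Finally, the tune-then-retrain protocol from the "Experiment Setup" row, applied to one of the torchvision benchmarks named in the "Open Datasets" row, can be sketched as follows; the dataset choice, seed, and merge-back step shown here are illustrative assumptions rather than the paper's exact pipeline.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Load one of the benchmark datasets named in the paper (CIFAR10 here).
train_set = datasets.CIFAR10("data", train=True, download=True,
                             transform=transforms.ToTensor())

# Hold out 10% of the training examples as a validation set for tuning ...
num_val = len(train_set) // 10
tune_set, val_set = random_split(
    train_set, [len(train_set) - num_val, num_val],
    generator=torch.Generator().manual_seed(0))

# ... tune lambda, R1, R2 on (tune_set, val_set), then merge the validation
# data back and retrain on the full training set for the final evaluation.
final_train_set = train_set
```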