Architecture Agnostic Federated Learning for Neural Networks
Authors: Disha Makhija, Xing Han, Nhat Ho, Joydeep Ghosh
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The extensive experimental results demonstrate that the FedHeNN framework is capable of learning better performing models on clients in both the settings of homogeneous and heterogeneous architectures across clients. and We now present the effectiveness of the FedHeNN framework using empirical results on different datasets and models. |
| Researcher Affiliation | Academia | 1The University of Texas at Austin, Austin, Texas, USA. |
| Pseudocode | Yes | The algorithms for homogeneous and heterogeneous settings are described in Algorithm 1 and 2 respectively. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of its source code. |
| Open Datasets | Yes | We use datasets corresponding to these from the popular federated learning benchmark LEAF (Caldas et al., 2019). For the image classification task, we use CIFAR-10 and CIFAR-100 datasets that contain colored images in 10 and 100 classes respectively. And for the text classification task we use a binary classification dataset called Sentiment140. |
| Dataset Splits | Yes | We partition the entire data to generate non-iid samples on each client and then split those into training and test sets at the client site. and For hyperparameter tuning of methods, we utilise a global validation dataset which is not shared with any of the clients. (A partitioning sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | The hyperparameter η that controls the contribution of representation similarity in the objective function is kept as a function of t (the number of communication round). The base value of η0 is tuned as a hyperparameter and the best performance is obtained by keeping η0 = 0.001 for CIFAR-10 and CIFAR-100, and η0 = 0.01 for Sentiment140. The size of RAD [the representation alignment dataset] is an important parameter. The reported performance is obtained by keeping this size constant at 5000... The total number of communication rounds is kept constant at 200 for all algorithms and at each round only 10% of the clients are sampled and updated... the number of local epochs is set to 20. In each local update, we use SGD with momentum for training. (Sketches of the objective weighting and communication loop follow the table.) |
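
The quoted split description leaves the partitioning scheme unspecified. The sketch below is a minimal, hedged illustration of one common way to produce non-iid client shards and per-client train/test splits; the Dirichlet label-skew scheme, the `alpha` value, and the 80/20 split are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def dirichlet_noniid_partition(labels, num_clients, alpha=0.5, seed=0):
    """Assign example indices to clients with label skew drawn from a Dirichlet prior.

    The paper only states that data are partitioned non-iid across clients;
    this particular scheme and the alpha value are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        cls_idx = rng.permutation(np.where(labels == cls)[0])
        # Fraction of this class assigned to each client.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(cls_idx)).astype(int)
        for client_id, shard in enumerate(np.split(cls_idx, cuts)):
            client_indices[client_id].extend(shard.tolist())
    return client_indices

def local_train_test_split(indices, test_fraction=0.2, seed=0):
    """Split one client's indices into local train and test subsets."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(indices)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Applied to, say, the CIFAR-10 label vector, this yields one index list per client, each of which is then split locally, matching the quoted "split those into training and test sets at the client site."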
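
The setup row says only that η weights a representation-similarity term in each client's objective and that representations are computed on the RAD. As a rough, non-authoritative illustration of such a weighted objective, the snippet below uses a linear-CKA similarity between client and reference representations; the choice of CKA and the `reference_reps` argument are assumptions, not the paper's stated formulation.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear centered kernel alignment between two representation matrices (n x d)."""
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

def local_objective(task_loss, client_reps, reference_reps, eta_t):
    """Task loss plus an eta-weighted penalty for representation dissimilarity on the RAD."""
    return task_loss + eta_t * (1.0 - linear_cka(client_reps, reference_reps))
```

A similarity index over representations (rather than a distance over parameters) is one natural fit for the architecture-agnostic setting, since the two representation matrices may have different widths while still being compared on the same RAD examples.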
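
Finally, the quoted hyperparameters translate into the following communication-loop skeleton: 200 rounds, 10% client sampling per round, 20 local epochs, and an η that grows with the round index t from the tuned base η0. The linear η schedule and the `local_update`/`aggregate` callbacks are placeholders assumed for illustration; the paper's Algorithms 1 and 2 define the actual client and server steps.

```python
import random

TOTAL_ROUNDS = 200      # communication rounds (from the quoted setup)
CLIENT_FRACTION = 0.10  # 10% of clients sampled per round
LOCAL_EPOCHS = 20       # local epochs per selected client
ETA_0 = 0.001           # base value for CIFAR-10/100; 0.01 for Sentiment140
RAD_SIZE = 5000         # size of the representation alignment dataset

def eta_schedule(t, eta_0=ETA_0):
    """eta as a function of the round t; the linear form is an illustrative assumption."""
    return eta_0 * t

def run_federated_training(clients, local_update, aggregate, seed=0):
    """Sampling/communication skeleton around the quoted hyperparameters.

    `local_update` and `aggregate` are hypothetical callbacks standing in for the
    client-side SGD-with-momentum training and the server-side step of Algorithms 1/2.
    """
    rng = random.Random(seed)
    num_sampled = max(1, int(CLIENT_FRACTION * len(clients)))
    global_state = None
    for t in range(1, TOTAL_ROUNDS + 1):
        eta_t = eta_schedule(t)
        sampled = rng.sample(clients, num_sampled)
        updates = [
            local_update(client, global_state, eta=eta_t, epochs=LOCAL_EPOCHS)
            for client in sampled
        ]
        global_state = aggregate(global_state, updates)
    return global_state
```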