Implicit Gradient Alignment in Distributed and Federated Learning

Authors: Yatin Dandi, Luis Barba, Martin Jaggi (pp. 6454-6462)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We experimentally validate the benefits of our algorithm in different distributed and federated learning settings." and "We empirically demonstrate that FedGA achieves better generalization than both FedAvg (McMahan et al. 2017a) and SCAFFOLD (Karimireddy et al. 2020)."
Researcher Affiliation | Academia | "¹IIT Kanpur, India; ²EPFL, Switzerland"
Pseudocode | Yes | Algorithm 1: GradAlign (GA) and Algorithm 2: Federated Gradient Alignment (a hedged, illustrative sketch of a gradient-alignment-style update appears after the table)
Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of open-source code for the described methodology.
Open Datasets | Yes | "We use the (balanced) EMNIST dataset (Cohen et al. 2017) consisting of 47 classes distributed among 47 clients, each receiving 2400 training examples." and "We use the CIFAR10 dataset (Krizhevsky, Hinton et al. 2009) consisting of 50000 training examples split among 10 classes, which are then distributed among 10 clients, each receiving 5000 training examples." (a client-partitioning sketch appears after the table)
Dataset Splits | No | The paper describes how data is distributed among clients and how clients are sampled, but it does not give explicit train/validation/test splits (as percentages or counts) for the datasets used.
Hardware Specification | Yes | "All experiments were performed using PyTorch on Tesla V100-SXM2 with 32GB of memory."
Software Dependencies | No | The paper mentions using PyTorch but does not specify its version number or any other software dependencies with their versions.
Experiment Setup | Yes | "We use a constant learning rate throughout all our experiments to illustrate, as has been done in several federated learning papers (McMahan et al. 2017b; Hsu, Qi, and Brown 2019; Khaled, Mishchenko, and Richtárik 2020; Liu et al. 2020). We also do not use batch normalization or momentum (neither server nor local momentum) in our experiments." and "For EMNIST, we use a (simple) CNN neural network architecture for our experiments with 2 convolutional layers followed by a fully connected layer. The exact description of the network can be found in the Appendix." (a hedged architecture and optimizer sketch appears after the table)
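
The pseudocode row above only names Algorithm 1 (GradAlign) and Algorithm 2 (Federated Gradient Alignment); the listings themselves are not reproduced here. As a rough illustration only, the sketch below shows one generic way an explicit gradient-alignment-style step can be written in PyTorch: the gradient on a second mini-batch is evaluated at parameters displaced along the first mini-batch's gradient, coupling the two gradients. The function name, the displacement size `eps`, the step combination, and the overall form are assumptions for illustration and are not taken from the paper's Algorithms 1 and 2.

```python
# Illustrative sketch only: a generic gradient-alignment-style update in which
# the gradient on batch B is evaluated at parameters displaced along the
# gradient on batch A. Names and constants are assumptions, not the paper's
# Algorithm 1/2.
import torch
import torch.nn.functional as F


def grad_align_step(model, batch_a, batch_b, lr=0.1, eps=0.01):
    """One hypothetical aligned update using two mini-batches (sketch)."""
    xa, ya = batch_a
    xb, yb = batch_b

    # Gradient on batch A at the current parameters.
    loss_a = F.cross_entropy(model(xa), ya)
    grads_a = torch.autograd.grad(loss_a, list(model.parameters()))

    # Temporarily displace the parameters along the batch-A gradient.
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads_a):
            p.sub_(eps * g)

    # Gradient on batch B at the displaced point.
    loss_b = F.cross_entropy(model(xb), yb)
    grads_b = torch.autograd.grad(loss_b, list(model.parameters()))

    # Undo the displacement, then apply the combined step.
    with torch.no_grad():
        for p, ga, gb in zip(model.parameters(), grads_a, grads_b):
            p.add_(eps * ga)               # restore original parameters
            p.sub_(lr * 0.5 * (ga + gb))   # combined update
    return loss_a.item(), loss_b.item()
```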
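The dataset row quotes the client partitioning: EMNIST across 47 clients with 2400 examples each, and CIFAR10 across 10 clients with 5000 examples each. Below is a minimal sketch of such an equal-sized split using torchvision and `torch.utils.data.Subset`; the random (IID) assignment and all variable names are assumptions, since the paper's exact client sampling is not quoted here.

```python
# Sketch (assumptions marked): partition CIFAR-10's 50,000 training examples
# evenly across 10 clients, 5,000 each, as in the quoted setup. Whether the
# assignment is IID or label-skewed is not specified here, so a simple random
# equal split is used for illustration.
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

train_set = datasets.CIFAR10(
    root="./data", train=True, download=True,
    transform=transforms.ToTensor(),
)

num_clients = 10
per_client = len(train_set) // num_clients  # 5,000 examples per client

# Shuffle indices once, then give each client a contiguous slice.
perm = torch.randperm(len(train_set))
client_datasets = [
    Subset(train_set, perm[i * per_client:(i + 1) * per_client].tolist())
    for i in range(num_clients)
]

assert all(len(d) == 5000 for d in client_datasets)
```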
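The experiment-setup row describes, for EMNIST, a simple CNN with two convolutional layers followed by a fully connected layer, trained with a constant learning rate and without batch normalization or momentum; the exact architecture is only given in the paper's appendix. The sketch below is one plausible instantiation under those constraints; the channel widths, kernel sizes, pooling, and the learning-rate value are assumptions.

```python
# Hedged sketch of a "2 conv layers + 1 fully connected layer" CNN for
# balanced EMNIST (47 classes, 28x28 grayscale). Channel widths, kernel
# sizes, pooling, and the constant learning rate are assumptions; the
# paper's exact architecture is in its appendix.
import torch
import torch.nn as nn


class SimpleCNN(nn.Module):
    def __init__(self, num_classes: int = 47):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5, padding=2),   # 28x28 -> 28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                               # -> 14x14
            nn.Conv2d(32, 64, kernel_size=5, padding=2),   # -> 14x14
            nn.ReLU(),
            nn.MaxPool2d(2),                               # -> 7x7
        )
        self.classifier = nn.Linear(64 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))


model = SimpleCNN()
# Constant learning rate and no momentum, matching the quoted setup
# (the value 0.1 itself is an assumption).
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.0)
```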