Byzantine Machine Learning Made Easy By Resilient Averaging of Momentums
Authors: Sadegh Farhadkhani, Rachid Guerraoui, Nirupam Gupta, Rafael Pinot, John Stephan
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also present an empirical evaluation of the practical relevance of RESAM. We report on a comprehensive set of experiments evaluating RESAM on benchmark image classification tasks: MNIST, Fashion-MNIST, and CIFAR-10. |
| Researcher Affiliation | Academia | Sadegh Farhadkhani 1 Rachid Guerraoui 1 Nirupam Gupta 1 Rafael Pinot 1 John Stephan 1 1Distributed Computing Laboratory (DCL), School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. |
| Pseudocode | Yes | Algorithm 1: Distributed SGD using distributed momentum and an (f, λ)-resilient averaging rule F *(a minimal code sketch follows the table)* |
| Open Source Code | Yes | Additional plots and code base to reproduce our experiments are available in the supplementary material. Our implementation will also be made accessible online. |
| Open Datasets | Yes | We use MNIST (Le Cun & Cortes, 2010), Fashion MNIST (Xiao et al., 2017), and CIFAR-10 (Krizhevsky et al., 2009). |
| Dataset Splits | No | The paper uses standard datasets (MNIST, Fashion-MNIST, CIFAR-10) that come with predefined train/test splits, and it reports 'top-1 cross-accuracy' for evaluation, but the provided text does not explicitly state split percentages or sample counts for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies (e.g., library or framework names with version numbers) used for the experiments. |
| Experiment Setup | Yes | For MNIST and Fashion-MNIST... we use a constant learning rate γ = 0.75, and a clipping parameter C = 2. We also add an ℓ2-regularization factor of 10⁻⁴. Finally, we use a mini-batch size of b = 25. For CIFAR-10... We set n = 25, γ = 0.25, C = 5, and b = 50. ... Finally, we vary the momentum coefficient β in {0, 0.6, 0.8, 0.9, 0.99, 0.999}. *(restated as a config sketch below)* |
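As a companion to the Pseudocode row above, here is a minimal sketch of distributed SGD with per-worker momentum aggregated by a resilient averaging rule. It assumes coordinate-wise trimmed mean as the (f, λ)-resilient rule F (one of several rules the paper covers), a common exponential-moving-average momentum form, and an in-process simulation of honest workers only; `trimmed_mean`, `resilient_momentum_sgd`, and the worker interface are illustrative names, not the authors' code.

```python
import numpy as np

def trimmed_mean(vectors, f):
    """Coordinate-wise trimmed mean: per coordinate, discard the f
    smallest and f largest values across workers, then average the
    rest. One example of an (f, lambda)-resilient averaging rule."""
    stacked = np.sort(np.stack(vectors), axis=0)  # sort each coordinate over workers
    return stacked[f:len(vectors) - f].mean(axis=0)

def resilient_momentum_sgd(theta, grad_fns, steps, gamma=0.75, beta=0.9, f=3):
    """Sketch of the Algorithm 1 loop: each worker keeps a local
    momentum of its stochastic gradients; the server aggregates the
    momentums with the resilient rule F and takes a step. Byzantine
    workers would send arbitrary vectors instead of their momentums
    (not simulated here)."""
    momentums = [np.zeros_like(theta) for _ in grad_fns]
    for _ in range(steps):
        for i, grad_fn in enumerate(grad_fns):
            g = grad_fn(theta)  # worker i's stochastic gradient at the current model
            momentums[i] = beta * momentums[i] + (1 - beta) * g
        theta = theta - gamma * trimmed_mean(momentums, f)
    return theta
```

For instance, with `grad_fns` built from 25 noisy gradient estimators of a simple quadratic, the trimmed mean tolerates up to f adversarially replaced momentum vectors per step; that tolerance is exactly what the (f, λ)-resilience property formalizes.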
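For reference, the Experiment Setup row can be restated as a structured configuration. The values are taken verbatim from the quoted text; the dictionary layout and key names are our own, and the paper's elided details (the "..." spans) are not filled in.

```python
# Hyperparameters quoted in the Experiment Setup row; key names are
# illustrative, values are the paper's.
EXPERIMENT_SETUP = {
    "mnist_fashion_mnist": {
        "learning_rate": 0.75,      # gamma (constant)
        "clipping": 2.0,            # C
        "l2_regularization": 1e-4,
        "batch_size": 25,           # b
    },
    "cifar10": {
        "num_workers": 25,          # n
        "learning_rate": 0.25,      # gamma
        "clipping": 5.0,            # C
        "batch_size": 50,           # b
    },
    "momentum_coefficients": [0, 0.6, 0.8, 0.9, 0.99, 0.999],  # beta sweep
}
```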