Distributed Momentum for Byzantine-resilient Stochastic Gradient Descent

Authors: El Mahdi El Mhamdi, Rachid Guerraoui, Sébastien Rouault

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We assess the effectiveness of our method over 736 different training configurations, comprising the 2 state-of-the-art attacks and 6 defenses. For confidence and reproducibility purposes, each configuration is run 5 times with specified seeds (1 to 5), totalling 3680 runs. (A sketch of this run grid appears after this table.)
Researcher Affiliation | Academia | El-Mahdi El-Mhamdi, École Polytechnique, France, el-mahdi.el-mhamdi@polytechnique.edu; Rachid Guerraoui, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland, rachid.guerraoui@epfl.ch; Sébastien Rouault, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland, sebastien.rouault@epfl.ch
Pseudocode | No | The paper describes the methods and formulations but does not include any explicit pseudocode blocks or figures labeled 'Algorithm' or 'Pseudocode'.
Open Source Code | Yes | We provide our code along with a script reproducing all of our results, both the experiments and the graphs, in one command. Details, including software and hardware dependencies, are available in Section C. Our contributed code is available at https://github.com/LPD-EPFL/ByzantineMomentum, or as a ZIP archive from OpenReview (https://openreview.net/forum?id=H8UHdhWG6A3).
Open Datasets | Yes | Datasets: MNIST and Fashion MNIST (83 samples/gradient); CIFAR-10 and CIFAR-100 (50 samples/gradient). ... Datasets are pre-processed before training. MNIST receives the same pre-processing as in Baruch et al. (2019): an input image normalization with mean 0.1307 and standard deviation 0.3081. Fashion MNIST, CIFAR-10 and CIFAR-100 are all expanded with horizontally flipped images. For both CIFAR-10 and CIFAR-100, a per-channel normalization with means 0.4914, 0.4822, 0.4465 and standard deviations 0.2023, 0.1994, 0.2010 (Liu, 2019) has been applied. (A torchvision sketch of this pre-processing appears after this table.)
Dataset Splits | No | The paper evaluates on the test set ('top-1 cross-accuracy over the whole test set') but does not specify training, validation, and test splits (e.g., percentages or exact counts) for the datasets used in its experiments. It references existing works for the datasets but does not define how splits were made for its specific experimental setup beyond implying a test set.
Hardware Specification | Yes | Hardware dependencies. We list below the hardware components used: 1× Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz; 2× Nvidia GeForce GTX 1080 Ti; 64 GB of RAM.
Software Dependencies | Yes | Software dependencies. Python 3.7.3 has been used, over several GNU/Linux distributions (Debian 10, Ubuntu 18). Besides the standard libraries associated with Python 3.7.3, our scripts also depend on: numpy 1.19.1, torch 1.6.0, torchvision 0.7.0, pandas 1.1.0, matplotlib 3.0.2, PIL 7.2.0, requests 2.21.0, urllib3 1.24.1, chardet 3.0.4, certifi 2018.08.24, idna 2.6, six 1.15.0, pytz 2020.1, dateutil 2.8.1, pyparsing 2.2.0, cycler 0.10.0, kiwisolver 1.0.1, cffi 1.13.2.
Experiment Setup | Yes | Our experiments cover 2 models, 4 datasets, the 6 studied defenses under each of the 2 state-of-the-art attacks, different fractions of Byzantine workers (either half or a quarter), using Nesterov instead of classical momentum, plus unattacked settings where each worker is honest and the GAR is mere averaging. ... For model training, we use the negative log-likelihood loss and respectively 10⁻⁴ and 10⁻² ℓ2-regularization for the fully connected and convolutional models. We also clip gradients, ensuring their norms remain respectively below 2 and 5 for the fully connected and convolutional models. (A PyTorch sketch of these settings appears after this table.)
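
The run count under Research Type follows from the grid of 736 configurations, each repeated with seeds 1 to 5. A minimal sketch, where the configuration tuples are placeholders rather than the paper's actual (model, dataset, attack, defense, ...) grid:

```python
# Enumerate the run grid: 736 configurations, each run with seeds 1..5.
from itertools import product

configurations = range(736)  # placeholder for the paper's actual configuration tuples
seeds = range(1, 6)          # specified seeds 1 to 5

runs = list(product(configurations, seeds))
assert len(runs) == 3680     # 736 configurations x 5 seeds = 3680 runs
```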
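
The pre-processing quoted under Open Datasets maps directly onto standard torchvision transforms. A minimal sketch using the constants from the paper; the paper describes expanding the datasets with horizontally flipped copies, and a random horizontal flip is used below as a common stand-in for that expansion:

```python
# Pre-processing described in the paper, expressed with torchvision transforms.
import torchvision.transforms as T

# MNIST: same pre-processing as in Baruch et al. (2019).
mnist_transform = T.Compose([
    T.ToTensor(),
    T.Normalize((0.1307,), (0.3081,)),      # input image mean / standard deviation
])

# CIFAR-10 / CIFAR-100: horizontal-flip expansion plus per-channel normalization.
cifar_transform = T.Compose([
    T.RandomHorizontalFlip(),               # stand-in for the flipped-image expansion
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465),   # per-channel means (Liu, 2019)
                (0.2023, 0.1994, 0.2010)),  # per-channel standard deviations
])
```

Per the excerpt, Fashion MNIST also gets the flip expansion, but only CIFAR-10 and CIFAR-100 receive the per-channel normalization.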
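
Likewise, the Experiment Setup row corresponds to standard PyTorch calls. A minimal sketch, assuming a placeholder model and an arbitrary learning rate and momentum value (neither is given in this excerpt); only the loss, the per-model ℓ2-regularization (expressed as weight decay), and the gradient-clipping norms come from the paper:

```python
# Training settings reported in the paper, sketched for the convolutional case.
import torch

convolutional = True  # switch between the convolutional and fully connected settings
weight_decay = 1e-2 if convolutional else 1e-4  # reported l2-regularization
max_norm = 5.0 if convolutional else 2.0        # reported gradient-clipping norms

model = torch.nn.Sequential(                    # placeholder model, not the paper's
    torch.nn.Flatten(),
    torch.nn.Linear(32 * 32 * 3, 10),
    torch.nn.LogSoftmax(dim=1),                 # NLLLoss expects log-probabilities
)
criterion = torch.nn.NLLLoss()                  # negative log-likelihood loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9,
                            nesterov=True,      # the paper also tests Nesterov momentum
                            weight_decay=weight_decay)

def training_step(inputs, targets):
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    # Clip gradient norms to stay below the reported per-model threshold.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return loss.item()
```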