Byzantine Stochastic Gradient Descent
Authors: Dan Alistarh, Zeyuan Allen-Zhu, Jerry Li
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of m machines which allegedly compute stochastic gradients every iteration, an α-fraction are Byzantine, and may behave adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds ε-approximate minimizers of convex functions in T = Õ(1/(ε²m) + α²/ε²) iterations (this bound is restated after the table). Further, we provide a lower bound showing that, up to logarithmic factors, our algorithm is information-theoretically optimal both in terms of sample complexity and time complexity. |
| Researcher Affiliation | Collaboration | Dan Alistarh (IST Austria, dan.alistarh@ist.ac.at); Zeyuan Allen-Zhu (Microsoft Research AI, zeyuan@csail.mit.edu); Jerry Li (Simons Institute, jerryzli@berkeley.edu) |
| Pseudocode | Yes | Algorithm 1 ByzantineSGD( , x1, D, T, TA, TB); an illustrative sketch of a robust SGD step is given after the table. |
| Open Source Code | No | The paper does not provide concrete access to source code or explicitly state its release. |
| Open Datasets | No | The paper is theoretical and does not use or mention any specific datasets. |
| Dataset Splits | No | The paper is theoretical and does not mention any dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running experiments, as it is a theoretical work. |
| Software Dependencies | No | The paper does not provide any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper does not describe specific experimental setup details such as hyperparameters or training configurations, as it is a theoretical work. |
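
For readability, the iteration bound quoted in the abstract cell above can be written out with its symbols made explicit (m machines, Byzantine fraction α, target accuracy ε). This is a transcription of the abstract's claim, not an independent derivation.

```latex
% Iteration complexity claimed in the abstract: with m machines, an
% \alpha-fraction of which may be Byzantine, the proposed SGD variant
% finds an \varepsilon-approximate minimizer of a convex function in
T \;=\; \tilde{O}\!\left(\frac{1}{\varepsilon^{2} m} + \frac{\alpha^{2}}{\varepsilon^{2}}\right)
\quad \text{iterations.}
```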
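
Since the paper provides pseudocode (Algorithm 1) but no released implementation, the snippet below is a minimal, hypothetical sketch of a single Byzantine-robust SGD step in the setting the abstract describes. It deliberately substitutes a coordinate-wise median for the paper's actual filtering rule (which accumulates per-machine statistics across iterations and removes machines whose reports violate concentration bounds), so it illustrates the problem setup rather than the authors' algorithm; the function name `robust_sgd_step` and all constants are assumptions.

```python
import numpy as np

def robust_sgd_step(x, worker_gradients, lr):
    """One SGD step with a simple robust aggregation rule.

    NOTE: illustrative stand-in only. A coordinate-wise median replaces the
    paper's ByzantineSGD filtering, which tracks per-machine statistics over
    time and discards machines whose reports deviate beyond concentration
    bounds.
    """
    grads = np.asarray(worker_gradients)  # shape (m, d): one reported gradient per machine
    agg = np.median(grads, axis=0)        # robust aggregate across the m machines
    return x - lr * agg

# Hypothetical usage: m = 8 machines, 2 of them Byzantine, minimizing f(x) = ||x||^2 / 2.
rng = np.random.default_rng(0)
m, num_byzantine, d = 8, 2, 5
x = rng.normal(size=d)
for _ in range(200):
    honest = [x + rng.normal(scale=0.1, size=d) for _ in range(m - num_byzantine)]  # stochastic gradients of f
    adversarial = [rng.normal(scale=100.0, size=d) for _ in range(num_byzantine)]   # arbitrary Byzantine reports
    x = robust_sgd_step(x, honest + adversarial, lr=0.05)
print(np.linalg.norm(x))  # stays small despite the Byzantine machines
```

A coordinate-wise median is only one common robust aggregator; the paper's contribution is a history-based filtering scheme shown to achieve the near-optimal sample and time complexity stated above.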