Zeno++: Robust Fully Asynchronous SGD
Authors: Cong Xie, Sanmi Koyejo, Indranil Gupta
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that Zeno++ outperforms existing Byzantine-tolerant asynchronous SGD algorithms. We conduct experiments on two benchmarks: the CIFAR-10 image classification dataset (Krizhevsky, 2009) and the WikiText-2 language modeling dataset (Merity et al., 2017). Our empirical results show good performance compared to previous work. |
| Researcher Affiliation | Academia | 1Department of Computer Science, University of Illinois, Urbana-Champaign, USA. |
| Pseudocode | Yes | Algorithm 1 Zeno+. Algorithm 2 Zeno++. |
| Open Source Code | No | The detailed network architecture can be found in our submitted source code (will be released upon publication). |
| Open Datasets | Yes | We conduct experiments on two benchmarks: the CIFAR-10 image classification dataset (Krizhevsky, 2009) and the WikiText-2 language modeling dataset (Merity et al., 2017). |
| Dataset Splits | Yes | From the training set, we randomly extracted 2.5k samples as the validation set for Zeno++; the remainder was randomly partitioned across all the workers (a split sketch follows the table). |
| Hardware Specification | No | The paper does not specify the hardware used; it only acknowledges donated compute: "This work was funded in part by the following grants: NSF IIS 1909577, NSF CNS 1908888, and a JP Morgan Chase Fellowship, along with computational resources donated by Intel, AWS, and Microsoft Azure." |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For the CIFAR-10 experiments: learning rate γ = 0.1, mini-batch size n = ns = 128, ρ = 0.002, ϵ = 0.1, k = 10. For the WikiText-2 experiments: learning rate γ = 20, mini-batch size n = ns = 20, k = kw = 10, ρ = 10, ϵ = 2. (An acceptance-test sketch using the CIFAR-10 values follows the table.) |
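
For concreteness, the split described in the Dataset Splits row can be realized in a few lines of NumPy. This is a minimal sketch under our assumptions: 10 simulated workers and the 50k-sample CIFAR-10 training set, neither of which is stated in the row itself, and `split_training_set` is a hypothetical helper name, not code from the paper.

```python
import numpy as np

def split_training_set(num_train=50000, val_size=2500, num_workers=10, seed=0):
    """Hold out a validation set for Zeno++ and shard the rest across workers.

    num_train   : training-set size (50000 for CIFAR-10)
    val_size    : 2.5k validation samples, per the Dataset Splits row
    num_workers : ASSUMPTION -- worker count is not given in the table
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(num_train)
    val_idx = idx[:val_size]                                   # server-side validation set
    worker_shards = np.array_split(idx[val_size:], num_workers)  # one random shard per worker
    return val_idx, worker_shards
```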
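
The Pseudocode and Experiment Setup rows reference Zeno++'s server-side gradient filter. Below is a minimal sketch of the acceptance test as we read Algorithm 2: the candidate gradient g is rescaled to match the magnitude of the server's validation gradient v, then accepted when the first-order descendant score γ⟨v, g⟩ − ρ‖g‖² clears the slack −γϵ. The rescaling step, the exact score, and the name `zeno_pp_accept` are our reconstruction and should be checked against the paper; the defaults ρ = 0.002 and ϵ = 0.1 are the CIFAR-10 values from the setup row.

```python
import numpy as np

def zeno_pp_accept(g, v, gamma=0.1, rho=0.002, eps=0.1):
    """Sketch of the Zeno++ score test (our reading of Algorithm 2).

    g     : candidate gradient from an (untrusted) worker
    v     : validation gradient computed by the server on its held-out set
    gamma : server learning rate
    rho   : weight of the update-magnitude penalty
    eps   : slack allowed in the first-order descent estimate
    """
    g = np.asarray(g, dtype=float)
    v = np.asarray(v, dtype=float)
    g_norm = np.linalg.norm(g)
    if g_norm == 0.0:
        return False, g  # a zero gradient carries no usable signal
    g = g * (np.linalg.norm(v) / g_norm)  # rescale g to match the magnitude of v
    # Reward agreement with the validation gradient, penalize large updates.
    score = gamma * np.dot(v, g) - rho * np.dot(g, g)
    return score >= -gamma * eps, g
```

Under this reading, the server applies x ← x − γg only for gradients that pass the test; rejected gradients are simply dropped.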