Minibatch vs Local SGD for Heterogeneous Distributed Learning
Authors: Blake E. Woodworth, Kumar Kshitij Patel, Nati Srebro
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evidence: Finally, while Theorem 2 proves that Local SGD is worse than Minibatch SGD unless ζ² is very small in the worst case, one might hope that for normal heterogeneous problems, Local SGD might perform better than its worst case error suggests. However, a simple binary logistic regression experiment on MNIST indicates that this behavior likely extends significantly beyond the worst case. The results, depicted in Figure 1, show that Local SGD performs worse than Minibatch SGD unless both ζ is very small and K is large. |
| Researcher Affiliation | Academia | Blake Woodworth, Toyota Technological Institute at Chicago (blake@ttic.edu); Kumar Kshitij Patel, Toyota Technological Institute at Chicago (kkpatel@ttic.edu); Nathan Srebro, Toyota Technological Institute at Chicago (nati@ttic.edu) |
| Pseudocode | Yes | Algorithmic details including pseudo-code are provided in Appendix C.2. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | Figure 1: Binary logistic regression between even vs odd digits of MNIST. |
| Dataset Splits | No | The paper mentions using MNIST for experiments but does not provide specific details on train/validation/test splits, percentages, or how the data was partitioned for validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For both algorithms, we used the best fixed stepsize for each choice of K, R, and ζ individually. Additional details are provided in Appendix F. |
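
Since no source code is released, the comparison described in the table can be illustrated with a minimal sketch. It is not the authors' implementation: it uses synthetic data instead of MNIST, and the helper `make_heterogeneous_data`, the way `zeta` perturbs each machine's labelling rule, and the fixed stepsize `lr` are all assumptions made for illustration. Here M machines each compute K stochastic gradients per round over R rounds. Minibatch SGD evaluates all of them at the shared iterate, while Local SGD takes K local steps on each machine and then averages the iterates.

```python
import numpy as np

def make_heterogeneous_data(M, n, d, zeta, rng):
    """Hypothetical generator: machine m labels its data with w_star + zeta * noise."""
    w_star = rng.normal(size=d)
    data = []
    for _ in range(M):
        w_m = w_star + zeta * rng.normal(size=d)   # larger zeta = more heterogeneity
        X = rng.normal(size=(n, d))
        y = (X @ w_m > 0).astype(float)            # binary labels in {0, 1}
        data.append((X, y))
    return data

def logistic_grad(w, X, y):
    """Gradient of the mean logistic loss over the rows of X."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p - y) / len(y)

def sample(rng, X, y, size):
    """Draw `size` examples with replacement from one machine's data."""
    idx = rng.integers(0, len(y), size=size)
    return X[idx], y[idx]

def minibatch_sgd(data, K, R, lr, rng):
    """R updates; each averages K gradients per machine, all at the shared iterate."""
    w = np.zeros(data[0][0].shape[1])
    for _ in range(R):
        g = np.zeros_like(w)
        for X, y in data:
            Xb, yb = sample(rng, X, y, K)
            g += logistic_grad(w, Xb, yb)
        w = w - lr * g / len(data)
    return w

def local_sgd(data, K, R, lr, rng):
    """R rounds; each machine takes K single-sample steps, then iterates are averaged."""
    w = np.zeros(data[0][0].shape[1])
    for _ in range(R):
        local_iterates = []
        for X, y in data:
            v = w.copy()
            for _ in range(K):
                Xb, yb = sample(rng, X, y, 1)
                v = v - lr * logistic_grad(v, Xb, yb)
            local_iterates.append(v)
        w = np.mean(local_iterates, axis=0)
    return w
```

With K = 1 the two procedures coincide, which gives a quick sanity check. A faithful reproduction would additionally search over fixed stepsizes separately for each choice of K, R, and ζ, as the experiment setup row describes.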