A Stochastic Newton Algorithm for Distributed Convex Optimization
Authors: Brian Bullins, Kshitij Patel, Ohad Shamir, Nathan Srebro, Blake E. Woodworth
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5, we compare a more practical version of our method, FEDSN-LITE (Algorithm 6), against the other methods, showing we can significantly reduce communication compared to other first-order methods. ... In our experiments in Figure 6, we notice that FEDSN-LITE is either competitive with or outperforms the other baselines. This is especially true for the sparse communication settings, which are of most practical interest. |
| Researcher Affiliation | Academia | Brian Bullins Toyota Technological Institute at Chicago bbullins@ttic.edu Kumar Kshitij Patel Toyota Technological Institute at Chicago kkpatel@ttic.edu Ohad Shamir Weizmann Institute of Science ohad.shamir@weizmann.ac.il Nathan Srebro Toyota Technological Institute at Chicago nati@ttic.edu Blake Woodworth Toyota Technological Institute at Chicago blake@ttic.edu |
| Pseudocode | Yes | Algorithm 1 FEDERATED-STOCHASTIC-NEWTON, a.k.a., FEDSN(x0) |
| Open Source Code | Yes | Code is available at https://github.com/kishinmh/Inexact-Newton. |
| Open Datasets | Yes | Empirical comparison of FEDSN-LITE (Algorithm 6) to other methods (see Appendix G.1) on the LIBSVM a9a (Chang and Lin, 2011; Dua and Graff, 2017) dataset |
| Dataset Splits | Yes | For all algorithms, we use the default 80/20 train/test split of the LIBSVM a9a dataset as provided by LIBSVMtools. We use a 10% held-out set from the training set as the validation set for tuning hyperparameters. |
| Hardware Specification | No | The paper mentions running experiments on 'M parallel machines' but does not provide specific hardware details such as CPU or GPU models, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow, or specific LIBSVM version used). |
| Experiment Setup | Yes | For all algorithms, we tune the learning rate from the set {0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10} and batch size from the set {1, 2, 4, 8, 16, 32, 64, 128, 256}. For algorithms with momentum (Minibatch SGD, Local SGD, FEDAC, and FEDSN-LITE), we tune the momentum parameter from the set {0.1, 0.3, 0.5, 0.7, 0.9}. |
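The Dataset Splits row above reports an 80/20 train/test split of the LIBSVM a9a dataset with a further 10% of the training set held out for validation. Below is a minimal sketch of that preparation, assuming scikit-learn; the file path `a9a.libsvm` and the random seed are hypothetical and not taken from the paper or its repository.

```python
# Minimal sketch of the data preparation described in the Dataset Splits row.
# Assumes scikit-learn; the file path "a9a.libsvm" and the random seed are
# hypothetical, not taken from the paper or its repository.
from sklearn.datasets import load_svmlight_file
from sklearn.model_selection import train_test_split

# Load the a9a features and labels from a file in LIBSVM (svmlight) format.
X, y = load_svmlight_file("a9a.libsvm")

# 80/20 train/test split. The paper uses the default split provided with a9a;
# re-splitting here only illustrates the proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hold out 10% of the training set as a validation set for hyperparameter tuning.
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.1, random_state=0
)

print(X_tr.shape, X_val.shape, X_test.shape)
```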
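The Experiment Setup row lists the tuning grids for learning rate, batch size, and momentum. The sketch below only enumerates that search space; `run_trial` is a hypothetical stand-in for training one configuration of a given algorithm (e.g. Minibatch SGD, Local SGD, FEDAC, or FEDSN-LITE) and returning its validation loss, not a function from the paper's code.

```python
# Sketch of the hyperparameter grids reported in the Experiment Setup row.
from itertools import product

LEARNING_RATES = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10]
BATCH_SIZES = [1, 2, 4, 8, 16, 32, 64, 128, 256]
MOMENTA = [0.1, 0.3, 0.5, 0.7, 0.9]  # tuned only for the momentum-based methods

def grid_search(run_trial, use_momentum):
    """Return the configuration with the lowest validation loss.

    `run_trial` is a hypothetical callable: it trains one configuration and
    returns its loss on the held-out validation set.
    """
    momenta = MOMENTA if use_momentum else [None]
    best_cfg, best_loss = None, float("inf")
    for lr, bs, mom in product(LEARNING_RATES, BATCH_SIZES, momenta):
        loss = run_trial(lr=lr, batch_size=bs, momentum=mom)
        if loss < best_loss:
            best_cfg, best_loss = (lr, bs, mom), loss
    return best_cfg, best_loss
```

Exhaustive grid search over these sets is one plausible reading of "we tune ... from the set"; the paper does not state whether the search was exhaustive or partial.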