A Stochastic Newton Algorithm for Distributed Convex Optimization

Authors: Brian Bullins, Kshitij Patel, Ohad Shamir, Nathan Srebro, Blake E. Woodworth

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Section 5, we compare a more practical version of our method, FEDSN-LITE (Algorithm 6) against the other methods, showing we can significantly reduce communication compared to other first-order methods. ... In our experiments in Figure 6, we notice that FEDSN-LITE is either competitive with or outperforms the other baselines. This is especially true for the sparse communication settings, which are of most practical interest.
Researcher Affiliation | Academia | Brian Bullins (Toyota Technological Institute at Chicago, bbullins@ttic.edu); Kumar Kshitij Patel (Toyota Technological Institute at Chicago, kkpatel@ttic.edu); Ohad Shamir (Weizmann Institute of Science, ohad.shamir@weizmann.ac.il); Nathan Srebro (Toyota Technological Institute at Chicago, nati@ttic.edu); Blake Woodworth (Toyota Technological Institute at Chicago, blake@ttic.edu)
Pseudocode | Yes | Algorithm 1 FEDERATED-STOCHASTIC-NEWTON, a.k.a., FEDSN(x0)
Open Source Code | Yes | Code is available at https://github.com/kishinmh/Inexact-Newton.
Open Datasets | Yes | Empirical comparison of FEDSN-LITE (Algorithm 6) to other methods (see Appendix G.1) on the LIBSVM a9a (Chang and Lin, 2011; Dua and Graff, 2017) dataset
Dataset Splits | Yes | For all algorithms, we use the default 80/20 train/test split of the LIBSVM a9a dataset as provided by LIBSVM tools. We use a 10% held-out set from the training set as the validation set for tuning hyperparameters. (A minimal split sketch appears after this table.)
Hardware Specification | No | The paper mentions running experiments on 'M parallel machines' but does not provide specific hardware details such as CPU or GPU models, or memory specifications used for the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, library versions such as PyTorch or TensorFlow, or the specific LIBSVM version used).
Experiment Setup | Yes | For all algorithms, we tune the learning rate from the set {0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10} and batch size from the set {1, 2, 4, 8, 16, 32, 64, 128, 256}. For algorithms with momentum (Minibatch SGD, Local SGD, FEDAC, and FEDSN-LITE), we tune the momentum parameter from the set {0.1, 0.3, 0.5, 0.7, 0.9}. (A grid-search sketch appears after this table.)
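The Dataset Splits row describes the data preparation only in prose. Below is a minimal sketch of that preparation, assuming Python with scikit-learn; the file names a9a and a9a.t are the standard LIBSVM tools downloads, and nothing here is taken from the authors' Inexact-Newton repository.

```python
from sklearn.datasets import load_svmlight_file
from sklearn.model_selection import train_test_split

# LIBSVM tools distribute a9a already split into training ("a9a") and test
# ("a9a.t") files, which is the "default 80/20 train/test split" quoted above.
X_train, y_train = load_svmlight_file("a9a")
X_test, y_test = load_svmlight_file("a9a.t", n_features=X_train.shape[1])

# Hold out 10% of the training set as a validation set for hyperparameter tuning.
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.1, random_state=0
)

print("train:", X_tr.shape, "val:", X_val.shape, "test:", X_test.shape)
```

Fixing random_state keeps the 10% validation hold-out identical across tuning runs, so every hyperparameter configuration is scored on the same data.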
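Similarly, the Experiment Setup row lists the tuning grids but not how they are enumerated. The sketch below simply materializes those grids; the selection step and the validation_loss function are hypothetical, not the authors' code.

```python
from itertools import product

# Grids quoted in the Experiment Setup row.
learning_rates = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10]
batch_sizes = [1, 2, 4, 8, 16, 32, 64, 128, 256]
momentums = [0.1, 0.3, 0.5, 0.7, 0.9]  # Minibatch SGD, Local SGD, FEDAC, FEDSN-LITE only

def configs(use_momentum):
    """Yield one hyperparameter setting per grid point."""
    mom_values = momentums if use_momentum else [None]
    for lr, bs, mom in product(learning_rates, batch_sizes, mom_values):
        yield {"lr": lr, "batch_size": bs, "momentum": mom}

# Hypothetical selection step: pick the setting with the lowest validation loss,
# where validation_loss would train on (X_tr, y_tr) and score on (X_val, y_val).
# best = min(configs(use_momentum=True), key=validation_loss)
```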