reproducibilityindex.ai

ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!

Authors: Konstantin Mishchenko, Grigory Malinovsky, Sebastian Stich, Peter Richtarik

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To test the performance of algorithms and illustrate theoretical results, we use classical logistic regression problem. In our experiments, we have two settings: deterministic (Figure 1) and stochastic problems (Figure 2).
Researcher Affiliation	Academia	¹CNRS, ENS, Inria Sierra, Paris, France; ²Computer Science, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia; ³CISPA Helmholtz Center for Information Security, Saarbrücken, Germany.
Pseudocode	Yes	Algorithm 1 ProxSkip ... Algorithm 2 Scaffnew: Application of ProxSkip to Federated Learning (i.e., to problem (6)–(7)) ... Algorithm 3 Decentralized Scaffnew ... Algorithm 4 SProxSkip (Stochastic gradient version of ProxSkip) ... Algorithm 5 SplitSkip
Open Source Code	Yes	Our code is available on GitHub: https://github.com/alarcoelectro/ProxSkip-Public
Open Datasets	Yes	We use the w8a dataset from LIBSVM library (Chang & Lin, 2011).
Dataset Splits	No	The paper mentions using the "w8a dataset from LIBSVM library" but does not provide explicit details about how it was split into training, validation, or test sets (e.g., percentages, sample counts, or a specific split methodology).
Hardware Specification	Yes	All methods were evaluated on a workstation with an Intel(R) Xeon(R) Gold 6146 CPU at 3.20GHz with 24 cores.
Software Dependencies	No	We implemented all algorithms in Python using the package RAY (Moritz et al., 2018). While the paper names Python and the RAY package, it does not give version numbers for either, which reproducibility requires.
Experiment Setup	Yes	We set the regularization parameter λ = 10⁻⁴ L, where L is the smoothness constant. ... The number of local steps is set to be κ̂, where κ̂ = L/µ̂ is the estimated condition number. ... Our theory predicted that the choice p = 1/√κ is optimal, which is close to the experiments results. Finally Figure 2 (c), we compared Scaffnew in stochastic case with different number of clients M.
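
To make the Pseudocode and Experiment Setup rows more concrete, here is a minimal single-process NumPy sketch of Scaffnew (Algorithm 2) on L2-regularized logistic regression. It is not the authors' RAY implementation. The function names `local_gradients` and `scaffnew` are hypothetical, random synthetic data stands in for w8a, and the stepsize γ = 1/L is an assumed choice taken from the theory rather than from the table above. The communication probability is set to p = 1/√κ, the value the paper's theory identifies as optimal. The demo uses λ = 10⁻² L instead of the paper's 10⁻⁴ L purely so that it converges in a few thousand iterations.

```python
import numpy as np


def local_gradients(X, A, b, lam):
    """Gradients of every client's regularized logistic loss, computed at once.

    X: (M, d) one iterate per client; A: (M, n, d) features; b: (M, n) labels in {-1, +1}.
    """
    margins = b * np.einsum("mnd,md->mn", A, X)
    # sigma(-t) = 1 / (1 + exp(t)), evaluated stably through logaddexp
    weights = -b * np.exp(-np.logaddexp(0.0, margins))
    return np.einsum("mnd,mn->md", A, weights) / A.shape[1] + lam * X


def scaffnew(A, b, lam_factor=1e-4, num_iters=1000, seed=0):
    """Sketch of Scaffnew: local gradient steps with randomly skipped averaging."""
    rng = np.random.default_rng(seed)
    M, n, d = A.shape
    # Smoothness of the unregularized logistic loss: max_i ||A_i||_2^2 / (4 n)
    L_data = max(np.linalg.norm(A[i], 2) ** 2 for i in range(M)) / (4 * n)
    lam = lam_factor * L_data       # regularization, lambda = lam_factor * L
    L = L_data + lam                # smoothness including the regularizer
    kappa = L / lam                 # upper bound on the condition number
    gamma = 1.0 / L                 # assumed stepsize
    p = 1.0 / np.sqrt(kappa)        # communication probability

    X = np.zeros((M, d))            # local iterates x_i
    H = np.zeros((M, d))            # control variates h_i (they always sum to zero)
    for _ in range(num_iters):
        X_hat = X - gamma * (local_gradients(X, A, b, lam) - H)
        if rng.random() < p:
            # Communication round: the prox step reduces to averaging
            X = np.tile(X_hat.mean(axis=0), (M, 1))
        else:
            # Skipped round: keep the local iterates
            X = X_hat
        H = H + (p / gamma) * (X - X_hat)
    return X.mean(axis=0), lam, p


# Synthetic stand-in for w8a (assumption): 4 clients, 50 samples each, 10 features.
data_rng = np.random.default_rng(0)
A = data_rng.standard_normal((4, 50, 10))
b = data_rng.choice([-1.0, 1.0], size=(4, 50))

x, lam, p = scaffnew(A, b, lam_factor=1e-2, num_iters=4000)
full_grad = local_gradients(np.tile(x, (4, 1)), A, b, lam).mean(axis=0)
print("communication probability p:", p)
print("norm of the global gradient:", np.linalg.norm(full_grad))
```

With p = 1/√κ, the clients average their iterates on only about a 1/√κ fraction of the iterations, which is the source of the communication saving the paper analyzes. The control variates `H` correct for client drift between averaging rounds. Reproducing the reported figures would still require the w8a data, the paper's λ = 10⁻⁴ L, and the authors' released code.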