Proximal and Federated Random Reshuffling

Authors: Konstantin Mishchenko, Ahmed Khaled, Peter Richtarik

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we corroborate our results with experiments on real data sets. 6. Experiments Our code is available on GitHub: https://github.com/konstmish/rr_prox_fed. More experimental details are in the appendix. ProxRR vs SGD. In Figure 1, we look at the logistic regression loss with the elastic net regularization, ... We set minibatch sizes to 32 for all methods and use theoretical stepsizes, without any tuning.
Researcher Affiliation | Academia | (1) CNRS, DI ENS, Inria; (2) Princeton University; (3) KAUST. Correspondence to: Konstantin Mishchenko <konsta.mish@gmail.com>.
Pseudocode | Yes | Algorithm 1 Proximal Random Reshuffling (ProxRR) and Shuffle-Once (ProxSO) and Algorithm 2 Federated Random Reshuffling (FedRR) and Algorithm 3 Proximal SGD (in Appendix C).
Open Source Code | Yes | Our code is available on GitHub: https://github.com/konstmish/rr_prox_fed.
Open Datasets | Yes | We use the w8a dataset for the experiment with ℓ1 regularization. The datasets were downloaded from LibSVM https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html
Dataset Splits | No | The paper mentions using w8a and a9a datasets and describes how heterogeneous data was created for a9a, but does not provide explicit training, validation, or test dataset splits or proportions.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or detailed specifications of the machines used for experiments.
Software Dependencies | No | The paper mentions using 'Ray package' for parallelization but does not specify its version number or any other software dependencies with version details.
Experiment Setup | Yes | We set minibatch sizes to 32 for all methods and use theoretical stepsizes, without any tuning. For FedRR, the initial stepsize was 1/L in the i.i.d. regime and 1/(L*n) in the heterogeneous regime. As per Theorem 3 in (Khaled et al., 2020), the stepsizes for Local SGD must satisfy γ_t = O(1/(LH)), where H is the number of local steps, a similar result holds for Scaffold (Karimireddy et al., 2020). ... For all methods, the local workers used minibatch size 16.
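
For readers unfamiliar with the method named in the Pseudocode row, the following is a minimal Python sketch of proximal random reshuffling applied to ℓ1-regularized logistic regression. It is not the authors' implementation (that lives in the linked repository). It assumes the following structure: one data permutation per epoch, plain gradient steps on the individual losses in permuted order, and a single proximal step at the end of each epoch with the stepsize scaled by the number of samples n. The function names (prox_rr, soft_threshold, logistic_grad), the toy data, the stepsize value, and the use of single-sample steps instead of the minibatches of 32 from the Experiment Setup row are all illustrative assumptions.

```python
import math
import random


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1, applied coordinate-wise."""
    return [math.copysign(max(abs(v) - tau, 0.0), v) for v in x]


def logistic_grad(x, a, b):
    """Gradient of log(1 + exp(-b * <a, x>)) for a label b in {-1, +1}."""
    margin = b * sum(ai * xi for ai, xi in zip(a, x))
    # Numerically stable evaluation of sigmoid(-margin).
    if margin >= 0:
        e = math.exp(-margin)
        s = e / (1.0 + e)
    else:
        s = 1.0 / (1.0 + math.exp(margin))
    return [-b * s * ai for ai in a]


def prox_rr(A, b, lam, stepsize, epochs, seed=0, shuffle_once=False):
    """Sketch of proximal random reshuffling for
    (1/n) * sum_i logistic_loss_i(x) + lam * ||x||_1.

    With shuffle_once=True the initial permutation is reused in every
    epoch (the shuffle-once variant).
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    perm = list(range(n))
    rng.shuffle(perm)
    for _ in range(epochs):
        if not shuffle_once:
            rng.shuffle(perm)  # fresh permutation each epoch
        for i in perm:
            g = logistic_grad(x, A[i], b[i])
            x = [xi - stepsize * gi for xi, gi in zip(x, g)]
        # Assumed structure: one prox per epoch, with stepsize * n.
        x = soft_threshold(x, stepsize * n * lam)
    return x


if __name__ == "__main__":
    # Hypothetical toy data: four samples, two features.
    A = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.8, -0.2]]
    b = [1, 1, -1, -1]
    print(prox_rr(A, b, lam=0.01, stepsize=0.1, epochs=20))
```

Applying the proximal operator once per epoch, rather than after every gradient step, keeps the number of prox evaluations independent of n, which matters when the prox is expensive. The stepsize here is hand-picked for the toy data; the experiments described above use theoretical stepsizes instead.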