Proximal and Federated Random Reshuffling
Authors: Konstantin Mishchenko, Ahmed Khaled, Peter Richtárik
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we corroborate our results with experiments on real data sets. 6. Experiments. Our code is available on GitHub: https://github.com/konstmish/rr_prox_fed. More experimental details are in the appendix. ProxRR vs SGD. In Figure 1, we look at the logistic regression loss with the elastic net regularization, ... We set minibatch sizes to 32 for all methods and use theoretical stepsizes, without any tuning. |
| Researcher Affiliation | Academia | ¹CNRS, DI ENS, Inria; ²Princeton University; ³KAUST. Correspondence to: Konstantin Mishchenko <konsta.mish@gmail.com>. |
| Pseudocode | Yes | Algorithm 1: Proximal Random Reshuffling (ProxRR) and Shuffle-Once (ProxSO); Algorithm 2: Federated Random Reshuffling (FedRR); Algorithm 3: Proximal SGD (in Appendix C). |
| Open Source Code | Yes | Our code is available on GitHub: https://github.com/konstmish/rr_prox_fed. |
| Open Datasets | Yes | We use the w8a dataset for the experiment with ℓ1 regularization. The datasets were downloaded from LIBSVM: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html |
| Dataset Splits | No | The paper mentions using w8a and a9a datasets and describes how heterogeneous data was created for a9a, but does not provide explicit training, validation, or test dataset splits or proportions. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or detailed specifications of the machines used for experiments. |
| Software Dependencies | No | The paper mentions using 'Ray package' for parallelization but does not specify its version number or any other software dependencies with version details. |
| Experiment Setup | Yes | We set minibatch sizes to 32 for all methods and use theoretical stepsizes, without any tuning. For FedRR, the initial stepsize was 1/L in the i.i.d. regime and 1/(L*n) in the heterogeneous regime. As per Theorem 3 in (Khaled et al., 2020), the stepsizes for Local SGD must satisfy γ_t = O(1/(LH)), where H is the number of local steps; a similar result holds for Scaffold (Karimireddy et al., 2020). ... For all methods, the local workers used minibatch size 16. |
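
The table names the algorithms and stepsizes but does not show what a proximal reshuffling epoch looks like, so the following is a minimal illustrative sketch, not the authors' implementation (their code is in the repository linked above). It assumes an ℓ1 regularizer with weight `lam`, a prox step applied once per epoch with the scaled stepsize `gamma * n`, and a hypothetical one-dimensional toy problem with components f_i(x) = 0.5 (x - a_i)^2. The names `soft_threshold`, `prox_rr_epoch`, `fed_rr_initial_stepsize`, `make_grad`, and `run_toy` are invented for this example. The stepsize helper only restates the 1/L and 1/(L*n) values quoted in the Experiment Setup row.

```python
import random


def soft_threshold(x, tau):
    """Prox of tau * ||x||_1: shrink every coordinate toward zero by tau."""
    return [max(abs(v) - tau, 0.0) * (1.0 if v > 0 else -1.0) for v in x]


def prox_rr_epoch(x, grads, gamma, lam, rng):
    """One reshuffled pass over all n component gradients, then one prox step.

    Assumption: the regularizer is lam * ||x||_1 and the prox is applied once
    per epoch with the scaled stepsize gamma * n.
    """
    n = len(grads)
    order = list(range(n))
    rng.shuffle(order)  # fresh permutation every epoch (random reshuffling)
    for i in order:
        g = grads[i](x)
        x = [xj - gamma * gj for xj, gj in zip(x, g)]
    return soft_threshold(x, gamma * n * lam)


def fed_rr_initial_stepsize(L, n, heterogeneous):
    """Initial FedRR stepsize quoted in the table: 1/L (i.i.d.) or 1/(L*n)."""
    return 1.0 / (L * n) if heterogeneous else 1.0 / L


def make_grad(a):
    # Hypothetical toy component f_i(x) = 0.5 * (x - a)^2, so grad f_i(x) = x - a.
    return lambda x: [x[0] - a]


def run_toy(epochs=500, gamma=0.01, lam=0.5, seed=0):
    """Run the sketch on four toy components with a_i in {1, 2, 3, 4}."""
    rng = random.Random(seed)
    grads = [make_grad(a) for a in (1.0, 2.0, 3.0, 4.0)]
    x = [0.0]
    for _ in range(epochs):
        x = prox_rr_epoch(x, grads, gamma, lam, rng)
    return x
```

For this toy problem the minimizer of the averaged loss plus `lam * |x|` is mean(a_i) - lam = 2.0. With a constant stepsize, `run_toy()` should settle in a small neighborhood of that value rather than exactly on it.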