Federated Minimax Optimization: Improved Convergence Analyses and Algorithms

Authors: Pranay Sharma, Rohan Panda, Gauri Joshi, Pramod Varshney

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present the empirical performance of the algorithms discussed in the previous sections. To evaluate the performance of Local SGDA and Momentum Local SGDA, we consider the problem of fair classification (Mohri et al., 2019; Nouiehed et al., 2019) using the Fashion MNIST dataset (Xiao et al., 2017). Similarly, we evaluate the performance of Local SGDA+ and Momentum Local SGDA+, a momentum-based algorithm (see Algorithm 5 in Appendix F), on a robust neural network training problem (Madry et al., 2018; Sinha et al., 2017), using the CIFAR10 dataset. (A minimal sketch of the local-update-with-periodic-averaging pattern these algorithms share appears after the table.)
Researcher Affiliation | Academia | 1Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA. 2Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY.
Pseudocode | Yes | Algorithm 1 Local SGDA (Deng & Mahdavi, 2021); Algorithm 2 Momentum Local SGDA; Algorithm 3 Local SGD; Algorithm 4 Local SGDA+ (Deng & Mahdavi, 2021); Algorithm 5 Momentum Local SGDA+ (Appendix F)
Open Source Code | No | The paper does not contain an explicit statement about the release of source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | We consider the problem of fair classification (Mohri et al., 2019; Nouiehed et al., 2019) using the Fashion MNIST dataset (Xiao et al., 2017)... using the CIFAR10 dataset.
Dataset Splits | No | The paper mentions data partitioning ('The data is partitioned across the clients using a Dirichlet distribution Dir20(0.1) as in (Wang et al., 2019)') but does not provide explicit training/validation/test dataset splits (e.g., specific percentages or sample counts). (A hedged sketch of this style of Dirichlet partitioning appears after the table.)
Hardware Specification | Yes | We conducted our experiments on a cluster of 20 machines (clients), each equipped with an NVIDIA Titan X GPU.
Software Dependencies | Yes | We implemented our algorithm based on parallel training tools offered by PyTorch 1.0.0 and Python 3.6.3.
Experiment Setup | Yes | Table 3. Parameter values for experiments in Section 5.1: learning rate ηy = 0.02, 2×10⁻³, 2×10⁻⁴; learning rate ηx = 0.016, 1.6×10⁻³, 1.6×10⁻⁴; communication rounds = 150, 75, 75 (the three value columns of Table 3). Batch-size of 32 is used. Momentum parameter 0.9 is used only in Momentum Local SGDA (Algorithm 2) and corresponds to αβ in the pseudocode. (These values are collected into a config sketch after the table.)
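
The Research Type and Pseudocode rows describe Local SGDA and its momentum variant: every client runs simultaneous stochastic gradient descent on x and ascent on y locally, and the server periodically averages the local iterates. The sketch below is a minimal, hedged illustration of that pattern on a toy objective; the client callables, the full-batch gradients, and the exact placement of the momentum buffers are illustrative assumptions and do not reproduce Algorithm 2 line by line (whose momentum parameter the paper writes as αβ).

```python
# Hedged sketch of the local-update / periodic-averaging pattern behind
# Local SGDA and Momentum Local SGDA. Toy objective, full-batch gradients,
# and the exact momentum form are assumptions, not the paper's Algorithm 2.
import torch

def momentum_local_sgda(clients, x0, y0, eta_x=0.016, eta_y=0.02,
                        beta=0.9, tau=10, rounds=150):
    """clients: list of callables f_i(x, y) -> scalar loss for client i."""
    n = len(clients)
    xs = [x0.clone() for _ in range(n)]            # local copies of the min variable
    ys = [y0.clone() for _ in range(n)]            # local copies of the max variable
    mx = [torch.zeros_like(x0) for _ in range(n)]  # momentum buffers
    my = [torch.zeros_like(y0) for _ in range(n)]
    x_avg, y_avg = x0.clone(), y0.clone()
    for _ in range(rounds):
        for _ in range(tau):                       # tau local steps between rounds
            for i, f in enumerate(clients):
                xi = xs[i].detach().requires_grad_(True)
                yi = ys[i].detach().requires_grad_(True)
                gx, gy = torch.autograd.grad(f(xi, yi), (xi, yi))
                mx[i] = beta * mx[i] + (1 - beta) * gx  # beta = 0 gives plain Local SGDA
                my[i] = beta * my[i] + (1 - beta) * gy
                xs[i] = xs[i] - eta_x * mx[i]           # descent on x
                ys[i] = ys[i] + eta_y * my[i]           # ascent on y
        # Communication round: the server averages local iterates (and buffers).
        x_avg = torch.stack(xs).mean(dim=0)
        y_avg = torch.stack(ys).mean(dim=0)
        mx_avg = torch.stack(mx).mean(dim=0)
        my_avg = torch.stack(my).mean(dim=0)
        xs = [x_avg.clone() for _ in range(n)]
        ys = [y_avg.clone() for _ in range(n)]
        mx = [mx_avg.clone() for _ in range(n)]
        my = [my_avg.clone() for _ in range(n)]
    return x_avg, y_avg

# Toy usage: two "clients" holding slightly different convex-concave games.
f1 = lambda x, y: (x * y).sum() + 0.5 * (x ** 2).sum() - 0.5 * (y ** 2).sum()
f2 = lambda x, y: 1.1 * (x * y).sum() + 0.5 * (x ** 2).sum() - 0.5 * (y ** 2).sum()
x_star, y_star = momentum_local_sgda([f1, f2], torch.ones(3), torch.ones(3), rounds=20)
```

Setting beta = 0 recovers plain Local SGDA, and tau controls how many local steps are taken between communication rounds.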
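The Dataset Splits row quotes a Dirichlet Dir20(0.1) partition over 20 clients. A common way to realize such a partition, following the general recipe cited from (Wang et al., 2019), is to draw a Dirichlet(0.1) vector of client proportions for each class and split that class's samples accordingly; the function name dirichlet_partition and the synthetic label vector below are illustrative only, and the paper's exact script may differ.

```python
# Hedged sketch of a Dirichlet Dir_20(0.1) non-IID partition across 20 clients.
import numpy as np

def dirichlet_partition(labels, num_clients=20, alpha=0.1, seed=0):
    """labels: 1-D integer class labels; returns one index array per client."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Fraction of class c that each client receives.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, shard in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(shard.tolist())
    return [np.array(ix) for ix in client_indices]

# Illustrative usage with a synthetic 10-class label vector (Fashion-MNIST-sized).
labels = np.random.default_rng(1).integers(0, 10, size=60000)
per_client = dirichlet_partition(labels)
print([len(p) for p in per_client])  # highly unbalanced counts, as expected for alpha = 0.1
```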
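Finally, the Experiment Setup row lists three columns of learning rates and communication rounds, a batch size of 32, and momentum 0.9. The snippet below only collects those values into a configuration dictionary and shows one plausible way to attach them to PyTorch SGD optimizers; the column labels (setting_1 to setting_3) and the use of torch.optim.SGD's built-in momentum are assumptions, since the paper's αβ momentum enters the update differently from PyTorch's heavy-ball implementation.

```python
# Table 3 values collected into a config; column labels and the mapping onto
# torch.optim.SGD are assumptions made for illustration.
import torch

CONFIGS = {
    "setting_1": {"eta_y": 0.02, "eta_x": 0.016, "rounds": 150},
    "setting_2": {"eta_y": 2e-3, "eta_x": 1.6e-3, "rounds": 75},
    "setting_3": {"eta_y": 2e-4, "eta_x": 1.6e-4, "rounds": 75},
}
BATCH_SIZE = 32
MOMENTUM = 0.9  # used only by Momentum Local SGDA; written as alpha*beta in the pseudocode

def make_optimizers(params_x, params_y, cfg, use_momentum=False):
    m = MOMENTUM if use_momentum else 0.0
    opt_x = torch.optim.SGD(params_x, lr=cfg["eta_x"], momentum=m)
    # Ascent on y can be realized by negating the loss before backward()
    # (or, in recent PyTorch versions, via SGD's maximize=True flag).
    opt_y = torch.optim.SGD(params_y, lr=cfg["eta_y"], momentum=m)
    return opt_x, opt_y

opt_x, opt_y = make_optimizers([torch.zeros(3, requires_grad=True)],
                               [torch.zeros(3, requires_grad=True)],
                               CONFIGS["setting_1"], use_momentum=True)
```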