Communication-Efficient Stochastic Gradient Descent Ascent with Momentum Algorithms
Authors: Yihan Zhang, Meikang Qiu, Hongchang Gao
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we apply our algorithm to the distributed AUC maximization problem for the imbalanced data classification task. Extensive experimental results confirm the efficacy of our algorithm in saving communication cost. We conducted extensive experiments on the imbalanced classification task, which confirm the effectiveness of our algorithms. In Figure 1, we report the testing AUC score versus the number of epochs on the testing sets. |
| Researcher Affiliation | Academia | 1Temple University, Philadelphia, PA, USA 2Dakota State University, Madison, SD, USA |
| Pseudocode | Yes | Algorithm 1 SGDAM-PEF ... Algorithm 2 SGDAM-REF |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to code repositories. |
| Open Datasets | Yes | Datasets. In our experiments, five benchmark datasets are employed to evaluate the performance of our algorithm. They are CATvsDOG (https://www.kaggle.com/c/dogs-vs-cats), CIFAR10, CIFAR100 (https://www.cs.toronto.edu/~kriz/cifar.html), STL10 [Coates et al., 2011], and Melanoma [Rotemberg et al., 2021]. |
| Dataset Splits | No | The training set is randomly distributed to all workers, while the testing set is the same for all workers. The paper mentions train and test sets but does not specify a validation set or explicit train/validation/test split percentages. |
| Hardware Specification | Yes | Here we use four workers, where each worker is a V100 GPU. |
| Software Dependencies | No | The paper mentions employing a quantization operator but does not provide specific software names with version numbers for reproducibility (e.g., Python 3.x, PyTorch 1.x, CUDA 11.x). |
| Experiment Setup | Yes | Input: η > 0, γ > 0, λ > 0, ρ1 > 0, ρ2 > 0, r0 = 0, s0 = 0. The compression operators in our experiment include Top-k and Rand-k, where k = 20%. ... the quantization level is set to 4. ... we use the equivalent learning rate for all algorithms, i.e., 0.1. (Hedged sketches of a generic momentum SGDA update and of these compression operators follow the table.) |
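
The Pseudocode row above quotes Algorithm 1 (SGDAM-PEF) and Algorithm 2 (SGDAM-REF), but the paper's exact update rules are not reproduced here. As a point of reference, the following is a minimal, textbook-style sketch of stochastic gradient descent ascent with momentum for a min-max objective, written in plain NumPy. It is not the paper's algorithm (which adds compressed communication across workers); the function names, step sizes, and toy objective are illustrative assumptions.

```python
import numpy as np

def sgda_momentum(grad_x, grad_y, x0, y0, eta=0.1, gamma=0.1, beta=0.9, iters=2000):
    """Generic stochastic gradient descent ascent with momentum for min_x max_y f(x, y).

    grad_x / grad_y return stochastic gradient estimates. This is a textbook-style
    sketch, not the SGDAM-PEF / SGDAM-REF algorithms from the paper.
    """
    x, y = x0.astype(float).copy(), y0.astype(float).copy()
    mx = np.zeros_like(x)  # momentum (moving-average gradient) for the descent variable
    my = np.zeros_like(y)  # momentum for the ascent variable
    for _ in range(iters):
        mx = beta * mx + (1.0 - beta) * grad_x(x, y)
        my = beta * my + (1.0 - beta) * grad_y(x, y)
        x = x - eta * mx     # descent step on x
        y = y + gamma * my   # ascent step on y
    return x, y

# Toy strongly-convex-strongly-concave saddle problem:
#   f(x, y) = 0.5*||x||^2 + x^T A y - 0.5*||y||^2, saddle point at (0, 0).
rng = np.random.default_rng(0)
A = 0.1 * rng.standard_normal((5, 5))
noise = lambda: 0.01 * rng.standard_normal(5)
gx = lambda x, y: x + A @ y + noise()    # stochastic gradient w.r.t. x
gy = lambda x, y: A.T @ x - y + noise()  # stochastic gradient w.r.t. y
x_sol, y_sol = sgda_momentum(gx, gy, np.ones(5), np.ones(5))
print(np.linalg.norm(x_sol), np.linalg.norm(y_sol))  # both should end up near zero
```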
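
The Experiment Setup row mentions Top-k and Rand-k compression with k = 20% and a quantization level of 4. The sketch below shows how such operators are commonly defined (Rand-k rescaled to be unbiased, quantization in the QSGD style); the paper's exact operator definitions may differ, so this is an illustrative assumption rather than the authors' implementation.

```python
import numpy as np

def top_k(v, ratio=0.2):
    """Top-k sparsification: keep the k largest-magnitude entries (k = 20% here)."""
    k = max(1, int(ratio * v.size))
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def rand_k(v, ratio=0.2, rng=None):
    """Rand-k sparsification: keep k uniformly random entries, rescaled to be unbiased."""
    rng = rng or np.random.default_rng()
    k = max(1, int(ratio * v.size))
    out = np.zeros_like(v)
    idx = rng.choice(v.size, size=k, replace=False)
    out[idx] = v[idx] * (v.size / k)  # rescaling gives E[rand_k(v)] = v
    return out

def quantize(v, levels=4, rng=None):
    """Unbiased stochastic quantization onto `levels` uniform levels (QSGD-style)."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(v)
    if norm == 0:
        return v.copy()
    scaled = np.abs(v) / norm * levels
    lower = np.floor(scaled)
    prob = scaled - lower
    q = lower + (rng.random(v.shape) < prob)  # stochastic rounding up or down
    return np.sign(v) * q * norm / levels

# Example: compress a random "gradient" vector with each operator.
g = np.random.default_rng(1).standard_normal(10)
print(top_k(g), rand_k(g), quantize(g, levels=4), sep="\n")
```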