SGDA with shuffling: faster convergence for nonconvex-PŁ minimax optimization
Authors: Hanseul Cho, Chulhee Yun
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate our main theoretical findings, here we present some numerical results. We focus on the primal-PŁ-strongly-concave (or PŁ(Φ)-SC, which is PŁ(Φ)-PŁ as well) quadratic games of the form $\min_{x \in \mathbb{R}^d} \max_{y \in \mathbb{R}^d} f(x; y) = \frac{1}{2} x^\top A x + x^\top B y - \frac{1}{2} y^\top C y = \frac{1}{n} \sum_{i=1}^{n} f_i(x; y)$, where $f_i(x; y) = \frac{1}{2} x^\top A_i x + x^\top B_i y - \frac{1}{2} y^\top C_i y + u_i^\top x - v_i^\top y$. |
| Researcher Affiliation | Academia | Hanseul Cho, Chulhee Yun; Kim Jaechul Graduate School of AI, KAIST; {jhs4015, chulhee.yun}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1: simSGDA/altSGDA-RR |
| Open Source Code | No | The paper does not contain any explicit statement about making the source code available or provide a link to a code repository. |
| Open Datasets | No | The paper describes generating 'quadratic games' for experiments, stating 'To make the game in Equation (5) satisfy PŁ(Φ)-SC and component L-smoothness, we should sample the coefficient matrices and vectors carefully.' It does not refer to a publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper describes running experiments on generated quadratic games but does not specify any training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers used for the experiments. |
| Experiment Setup | Yes | We compare six algorithms in total: simSGDA-RR, altSGDA-RR, AGDA-RR (as defined in Das et al. (2022)), and the with-replacement counterparts of these three algorithms. ... we run each algorithm for the same number of epochs using constant step sizes of ratio $\beta/\alpha = c\kappa_2^2$ for some constant $c$ and $\kappa_2 = L/\mu$. ... We specify the values of parameters described above: $n = 100$, $d = 25$, $\mu_M = \mu_C$, and $L_C = 1 < L_M = L_B$. The constants $c_0$ and $c_1$ are tuned among $10^{\{-2, -1.5, -1, -0.5, 0\}}$. |
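
Since the paper releases no code, the following is a minimal, unofficial sketch of how the simultaneous and alternating SGDA updates with random reshuffling (and their with-replacement counterparts) might be run on a quadratic game of the form above. Everything beyond `n = 100` and `d = 25` is an assumption made for illustration: the sampler `make_game` uses diagonal $A_i$, $B_i$, $C_i$ with positive diagonals (a strongly-convex-strongly-concave special case, not the paper's careful PŁ(Φ)-SC sampling), the linear terms $u_i$, $v_i$ are centered so the saddle point sits at the origin, and the step sizes and epoch count are arbitrary rather than tuned as in the paper. AGDA-RR from Das et al. (2022) is not included.

```python
import math
import random


def make_game(n, d, rng):
    """Hypothetical sampler: diagonal A_i, B_i, C_i stored as length-d lists."""
    comps = []
    for _ in range(n):
        comps.append({
            "A": [rng.uniform(0.5, 1.5) for _ in range(d)],
            "B": [rng.uniform(-1.0, 1.0) for _ in range(d)],
            "C": [rng.uniform(0.5, 1.5) for _ in range(d)],
            "u": [rng.uniform(-1.0, 1.0) for _ in range(d)],
            "v": [rng.uniform(-1.0, 1.0) for _ in range(d)],
        })
    # Assumption: center u_i and v_i so the linear terms cancel in the average
    # objective, which places the saddle point of f at the origin.
    for key in ("u", "v"):
        for j in range(d):
            mean = sum(c[key][j] for c in comps) / n
            for c in comps:
                c[key][j] -= mean
    return comps


def grad_x(c, x, y):
    # Gradient of f_i in x: A_i x + B_i y + u_i (diagonal case).
    return [a * xj + b * yj + u
            for a, b, u, xj, yj in zip(c["A"], c["B"], c["u"], x, y)]


def grad_y(c, x, y):
    # Gradient of f_i in y: B_i^T x - C_i y - v_i (diagonal case).
    return [b * xj - cc * yj - v
            for b, cc, v, xj, yj in zip(c["B"], c["C"], c["v"], x, y)]


def sgda(comps, x, y, alpha, beta, epochs, rng, alternating=False, reshuffle=True):
    """Run SGDA for a number of epochs; each epoch makes n component steps."""
    n = len(comps)
    for _ in range(epochs):
        if reshuffle:
            order = rng.sample(range(n), n)  # fresh random permutation per epoch
        else:
            order = [rng.randrange(n) for _ in range(n)]  # with replacement
        for i in order:
            c = comps[i]
            x_new = [xj - alpha * g for xj, g in zip(x, grad_x(c, x, y))]
            # Alternating: the ascent step sees the updated x.
            # Simultaneous: the ascent step sees the old x.
            gy = grad_y(c, x_new if alternating else x, y)
            y = [yj + beta * g for yj, g in zip(y, gy)]
            x = x_new
    return x, y


def dist_to_origin(x, y):
    return math.sqrt(sum(t * t for t in x) + sum(t * t for t in y))


n, d = 100, 25
comps = make_game(n, d, random.Random(0))
x0, y0 = [1.0] * d, [1.0] * d

variants = {
    "simSGDA-RR": (False, True),
    "altSGDA-RR": (True, True),
    "simSGDA (with replacement)": (False, False),
    "altSGDA (with replacement)": (True, False),
}
results = {}
for name, (alternating, reshuffle) in variants.items():
    x, y = sgda(comps, x0, y0, alpha=0.01, beta=0.01, epochs=50,
                rng=random.Random(1), alternating=alternating, reshuffle=reshuffle)
    results[name] = dist_to_origin(x, y)

for name, dist in results.items():
    print(f"{name}: distance to saddle point = {dist:.4f}")
```

The only difference between the RR and with-replacement variants in this sketch is how `order` is drawn, which keeps the comparison at an equal number of component gradient evaluations per epoch. Reproducing the paper's plots would additionally require its coefficient sampling scheme and the tuned step-size ratio, neither of which is captured here.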