SGDA with shuffling: faster convergence for nonconvex-PŁ minimax optimization

Authors: Hanseul Cho, Chulhee Yun

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To validate our main theoretical findings, here we present some numerical results. We focus on the primal-PŁ-strongly-concave (or PŁ(Φ)-SC, which is PŁ(Φ)-PŁ as well) quadratic games of the form $\min_{x \in \mathbb{R}^d} \max_{y \in \mathbb{R}^d} f(x; y) = \frac{1}{2} x^\top A x + x^\top B y - \frac{1}{2} y^\top C y = \frac{1}{n} \sum_{i=1}^{n} f_i(x; y)$, where $f_i(x; y) = \frac{1}{2} x^\top A_i x + x^\top B_i y - \frac{1}{2} y^\top C_i y + u_i^\top x - v_i^\top y$.
Researcher Affiliation | Academia | Hanseul Cho, Chulhee Yun; Kim Jaechul Graduate School of AI, KAIST; {jhs4015, chulhee.yun}@kaist.ac.kr
Pseudocode | Yes | Algorithm 1: simSGDA/altSGDA-RR
Open Source Code | No | The paper does not contain any explicit statement about making the source code available or provide a link to a code repository.
Open Datasets | No | The paper describes generating 'quadratic games' for experiments, stating 'To make the game in Equation (5) satisfy PŁ(Φ)-SC and component L-smoothness, we should sample the coefficient matrices and vectors carefully.' It does not refer to a publicly available dataset with concrete access information.
Dataset Splits | No | The paper describes running experiments on generated quadratic games but does not specify any training, validation, or test dataset splits.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory specifications).
Software Dependencies | No | The paper does not specify any software dependencies with version numbers used for the experiments.
Experiment Setup | Yes | We compare six algorithms in total: simSGDA-RR, altSGDA-RR, AGDA-RR (as defined in Das et al. (2022)), and the with-replacement counterparts of these three algorithms. ... we run each algorithm for the same number of epochs using constant step sizes of ratio $\beta/\alpha = c\kappa_2^2$ for some constant $c$ and $\kappa_2 = L/\mu$. ... We specify the values of parameters described above: $n = 100$, $d = 25$, $\mu_M = \mu_C$, and $L_C = 1 < L_M = L_B$. The constants $c_0$ and $c_1$ are tuned among $10^{\{-2, -1.5, -1, -0.5, 0\}}$.
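
Since the paper releases no code, the following is only a minimal NumPy sketch of the kind of experiment the table describes: SGDA with random reshuffling, in simultaneous and alternating form, on a finite-sum quadratic game with the component structure quoted under Research Type. The function names (`make_game`, `sgda_rr`, `grad_norm`) are hypothetical. The sampler is an assumption, not the authors' procedure: it draws positive-definite A_i and C_i, which gives a strongly-convex-strongly-concave game (a simpler special case than the paper's carefully sampled PŁ(Φ)-SC games). The sketch also assumes a single permutation per epoch shared by the x and y updates, and the step sizes are arbitrary placeholders, not the tuned values.

```python
import numpy as np


def make_game(n=100, d=25, seed=0):
    """Sample components f_i(x, y) = 0.5 x'A_i x + x'B_i y - 0.5 y'C_i y + u_i'x - v_i'y.

    Assumption: A_i and C_i are positive definite (NOT the paper's sampler).
    """
    rng = np.random.default_rng(seed)

    def spd(lo, hi):
        # Symmetric matrix with eigenvalues drawn uniformly from [lo, hi].
        q, _ = np.linalg.qr(rng.standard_normal((d, d)))
        return q @ np.diag(rng.uniform(lo, hi, d)) @ q.T

    A = np.stack([spd(0.5, 1.0) for _ in range(n)])
    B = 0.1 * rng.standard_normal((n, d, d))
    C = np.stack([spd(0.5, 1.0) for _ in range(n)])
    # Center the linear terms so they cancel in the average f.
    u = rng.standard_normal((n, d))
    u -= u.mean(axis=0)
    v = rng.standard_normal((n, d))
    v -= v.mean(axis=0)
    return A, B, C, u, v


def sgda_rr(game, alpha, beta, epochs, alternating=False, seed=0):
    """SGDA with random reshuffling: one fresh permutation of the n components per epoch."""
    A, B, C, u, v = game
    n, d = u.shape
    rng = np.random.default_rng(seed)
    x, y = np.ones(d), np.ones(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            # Descent step on x using component i.
            x_new = x - alpha * (A[i] @ x + B[i] @ y + u[i])
            # Alternating: the y step sees the updated x; simultaneous: the old x.
            x_seen = x_new if alternating else x
            # Ascent step on y using the same component i.
            y = y + beta * (B[i].T @ x_seen - C[i] @ y - v[i])
            x = x_new
    return x, y


def grad_norm(game, x, y):
    """Norm of the full gradient of the averaged objective f at (x, y)."""
    A, B, C, u, v = game
    gx = A.mean(axis=0) @ x + B.mean(axis=0) @ y + u.mean(axis=0)
    gy = B.mean(axis=0).T @ x - C.mean(axis=0) @ y - v.mean(axis=0)
    return float(np.sqrt(gx @ gx + gy @ gy))


if __name__ == "__main__":
    game = make_game()
    for alt in (False, True):
        x, y = sgda_rr(game, alpha=0.005, beta=0.01, epochs=30, alternating=alt)
        print("alternating" if alt else "simultaneous", grad_norm(game, x, y))
```

A with-replacement baseline, as in the comparison the table mentions, could be obtained under the same assumptions by replacing `rng.permutation(n)` with `rng.integers(0, n, size=n)`.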