Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization

Authors: Chris Junchi Li, Huizhuo Yuan, Gauthier Gidel, Quanquan Gu, Michael Jordan

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we empirically study the performance of our AG-OG with restarting algorithm. In these experimental results, we study both deterministic [B.1] and stochastic [B.2] settings, in each of which we compare against state-of-the-art algorithms.
Researcher Affiliation | Academia | 1) Department of Electrical Engineering and Computer Sciences, University of California, Berkeley; 2) Department of Computer Science, University of California, Los Angeles; 3) DIRO, Université de Montréal and Mila; 4) Department of Statistics, University of California, Berkeley.
Pseudocode | Yes | Algorithm 1: Accelerated Gradient-Optimistic Gradient, AG-OG(z_0^ag, z_0, z_{-1/2}, K); Algorithm 2: Accelerated Gradient-Optimistic Gradient with restarting (AG-OG with restarting); Algorithm 3: Stochastic Accelerated Gradient-Optimistic Gradient, S-AG-OG(z_0^ag, z_0, z_{-1/2}, K)
Open Source Code | No | The paper does not provide explicit statements or links indicating the release of open-source code for the described methodology.
Open Datasets | No | We present results on synthetic quadratic game datasets, x^T A_1 x + y^T A_2 x − y^T A_3 y, with various selections of the eigenvalues of A_1, A_2, A_3.
Dataset Splits | No | The paper discusses convergence and empirical performance on synthetic datasets but does not describe train/validation/test splits.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers used for the experiments.
Experiment Setup | Yes | We use stepsize η_k = (k+2)/(3L_H(k+2)) in both the AG-OG and the AG-OG with restarting algorithms, and restart AG-OG with restarting once every 100 iterations. For the OGDA algorithm, we take stepsize η = 1/(2(L ∨ L_H)), as indicated by recent work, e.g., Mokhtari et al. (2020b). (A hedged reproduction sketch follows the table.)
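
To make the quoted game and stepsizes concrete, here is a minimal reproduction sketch in Python. The quadratic game, the OGDA baseline, and the stepsize η = 1/(2(L ∨ L_H)) come from the rows above; the problem dimension, the particular eigenvalue spectra, the half-iterate (Popov) form of OGDA, and the operator-norm progress metric are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch of the quadratic-game experiment from the "Open Datasets" and
# "Experiment Setup" rows. Dimension, eigenvalue choices, and the coupling scale
# are assumptions; only the game form and the OGDA stepsize rule are quoted.
import numpy as np

rng = np.random.default_rng(0)
d = 50  # assumed problem dimension

def psd_with_eigs(eigs):
    """Random symmetric PSD matrix with a prescribed eigenvalue spectrum."""
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    return Q @ np.diag(eigs) @ Q.T

# Quadratic game f(x, y) = x^T A1 x + y^T A2 x - y^T A3 y
A1 = psd_with_eigs(np.linspace(0.5, 2.0, d))   # one assumed eigenvalue selection
A3 = psd_with_eigs(np.linspace(0.5, 2.0, d))
A2 = rng.standard_normal((d, d)) / (4 * np.sqrt(d))  # coupling scaled so L_H < L

def operator(x, y):
    """Monotone operator F(z) = (grad_x f, -grad_y f) of the saddle problem."""
    return 2 * A1 @ x + A2.T @ y, -(A2 @ x - 2 * A3 @ y)

# Smoothness constants: L for the separable blocks, L_H for the coupling.
L = 2 * max(np.linalg.norm(A1, 2), np.linalg.norm(A3, 2))
L_H = np.linalg.norm(A2, 2)
eta = 1.0 / (2 * max(L, L_H))  # OGDA stepsize eta = 1/(2 (L v L_H)), as quoted

# OGDA in half-iterate (Popov) form: z_{k+1/2} = z_k - eta F(z_{k-1/2}),
# z_{k+1} = z_k - eta F(z_{k+1/2}), with z_{-1/2} initialized at z_0.
x = xh = rng.standard_normal(d)
y = yh = rng.standard_normal(d)
for k in range(2000):
    gx, gy = operator(xh, yh)
    xh, yh = x - eta * gx, y - eta * gy   # extrapolation step z_{k+1/2}
    gx, gy = operator(xh, yh)
    x, y = x - eta * gx, y - eta * gy     # update step z_{k+1}
    if k % 500 == 0:
        gx0, gy0 = operator(x, y)
        res = np.hypot(np.linalg.norm(gx0), np.linalg.norm(gy0))
        print(f"iter {k}: ||F(z)|| = {res:.3e}")
```

Reproducing AG-OG itself would additionally require the acceleration sequence of Algorithm 1 (and the every-100-iterations restart schedule for Algorithm 2), which the table names but does not spell out.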