Adaptive Extra-Gradient Methods for Min-Max Optimization and Games
Authors: Kimon Antonakopoulos, Veronica Belmega, Panayotis Mertikopoulos
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conclude this section with a numerical illustration of the convergence properties of AdaProx in two different settings: a) bilinear min-max games; and b) a simple Wasserstein GAN in the spirit of Daskalakis et al. [12] with the aim of learning an unknown covariance matrix. ... Figure 2: Numerical comparison between the extra-gradient (EG), Bach–Levy (BL) and AdaProx algorithms (red circles, green squares and blue triangles respectively). |
| Researcher Affiliation | Collaboration | Kimon Antonakopoulos, Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, 38000 Grenoble, France (kimon.antonakopoulos@inria.fr); E. Veronica Belmega, ETIS/ENSEA, Univ. de Cergy-Pontoise, CNRS, France (belmega@ensea.fr); Panayotis Mertikopoulos, Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, 38000 Grenoble, France & Criteo AI Lab (panayotis.mertikopoulos@imag.fr) |
| Pseudocode | Yes | X_{t+1/2} = P_{X_t}(−γ_t V_t);  δ_t = ‖V_{t+1/2} − V_t‖_{X_{t+1/2},∗};  X_{t+1} = P_{X_t}(−γ_t V_{t+1/2});  γ_{t+1} = 1/√(1 + Σ_{s=1}^{t} δ_s²)  (AdaProx). (A runnable sketch of this update appears below the table.) |
| Open Source Code | No | The paper does not provide information about open-source code for the described methodology. |
| Open Datasets | No | The paper describes generating synthetic data for its experiments (e.g., A drawn i.i.d. from a standard Gaussian) and defines a problem setup (e.g., Wasserstein GAN formulation) rather than using named publicly available datasets with access information. |
| Dataset Splits | No | The paper does not specify explicit training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For the experiments, we took d = 100, a mini-batch of m = 128 samples per update... The step-size parameter of the EG algorithm was chosen as γ_t = 0.025/√t, whereas the BL algorithm was run with diameter and gradient bound estimation parameters D_0 = 0.5 and M_0 = 2.5 respectively. (An illustrative EG run with this schedule also appears below the table.) |
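
Below is a minimal NumPy sketch of the AdaProx update quoted in the Pseudocode row, specialized to an unconstrained bilinear game min_x max_y xᵀAy so that the prox-mapping P_X reduces to a plain shift of the iterate. The function name `adaprox_bilinear`, the use of the Euclidean norm in place of the paper's dual local norm for δ_t, the rescaling of A, and all hyperparameters are illustrative assumptions, not the authors' code.

```python
import numpy as np

def adaprox_bilinear(A, n_steps=2000, seed=0):
    """Sketch of the AdaProx update for min_x max_y x^T A y.

    Assumptions (not from the paper's code): the feasible region is all of
    R^d, so the prox-mapping P_X(y) is the identity shift X + y, and the
    dual local norm in delta_t is replaced by the Euclidean norm.
    """
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    z = rng.standard_normal(2 * d)           # stacked iterate (x, y)

    def V(z):
        x, y = z[:d], z[d:]
        # Monotone operator of the bilinear saddle-point problem.
        return np.concatenate([A @ y, -A.T @ x])

    gamma = 1.0                              # gamma_1, before any feedback
    sum_delta_sq = 0.0
    for _ in range(n_steps):
        v = V(z)
        z_half = z - gamma * v               # leading (extrapolation) step
        v_half = V(z_half)
        z = z - gamma * v_half               # update from the base point X_t
        delta = np.linalg.norm(v_half - v)   # Euclidean stand-in for delta_t
        sum_delta_sq += delta ** 2
        gamma = 1.0 / np.sqrt(1.0 + sum_delta_sq)  # adaptive gamma_{t+1}
    return z

# A is rescaled for a tame spectral norm in this unconstrained sketch
# (the paper draws A i.i.d. standard Gaussian).
A = np.random.default_rng(1).standard_normal((100, 100)) / np.sqrt(100)
z_final = adaprox_bilinear(A)
print(np.linalg.norm(z_final))  # should shrink toward the saddle point at 0
```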
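For comparison, the same toy game can be run with the vanilla extra-gradient (EG) baseline and the step-size schedule γ_t = 0.025/√t quoted in the Experiment Setup row. This is again a sketch under the same unconstrained, Euclidean, bilinear assumptions; the BL baseline and the paper's WGAN experiment are not reproduced here.

```python
import numpy as np

def extragradient_bilinear(A, n_steps=2000, seed=0):
    """Vanilla extra-gradient (EG) with the decreasing schedule
    gamma_t = 0.025 / sqrt(t) quoted in the Experiment Setup row.
    The unconstrained bilinear game is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    d = A.shape[0]
    z = rng.standard_normal(2 * d)           # stacked iterate (x, y)

    def V(z):
        x, y = z[:d], z[d:]
        return np.concatenate([A @ y, -A.T @ x])

    for t in range(1, n_steps + 1):
        gamma = 0.025 / np.sqrt(t)           # step-size from the quoted setup
        z_half = z - gamma * V(z)            # extrapolation step
        z = z - gamma * V(z_half)            # update step
    return z

A = np.random.default_rng(1).standard_normal((100, 100)) / np.sqrt(100)
print(np.linalg.norm(extragradient_bilinear(A)))  # distance to the saddle at 0
```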