On the convergence of single-call stochastic extra-gradient methods
Authors: Yu-Guan Hsieh, Franck Iutzeler, Jérôme Malick, Panayotis Mertikopoulos
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 1: Illustration of the performance of EG and 1-EG in the (a priori non-monotone) saddle-point problem L(θ, φ) = 2ϵ1 θ⊤A1θ + ϵ2 (θ⊤A2θ)² − 2ϵ1 φ⊤B1φ − ϵ2 (φ⊤B2φ)² + 4θ⊤Cφ on the full unconstrained space X = ℝ^d = ℝ^(d1+d2) with d1 = d2 = 1000 and A1, B1, A2, B2 ⪰ 0. We choose three situations representative of the settings considered in the paper: (a) linear convergence of the last iterate of the deterministic methods in strongly monotone problems; (b) the O(1/t) convergence of the ergodic average in monotone, deterministic problems; and (c) the O(1/t) local convergence rate of the method's last iterate in stochastic, non-monotone problems. For (a) and (b), the origin is the unique solution of (VI), and for (c) it is a regular solution thereof. We observe that 1-EG consistently outperforms EG in terms of oracle calls for a fixed step-size, and the observed rates are consistent with the rates reported in Table 1. |
| Researcher Affiliation | Academia | Yu-Guan Hsieh (Univ. Grenoble Alpes, LJK and ENS Paris, 38000 Grenoble, France; yu-guan.hsieh@ens.fr); Franck Iutzeler (Univ. Grenoble Alpes, LJK, 38000 Grenoble, France; franck.iutzeler@univ-grenoble-alpes.fr); Jérôme Malick (CNRS, LJK, 38000 Grenoble, France; jerome.malick@univ-grenoble-alpes.fr); Panayotis Mertikopoulos (Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, 38000 Grenoble, France; panayotis.mertikopoulos@imag.fr) |
| Pseudocode | Yes | The Extra-Gradient algorithm. In the general framework outlined in the previous section, the Extra-Gradient (EG) algorithm of Korpelevich [23] can be stated in recursive form as Xt+1/2 = ΠX(Xt − γt Vt), Xt+1 = ΠX(Xt − γt Vt+1/2) (EG) ... Single-call variants of the Extra-Gradient algorithm. Concretely, we will examine the following family of single-call extra-gradient (1-EG) algorithms: 1. Past Extra-Gradient (PEG) [10, 18, 37]: Xt+1/2 = ΠX(Xt − γt Vt−1/2), Xt+1 = ΠX(Xt − γt Vt+1/2) (PEG) ... 2. Reflected Gradient (RG) [8, 13, 25]: Xt+1/2 = Xt − (Xt−1 − Xt), Xt+1 = ΠX(Xt − γt Vt+1/2) (RG) ... 3. Optimistic Gradient (OG) [14, 30, 31, 36]: Xt+1/2 = ΠX(Xt − γt Vt−1/2), Xt+1 = Xt+1/2 + γt Vt−1/2 − γt Vt+1/2 (OG). (A hedged code sketch of the EG and PEG updates appears after the table.) |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the described methodology. |
| Open Datasets | No | Figure 1: Illustration of the performance of EG and 1-EG in the (a priori non-monotone) saddle-point problem L(θ, φ) = 2ϵ1 θ⊤A1θ + ϵ2 (θ⊤A2θ)² − 2ϵ1 φ⊤B1φ − ϵ2 (φ⊤B2φ)² + 4θ⊤Cφ on the full unconstrained space X = ℝ^d = ℝ^(d1+d2) with d1 = d2 = 1000 and A1, B1, A2, B2 ⪰ 0. The experiment uses a synthetically defined problem/function, not a named publicly available dataset. |
| Dataset Splits | No | The experiments are based on a synthetically defined saddle-point problem, L(θ, φ), rather than a dataset that requires explicit training, validation, or test splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory, or cloud resources) used for conducting the experiments. |
| Software Dependencies | No | The paper does not provide details on specific software dependencies or their version numbers used in the experiments. |
| Experiment Setup | Yes | Figure 1: Illustration of the performance of EG and 1-EG in the (a priori non-monotone) saddle-point problem L(θ, φ) = 2ϵ1 θ⊤A1θ + ϵ2 (θ⊤A2θ)² − 2ϵ1 φ⊤B1φ − ϵ2 (φ⊤B2φ)² + 4θ⊤Cφ on the full unconstrained space X = ℝ^d = ℝ^(d1+d2) with d1 = d2 = 1000 and A1, B1, A2, B2 ⪰ 0. We choose three situations representative of the settings considered in the paper: (a) linear convergence of the last iterate of the deterministic methods in strongly monotone problems; (b) the O(1/t) convergence of the ergodic average in monotone, deterministic problems; and (c) the O(1/t) local convergence rate of the method's last iterate in stochastic, non-monotone problems. For (a) and (b), the origin is the unique solution of (VI), and for (c) it is a regular solution thereof. We observe that 1-EG consistently outperforms EG in terms of oracle calls for a fixed step-size, and the observed rates are consistent with the rates reported in Table 1. Figure legend (step-sizes compared for EG and 1-EG across the three panels): γ ∈ {0.1, 0.2, 0.3, 0.4}; γ ∈ {0.2, 0.4, 0.6, 0.8}; γ ∈ {0.3, 0.6, 0.9, 1.2}; b = 15. (A hedged code sketch of this setup appears after the table.) |
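The Pseudocode row quotes the EG and PEG recursions; the sketch below is a minimal NumPy rendering of those two updates in the unconstrained case (X = ℝ^d, so the projection ΠX is the identity). It is not the authors' code: the function names, the bilinear toy problem used in the usage lines, and the step-size are illustrative choices.

```python
import numpy as np

def extra_gradient(V, x0, gamma, n_iters):
    """EG: X_{t+1/2} = X_t - gamma*V(X_t); X_{t+1} = X_t - gamma*V(X_{t+1/2}).
    Two oracle calls per iteration (projection omitted since X = R^d)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x_half = x - gamma * V(x)        # exploration step (first oracle call)
        x = x - gamma * V(x_half)        # update step (second oracle call)
    return x

def past_extra_gradient(V, x0, gamma, n_iters):
    """PEG: X_{t+1/2} = X_t - gamma*V_{t-1/2}; X_{t+1} = X_t - gamma*V_{t+1/2}.
    One oracle call per iteration: the exploration step reuses the gradient
    queried at the previous half-iterate."""
    x = np.asarray(x0, dtype=float)
    v_prev = V(x)                        # initialisation of V_{1/2}
    for _ in range(n_iters):
        x_half = x - gamma * v_prev      # exploration step, no new oracle call
        v_prev = V(x_half)               # the iteration's single oracle call
        x = x - gamma * v_prev           # update step
    return x

# Usage on the toy bilinear saddle point min_theta max_phi theta*phi,
# whose vector field is V(theta, phi) = (phi, -theta) and whose solution is the origin.
V_bilinear = lambda x: np.array([x[1], -x[0]])
x0 = np.array([1.0, 1.0])
print(extra_gradient(V_bilinear, x0, gamma=0.3, n_iters=500))       # approaches [0, 0]
print(past_extra_gradient(V_bilinear, x0, gamma=0.3, n_iters=500))  # approaches [0, 0]
```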
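For the Experiment Setup row, the sketch below builds a small instance of the Figure-1 saddle-point objective as we read the caption, L(θ, φ) = 2ϵ1 θ⊤A1θ + ϵ2 (θ⊤A2θ)² − 2ϵ1 φ⊤B1φ − ϵ2 (φ⊤B2φ)² + 4θ⊤Cφ with positive-semidefinite A1, A2, B1, B2, and runs a plain PEG loop on it. Everything beyond that formula is an assumption for illustration only: the dimension (8 instead of the paper's 1000), the values of ϵ1, ϵ2 and γ, and the way the random matrices are drawn are not the authors' configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d1 = d2 = 8                 # the paper uses d1 = d2 = 1000; kept small here
eps1, eps2 = 1.0, 0.01      # illustrative values, not taken from the paper

def random_psd(d):
    """A random symmetric positive-definite matrix (PSD plus a small ridge)."""
    M = rng.standard_normal((d, d)) / np.sqrt(d)
    return M @ M.T + 0.5 * np.eye(d)

A1, A2, B1, B2 = random_psd(d1), random_psd(d1), random_psd(d2), random_psd(d2)
C = rng.standard_normal((d1, d2)) / np.sqrt(d1)

def V(x):
    """Vector field of the saddle-point problem: V = (grad_theta L, -grad_phi L)."""
    theta, phi = x[:d1], x[d1:]
    g_theta = 4*eps1*(A1 @ theta) + 4*eps2*(theta @ A2 @ theta)*(A2 @ theta) + 4*(C @ phi)
    g_phi = -4*eps1*(B1 @ phi) - 4*eps2*(phi @ B2 @ phi)*(B2 @ phi) + 4*(C.T @ theta)
    return np.concatenate([g_theta, -g_phi])

# Single-call PEG loop (same update as in the previous sketch). For this instance
# the origin is the unique saddle point, so the iterate norm should shrink toward 0.
gamma, x = 0.005, 0.5 * rng.standard_normal(d1 + d2)
v_prev = V(x)
for _ in range(6000):
    x_half = x - gamma * v_prev          # exploration step reusing the stored gradient
    v_prev = V(x_half)                   # single oracle call of the iteration
    x = x - gamma * v_prev               # update step
print(np.linalg.norm(x))                 # close to zero after the run
```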