Efficient Algorithms for Smooth Minimax Optimization

Authors: Kiran K. Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type: Experimental. In Section 5, we present an empirical evaluation of our algorithm for the nonconvex-concave setting and compare it to a state-of-the-art algorithm. We empirically verify the performance of Prox-FDIAG (Algorithm 5 in Appendix C) on a synthetic finite max-type nonconvex minimization problem (P3). In Figure 1, we plot the norm of the gradient of the Moreau envelope, $\|\nabla f_{1/2L}(x_k)\|_2$, against the number of iterations k on a log-log scale. We see that Prox-FDIAG and Adaptive Prox-FDIAG have a faster convergence rate than the subgradient method, and Adaptive Prox-FDIAG is almost always faster than Prox-FDIAG. We provide more details about the algorithms and the experiments in Appendix E. (A sketch of how this convergence metric can be estimated appears after this table.)
Researcher Affiliation: Collaboration. Kiran Koshy Thekumparampil, University of Illinois at Urbana-Champaign (thekump2@illinois.edu); Prateek Jain, Microsoft Research, India (prajain@microsoft.com); Praneeth Netrapalli, Microsoft Research, India (praneeth@microsoft.com); Sewoong Oh, University of Washington, Seattle (sewoong@cs.washington.edu)
Pseudocode: Yes. Algorithm 1: Dual Implicit Accelerated Gradient (DIAG) for strongly-convex concave programming... Algorithm 2: Proximal Dual Implicit Accelerated Gradient (Prox-DIAG) for nonconvex concave programming
Open Source Code: No. The paper does not include any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets: No. We empirically verify the performance of Prox-FDIAG (Algorithm 5 in Appendix C) on a synthetic finite max-type nonconvex minimization problem (P3). We consider the following problem: $\min_{x \in \mathbb{R}^2} f(x) = \max_{1 \le i \le m = 9} f_i(x)$, where $f_i(x) = q_{(-1, (X_i^{(1)}, X_i^{(2)}), c_i)}(x)$ for all $1 \le i \le 8$, with $q_{(a,b,c)}(x) = a\|x - b\|_2^2 + c$, and where $X_i^{(1)}$, $X_i^{(2)}$, and $c_i$ are randomly generated. (A hedged construction of this synthetic objective appears after this table.)
Dataset Splits: No. The paper uses a synthetically generated dataset for experiments and does not mention any explicit train/validation/test splits, nor does it refer to predefined splits from external datasets or cross-validation.
Hardware Specification: No. The paper does not specify any particular hardware (e.g., GPU/CPU models, cloud instances) used for running the experiments.
Software Dependencies: Yes. Appendix E, Experimental Details: The experiments were implemented in Python 3.7 using PyTorch 1.0.0.
Experiment Setup: Yes. Appendix E, Experimental Details: The experiments were implemented in Python 3.7 using PyTorch 1.0.0. We consider the function $f(x) = \max_{1 \le i \le 100} f_i(x)$, where $f_i(x) = q_{(-1, (X_i^{(1)}, X_i^{(2)}), c_i)}(x) + 0.1\cos(\pi x_1) + 0.1\cos(\pi x_2)$ and $q_{(a,b,c)}(x) = a\|x - b\|_2^2 + c$. The values $X_i^{(1)}$, $X_i^{(2)}$, and $c_i$ are randomly generated. The initial point $x_0$ is chosen uniformly at random from $[-1, 1]^2$. For Prox-FDIAG, the Moreau envelope parameter is set to $\lambda = 1/(2L)$. The inner loop of Prox-FDIAG is run for $R = 20$ iterations. For Adaptive Prox-FDIAG, the parameter $\lambda$ is adaptively tuned using backtracking line search, starting from an initial $\lambda_0 = 1/(2L)$. For the subgradient method, the step size is set to $\alpha = 1/(Lk)$. (A sketch of the subgradient baseline appears after this table.)
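The synthetic problem quoted in the Open Datasets row above is a finite maximum of quadratic pieces. The sketch below (not the authors' code) constructs an objective of that form in NumPy together with a Clarke subgradient oracle. The uniform sampling ranges for the centers and offsets are assumptions, the sign a = -1 follows the reconstruction above, and the ninth piece of the quoted m = 9 problem is not specified in this excerpt, so it is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pieces f_i(x) = q_(-1, b_i, c_i)(x) = -||x - b_i||_2^2 + c_i for i = 1..8.
# The centers b_i = (X_i^(1), X_i^(2)) and offsets c_i are "randomly generated";
# the uniform ranges below are assumptions, and the ninth piece of the quoted
# m = 9 problem is not reproduced in this excerpt.
B = rng.uniform(-1.0, 1.0, size=(8, 2))   # centers b_i
c = rng.uniform(-1.0, 1.0, size=8)        # offsets c_i

def f(x):
    """Finite max-type objective: f(x) = max_i ( -||x - b_i||^2 + c_i )."""
    return np.max(-np.sum((x - B) ** 2, axis=1) + c)

def f_subgrad(x):
    """A (Clarke) subgradient of f at x: gradient of one maximizing piece."""
    i = np.argmax(-np.sum((x - B) ** 2, axis=1) + c)
    return -2.0 * (x - B[i])
```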
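The Research Type row reports convergence in terms of the gradient norm of the Moreau envelope, $\|\nabla f_{1/2L}(x_k)\|_2$. As a hedged illustration of how that metric can be estimated, the helper below approximately solves the proximal subproblem with a plain subgradient method and then applies the identity $\nabla f_\lambda(x) = (x - \mathrm{prox}_{\lambda f}(x))/\lambda$. The function name, step schedule, and iteration budget are assumptions, not the paper's implementation.

```python
import numpy as np

def moreau_grad_norm(x, f, f_subgrad, L, inner_iters=200):
    """Estimate ||grad f_lam(x)||_2 for the Moreau envelope with lam = 1/(2L).

    Rough sketch (not the paper's code): the proximal subproblem
        min_y  f(y) + (1/(2*lam)) * ||y - x||^2
    is L-strongly convex when f is L-weakly convex, so a plain subgradient
    method with a ~1/(L*t) step gives a usable approximation of prox_{lam f}(x).
    """
    lam = 1.0 / (2.0 * L)

    def prox_obj(y):
        return f(y) + np.sum((y - x) ** 2) / (2.0 * lam)

    y = x.copy()
    best_y, best_val = y.copy(), prox_obj(y)
    for t in range(1, inner_iters + 1):
        g = f_subgrad(y) + (y - x) / lam   # subgradient of the prox objective
        y = y - g / (L * t)                # ~1/(mu*t) step for mu = L (assumption)
        val = prox_obj(y)
        if val < best_val:
            best_y, best_val = y.copy(), val
    # Moreau identity: grad f_lam(x) = (x - prox_{lam f}(x)) / lam
    return np.linalg.norm(x - best_y) / lam
```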
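Putting the quoted experiment setup together, a minimal run of the subgradient-method baseline might look like the following sketch. It reuses f, f_subgrad, and moreau_grad_norm from the sketches above, so it omits the 0.1*cos(pi*x) perturbations and the 100 pieces of the appendix objective; the smoothness constant, iteration horizon, and logging cadence are assumptions. Prox-FDIAG and Adaptive Prox-FDIAG themselves are not sketched here, since their inner accelerated solver is specified only in the paper's appendix.

```python
import numpy as np

# Baseline subgradient method on the max-type objective, following the quoted
# setup (a sketch, not the authors' code).  Reuses f, f_subgrad, and
# moreau_grad_norm defined in the sketches above.
L = 2.0                                    # each quadratic piece is 2-smooth (assumption)
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=2)         # x0 drawn uniformly from [-1, 1]^2

log = []
for k in range(1, 1001):
    alpha = 1.0 / (L * k)                  # step size as quoted; exact schedule may differ
    x = x - alpha * f_subgrad(x)
    if k % 100 == 0:
        log.append((k, moreau_grad_norm(x, f, f_subgrad, L)))

for k, gnorm in log:
    print(f"iter {k:5d}   ||grad f_1/2L(x_k)||_2 = {gnorm:.4e}")
```

Plotting the logged gradient norms against k on a log-log scale reproduces the kind of comparison described in the Research Type row, with Prox-FDIAG and Adaptive Prox-FDIAG added as further curves.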