Efficient Algorithms for Smooth Minimax Optimization
Authors: Kiran K. Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5, we present an empirical evaluation of our algorithm for the nonconvex-concave setting and compare it to a state-of-the-art algorithm. We empirically verify the performance of Prox-FDIAG (Algorithm 5 in Appendix C) on a synthetic finite max-type nonconvex minimization problem (P3). In Figure 1, we plot the gradient norm of the Moreau envelope, $\lVert \nabla f_{1/2L}(x_k) \rVert_2$, against the number of iterations $k$ on a log-log scale. We see that Prox-FDIAG and Adaptive Prox-FDIAG converge faster than the subgradient method, and that Adaptive Prox-FDIAG is almost always faster than Prox-FDIAG. We provide more details about the algorithms and the experiments in Appendix E. (A short note on this stationarity measure appears after the table.) |
| Researcher Affiliation | Collaboration | Kiran Koshy Thekumparampil, University of Illinois at Urbana-Champaign (thekump2@illinois.edu); Prateek Jain, Microsoft Research, India (prajain@microsoft.com); Praneeth Netrapalli, Microsoft Research, India (praneeth@microsoft.com); Sewoong Oh, University of Washington, Seattle (sewoong@cs.washington.edu) |
| Pseudocode | Yes | Algorithm 1: Dual Implicit Accelerated Gradient (DIAG) for strongly-convex concave programming... Algorithm 2: Proximal Dual Implicit Accelerated Gradient (Prox-DIAG) for nonconvex concave programming |
| Open Source Code | No | The paper does not include any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | We empirically verify the performance of Prox-FDIAG (Algorithm 5 in Appendix C) on a synthetic finite max-type nonconvex minimization problem (P3). We consider the following problem: $\min_{x \in \mathbb{R}^2} f(x) = \max_{1 \le i \le m=9} f_i(x)$, where $f_i(x) = q_{(-1,\,(X^{(1)}_i, X^{(2)}_i),\, c_i)}(x)$ for all $1 \le i \le 8$, where $q_{(a,b,c)}(x) = a\lVert x - b \rVert_2^2 + c$, and $X^{(1)}_i$, $X^{(2)}_i$, and $c_i$ are randomly generated. |
| Dataset Splits | No | The paper uses a synthetically generated dataset for experiments and does not mention any explicit train/validation/test splits, nor does it refer to predefined splits from external datasets or cross-validation. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, cloud instances) used for running the experiments. |
| Software Dependencies | Yes | Appendix E, Experimental Details: The experiments were implemented in Python 3.7 using PyTorch 1.0.0. |
| Experiment Setup | Yes | Appendix E, Experimental Details: The experiments were implemented in Python 3.7 using PyTorch 1.0.0. We consider the function $f(x) = \max_{1 \le i \le 100} f_i(x)$, where $f_i(x) = q_{(-1,\,(X^{(1)}_i, X^{(2)}_i),\, c_i)}(x) + 0.1\cos(\pi x_1) + 0.1\cos(\pi x_2)$ and $q_{(a,b,c)}(x) = a\lVert x - b \rVert_2^2 + c$. The values $X^{(1)}_i$, $X^{(2)}_i$, and $c_i$ are randomly generated. The initial point $x_0$ is chosen uniformly at random from $[-1, 1]^2$. For Prox-FDIAG, the Moreau envelope parameter is set to $\lambda = 1/(2L)$. The inner loop of Prox-FDIAG is run for $R = 20$ iterations. For Adaptive Prox-FDIAG, $\lambda$ is tuned adaptively with a backtracking line search, starting from $\lambda_0 = 1/(2L)$. For the subgradient method, the step size is set to $\alpha = 1/(L\sqrt{k})$. A minimal code sketch of this setup appears after the table. |
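
For reference, the convergence metric quoted in the Research Type row is the gradient norm of the Moreau envelope, a standard near-stationarity measure for nonsmooth nonconvex minimization. The definitions below are the textbook ones (well defined when $f$ is $L$-weakly convex and $\lambda < 1/L$, e.g. $\lambda = 1/(2L)$); they are not text copied from the paper:

```latex
% Moreau envelope of f with parameter \lambda > 0, and its gradient.
% With \lambda = 1/(2L), a small \lVert \nabla f_{1/2L}(x_k) \rVert_2
% certifies that x_k is close to a near-stationary point of f.
\begin{align}
  f_{\lambda}(x) &:= \min_{y} \Big\{ f(y) + \tfrac{1}{2\lambda}\,\lVert y - x \rVert_2^2 \Big\}, \\
  \nabla f_{\lambda}(x) &= \tfrac{1}{\lambda}\bigl(x - \operatorname{prox}_{\lambda f}(x)\bigr),
  \qquad
  \operatorname{prox}_{\lambda f}(x) := \operatorname*{arg\,min}_{y} \Big\{ f(y) + \tfrac{1}{2\lambda}\,\lVert y - x \rVert_2^2 \Big\}.
\end{align}
```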
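
As a concrete illustration of the setup in the Open Datasets and Experiment Setup rows, here is a minimal NumPy sketch of the finite max-type objective and the subgradient baseline. This is not the authors' code (none is released, per the Open Source Code row); the sampling ranges for $X^{(1)}_i$, $X^{(2)}_i$, $c_i$, the value of $L$, the iteration budget, and all helper names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed instance of the finite max-type objective
#   f(x) = max_i f_i(x),
#   f_i(x) = -||x - b_i||_2^2 + c_i + 0.1 cos(pi x_1) + 0.1 cos(pi x_2),
# with centers b_i = (X_i^(1), X_i^(2)) and offsets c_i drawn randomly.
# The sampling ranges below are NOT taken from the paper.
m = 100
B = rng.uniform(-1.0, 1.0, size=(m, 2))   # assumed centers b_i
c = rng.uniform(0.0, 1.0, size=m)         # assumed offsets c_i

def component_values(x):
    """f_i(x) for all i, vectorized over the m components."""
    quad = -np.sum((x - B) ** 2, axis=1) + c
    return quad + 0.1 * np.cos(np.pi * x[0]) + 0.1 * np.cos(np.pi * x[1])

def f(x):
    return np.max(component_values(x))

def subgradient(x):
    """A subgradient of f at x: the gradient of one active component f_i."""
    i = int(np.argmax(component_values(x)))
    g_quad = -2.0 * (x - B[i])
    g_cos = -0.1 * np.pi * np.sin(np.pi * x)
    return g_quad + g_cos

# Subgradient-method baseline with diminishing step size alpha_k = 1/(L*sqrt(k)).
# L is a rough smoothness constant assumed here for illustration.
L = 2.0
x = rng.uniform(-1.0, 1.0, size=2)        # x_0 drawn uniformly from [-1, 1]^2
for k in range(1, 2001):
    x = x - (1.0 / (L * np.sqrt(k))) * subgradient(x)

print("final x:", x, "f(x):", f(x))
```

A Prox-FDIAG run would replace this subgradient loop with repeated proximal subproblems ($R$ inner iterations each, per the Experiment Setup row); the sketch above reproduces only the baseline.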