Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization
Authors: Yan Yan, Yi Xu, Qihang Lin, Wei Liu, Tianbao Yang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we bridge this gap by providing a sharp analysis of the epoch-wise stochastic gradient descent ascent method (referred to as Epoch-GDA) for solving strongly-convex-strongly-concave (SCSC) min-max problems, without imposing any additional assumption about smoothness or the function's structure. To the best of our knowledge, our result is the first one that shows Epoch-GDA can achieve the optimal rate of O(1/T) for the duality gap of general SCSC min-max problems (the duality gap is defined below this table). We emphasize that such generalization of Epoch-GD for strongly convex minimization problems to Epoch-GDA for SCSC min-max problems is non-trivial and requires novel technical analysis. |
| Researcher Affiliation | Collaboration | Yan Yan (School of EECS, Washington State University) yanyan.1@wsu.edu; Yi Xu (Machine Intelligence Technology, Alibaba Group US Inc.) statxy@gmail.com; Qihang Lin (Department of Business Analytics, University of Iowa) qihang-lin@uiowa.edu; Wei Liu (Tencent AI Lab) wl2223@columbia.edu; Tianbao Yang (Department of CS, University of Iowa) tianbao-yang@uiowa.edu |
| Pseudocode | Yes | Algorithm 1: Epoch-GDA for SCSC Min-Max Problems (an illustrative sketch follows this table). |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not use or reference any datasets for training. |
| Dataset Splits | No | The paper is theoretical and does not describe dataset splits for validation. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not list ancillary software dependencies or version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or system-level training settings. |
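
For reference, the duality gap cited in the Research Type row follows the standard definition for saddle-point problems; the feasible sets $\mathcal{X}$ and $\mathcal{Y}$ below are assumptions standing in for the paper's domains.

```latex
% Duality gap at a candidate solution (\bar{x}, \bar{y}) of
% min_{x in X} max_{y in Y} f(x, y); X and Y stand in for the
% paper's feasible sets.
\mathrm{Gap}(\bar{x}, \bar{y})
  \;=\; \max_{y' \in \mathcal{Y}} f(\bar{x}, y')
  \;-\; \min_{x' \in \mathcal{X}} f(x', \bar{y})
```

The paper's headline result is that Epoch-GDA drives the expected value of this gap to $O(1/T)$ after $T$ stochastic gradient updates, without smoothness or structural assumptions on $f$.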
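
Since the paper presents Algorithm 1 only as pseudocode, the following is a minimal Python sketch of the epoch-wise descent-ascent loop: run GDA with a fixed step size within an epoch, restart the next epoch from the averaged iterates, then halve the step size and double the epoch length. The `epoch_gda` helper, the noise model, and the toy bilinear SCSC objective are illustrative assumptions, not the paper's exact method; in particular, the paper's algorithm also projects iterates onto a shrinking ball around each epoch's starting point, which this sketch omits.

```python
import numpy as np

def epoch_gda(grad_x, grad_y, x0, y0, eta0=0.1, T0=32, num_epochs=8,
              noise=0.1, rng=None):
    """Illustrative sketch of an epoch-wise GDA loop (not the paper's
    exact Algorithm 1: the per-epoch projection step is omitted)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    y = np.asarray(y0, dtype=float).copy()
    eta, T = eta0, T0
    for _ in range(num_epochs):
        x_sum, y_sum = np.zeros_like(x), np.zeros_like(y)
        for _ in range(T):
            # Stochastic first-order oracle: exact gradient + Gaussian noise.
            gx = grad_x(x, y) + noise * rng.standard_normal(x.shape)
            gy = grad_y(x, y) + noise * rng.standard_normal(y.shape)
            x = x - eta * gx  # descent step on the primal variable
            y = y + eta * gy  # ascent step on the dual variable
            x_sum += x
            y_sum += y
        x, y = x_sum / T, y_sum / T  # restart from the averaged iterates
        eta *= 0.5                   # halve the step size ...
        T *= 2                       # ... and double the epoch length
    return x, y

# Toy SCSC instance (assumed for illustration):
# f(x, y) = (lam/2)*||x||^2 + x^T A y - (lam/2)*||y||^2
lam = 1.0
A = np.array([[0.5, 0.2], [0.1, 0.4]])
grad_x = lambda x, y: lam * x + A @ y    # strongly convex in x
grad_y = lambda x, y: A.T @ x - lam * y  # strongly concave in y
x_bar, y_bar = epoch_gda(grad_x, grad_y, np.ones(2), np.ones(2))
# The unique saddle point is (0, 0); x_bar and y_bar should approach it.
```

Averaging within an epoch and restarting from the average is what allows the fixed per-epoch step size to be halved geometrically, mirroring the Epoch-GD schedule for strongly convex minimization that the paper generalizes.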