Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems

Authors: Luo Luo, Haishan Ye, Zhichao Huang, Tong Zhang

NeurIPS 2020

Reproducibility Variable Result LLM Response
Research Type Experimental We conduct the experiments by using distributionally robust optimization with nonconvex regularized logistic loss [5, 14, 21, 46]. Given a dataset $\{(a_i, b_i)\}_{i=1}^{n}$, where $a_i \in \mathbb{R}^d$ is the feature vector of the $i$-th sample and $b_i \in \{-1, 1\}$ the corresponding label, the minimax formulation is $\min_{x \in \mathbb{R}^d} \max_{y \in Y} f(x, y) \triangleq \frac{1}{n} \sum_{i=1}^{n} y_i l_i(x) - V(y) + g(x)$, where $l_i(x) = \log(1 + \exp(-b_i a_i^\top x))$, $g(x) = \lambda_2 \sum_{i=1}^{d} \frac{\alpha x_i^2}{1 + \alpha x_i^2}$ is the nonconvex regularizer of [5], $V(y) = \frac{\lambda_1}{2} \|ny - \mathbf{1}\|_2^2$, and $Y = \{y \in \mathbb{R}^n : 0 \le y_i \le 1, \sum_{i=1}^{n} y_i = 1\}$ is the simplex. Following the settings of Yan et al. [46] and Kohler and Lucchi [21], we let $\lambda_1 = 1/n^2$, $\lambda_2 = 10^{-3}$ and $\alpha = 10$ for the experiments. We compare the performance of SREDA with the baseline algorithms GDAmax, GDA, SGDA [25] and Minimax PPA [26] on six real-world datasets, a9a, w8a, gisette, mushrooms, sido0 and rcv1, whose details are listed in Table 2.
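For concreteness, below is a minimal NumPy sketch of the objective above, not the authors' MATLAB implementation; the function name dro_objective and the random example data are assumptions, while the default values λ1 = 1/n^2, λ2 = 10^-3 and α = 10 follow the quoted settings.

```python
import numpy as np

def dro_objective(x, y, A, b, lam1=None, lam2=1e-3, alpha=10.0):
    """f(x, y) = (1/n) * sum_i y_i * l_i(x) - V(y) + g(x), as in the quoted formulation."""
    n, d = A.shape
    lam1 = 1.0 / n**2 if lam1 is None else lam1
    # l_i(x) = log(1 + exp(-b_i * a_i^T x)): logistic loss of sample i
    losses = np.log1p(np.exp(-b * (A @ x)))
    # V(y) = (lam1 / 2) * ||n*y - 1||_2^2: penalty keeping the weights y near uniform
    V = 0.5 * lam1 * np.sum((n * y - 1.0) ** 2)
    # g(x) = lam2 * sum_i alpha * x_i^2 / (1 + alpha * x_i^2): nonconvex regularizer of [5]
    g = lam2 * np.sum(alpha * x**2 / (1.0 + alpha * x**2))
    return (y @ losses) / n - V + g

# Example usage on random data; y starts at the uniform point of the simplex Y.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 20))
b = rng.choice([-1.0, 1.0], size=100)
x0, y0 = np.zeros(20), np.full(100, 1.0 / 100)
print(dro_objective(x0, y0, A, b))
```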
Researcher Affiliation Academia Luo Luo^1, Haishan Ye^2, Zhichao Huang^1, Tong Zhang^1; ^1 Department of Mathematics, The Hong Kong University of Science and Technology; ^2 Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen
Pseudocode Yes Algorithm 1 SGDmax; Algorithm 2 SGDA; Algorithm 3 SREDA; Algorithm 4 Concave Maximizer; Algorithm 5 SREDA (Finite-sum Case)
Open Source Code No The paper does not provide an explicit statement or link to its open-source code.
Open Datasets Yes We compare the performance of SREDA with the baseline algorithms GDAmax, GDA, SGDA [25] and Minimax PPA [26] on six real-world datasets, a9a, w8a, gisette, mushrooms, sido0 and rcv1, whose details are listed in Table 2. The dataset sido0 comes from the Causality Workbench and the others can be downloaded from the LIBSVM repository.
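As an aside on how such LIBSVM-format files are typically read, a minimal Python sketch is given below; the paper's experiments use MATLAB, and the local file name "a9a" is an assumption about where the downloaded dataset is stored.

```python
# A minimal sketch, assuming the LIBSVM-format file "a9a" has already been downloaded
# from the LIBSVM repository; this is not the authors' code.
from sklearn.datasets import load_svmlight_file

X, y = load_svmlight_file("a9a")   # sparse feature matrix and labels
print(X.shape, y.shape)            # dimensions depend on the downloaded file
print(set(y[:10]))                 # labels are +/-1, matching b_i in the formulation
```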
Dataset Splits No The paper lists the datasets used but does not provide specific train/validation/test splits, percentages, or cross-validation details for reproducibility.
Hardware Specification Yes Our experiments are conducted on a workstation with Intel Xeon Gold 5120 CPU and 256GB memory.
Software Dependencies Yes We use MATLAB 2018a to run the code and the operating system is Ubuntu 18.04.4 LTS.
Experiment Setup Yes The parameters of the algorithms are chosen as follows: the stepsizes of all algorithms are tuned from {10^-3, 10^-2, 10^-1, 1} and the stepsize ratio is kept in {10, 10^2, 10^3}. For the stochastic algorithms SGDA and SREDA, the mini-batch size is chosen from {10, 100, 200}. For SREDA, we use the finite-sum version (Algorithm 5 with the first case of Theorem 2) and let q = m = n/S_2 heuristically. The initialization of SREDA is based on PSARAH with K_0 = 5, b = 1 and m = 20. For Minimax PPA, we tune the proximal parameter from {1, 10, 100} and the momentum parameter from {0.2, 0.5, 0.7}. Each inner loop of Minimax PPA runs Maximin-AG2 five times, and each run contains five AGD iterations.
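A hypothetical restatement of the quoted search ranges as a plain Python grid is sketched below; the paper provides no configuration file, and the grouping of parameters into grids is an assumption made only for illustration.

```python
from itertools import product

# Hypothetical restatement of the quoted search ranges; not the authors' configuration.
stepsizes = [1e-3, 1e-2, 1e-1, 1.0]   # tuned for all algorithms
stepsize_ratios = [10, 10**2, 10**3]  # stepsize ratio kept in this set
minibatch_sizes = [10, 100, 200]      # SGDA and SREDA only
ppa_proximal = [1, 10, 100]           # Minimax PPA proximal parameter
ppa_momentum = [0.2, 0.5, 0.7]        # Minimax PPA momentum parameter

# Grid for the stochastic algorithms (SGDA, SREDA): stepsize x ratio x mini-batch size.
sgda_grid = list(product(stepsizes, stepsize_ratios, minibatch_sizes))
print(len(sgda_grid), "stochastic-algorithm configurations")

# Grid for Minimax PPA: stepsize x proximal parameter x momentum parameter.
ppa_grid = list(product(stepsizes, ppa_proximal, ppa_momentum))
print(len(ppa_grid), "Minimax PPA configurations")
```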