Hybrid Variance-Reduced SGD Algorithms For Minimax Problems with Nonconvex-Linear Function

Authors: Quoc Tran-Dinh, Deyi Liu, Lam M. Nguyen

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the benefits of our algorithms over existing methods through two numerical examples, including a nonsmooth and nonconvex-non-strongly concave minimax model. We use two examples to illustrate our algorithm and compare it with existing methods. The performance of the three algorithms is shown in Figure 1 for three datasets using b := N/8 (8 blocks).
Researcher Affiliation | Collaboration | Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599. Emails: {quoctd@email.unc.edu, deyi.liu@live.unc.edu}. IBM Research, Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA. Email: lamnguyen.mltd@ibm.com
Pseudocode | Yes | Algorithm 1 (Smoothing Hybrid Variance-Reduced SGD Algorithm for solving (1))
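For intuition only, below is a minimal sketch of the hybrid variance-reduced gradient estimator that underlies Algorithm 1: a SARAH-style recursive correction mixed with a plain stochastic gradient, weighted by a parameter beta. The toy least-squares objective, the beta schedule, and the reuse of one mini-batch for both terms are our own simplifying assumptions, not the authors' released code.

```python
import numpy as np

# Sketch (not the authors' implementation): hybrid variance-reduced SGD on a
# toy objective f(x) = (1/N) * sum_i 0.5 * (a_i^T x - b_i)^2.
rng = np.random.default_rng(0)
N, d = 1000, 20
A, b = rng.standard_normal((N, d)), rng.standard_normal(N)

def stoch_grad(x, idx):
    """Mini-batch stochastic gradient at x over the sample indices idx."""
    Ai = A[idx]
    return Ai.T @ (Ai @ x - b[idx]) / len(idx)

x_prev = np.zeros(d)
x = np.zeros(d)
v = stoch_grad(x, rng.choice(N, size=N // 8, replace=False))  # initial estimate
eta = 0.05  # step size; the paper tunes it over a small grid
for t in range(1, 200):
    beta = 1.0 - 1.0 / (t + 1) ** (1.0 / 3.0)  # illustrative weight schedule
    idx = rng.choice(N, size=N // 8, replace=False)
    g_new, g_old = stoch_grad(x, idx), stoch_grad(x_prev, idx)
    # Hybrid estimator: recursive (SARAH-type) term mixed with an SGD term.
    v = beta * (v + g_new - g_old) + (1.0 - beta) * g_new
    x_prev, x = x, x - eta * v
```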
Open Source Code | No | The paper does not provide concrete access to source code (a specific repository link, an explicit code-release statement, or code in the supplementary materials) for the methodology described in this paper.
Open Datasets | Yes | We test it on three real-world portfolio datasets, which contain 29, 37, and 47 portfolios, respectively, from the Kenneth R. French Data Library [1]. We test them on three datasets from LIBSVM [6].
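As a pointer for reproduction (our assumption; the paper does not release loading scripts), LIBSVM-format files can be read in Python with scikit-learn. The file name below is a placeholder for whichever LIBSVM dataset is downloaded.

```python
# Hypothetical loading snippet (not from the paper): read a LIBSVM-format
# dataset with scikit-learn. "a9a" is a placeholder file name.
from sklearn.datasets import load_svmlight_file

X, y = load_svmlight_file("a9a")  # sparse feature matrix, labels in {-1, +1}
N, d = X.shape
print(f"{N} samples, {d} features")
```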
Dataset Splits | No | The paper uses datasets from the Kenneth R. French Data Library and LIBSVM but does not provide the dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) needed to reproduce the data partitioning.
Hardware Specification | Yes | Our code is implemented in Python 3.6.3, running on a Linux desktop (3.6GHz Intel Core i7 and 16Gb memory).
Software Dependencies | Yes | Our code is implemented in Python 3.6.3.
Experiment Setup | Yes | We set ρ := 0.2 and λ := 0.01 as in [44]... The step-size η of all algorithms is tuned from a set of trials {1, 0.5, 0.1, 0.05, 0.01, 0.001, 0.0001}. We set λ := 10^{-4} and update our γ_t parameter as γ_t := 1/(2(t+1)^{1/3}). The step-size η of all algorithms is tuned from {1, 0.5, 0.1, 0.05, 0.01, 0.001, 0.0001}...
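For concreteness, here is a minimal sketch of the reported hyperparameters: the step-size grid that was searched and the γ_t schedule. Only the grid and the formula come from the paper; the tuning loop and the `run_algorithm`/`metric` hooks are hypothetical.

```python
# Sketch of the reported setup: step-size grid and gamma_t = 1 / (2 * (t+1)^{1/3}).
STEP_SIZE_GRID = [1, 0.5, 0.1, 0.05, 0.01, 0.001, 0.0001]

def gamma_schedule(t: int) -> float:
    """gamma_t = 1 / (2 * (t + 1)^{1/3}), as stated in the experiment setup."""
    return 1.0 / (2.0 * (t + 1) ** (1.0 / 3.0))

def tune(run_algorithm, metric):
    """Pick the step size from the grid that minimizes the final metric (illustrative)."""
    results = {eta: metric(run_algorithm(eta, gamma_schedule)) for eta in STEP_SIZE_GRID}
    return min(results, key=results.get)
```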