Shuffling Gradient-Based Methods for Nonconvex-Concave Minimax Optimization

Authors: Quoc Tran-Dinh, Trang H. Tran, Lam M. Nguyen

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform some experiments to illustrate Algorithm 1 and compare it with two existing and related algorithms. ... We test these algorithms on two datasets from LIBSVM [6]. ... The results are shown in Figure 1 for w8a and rcv1 datasets using $k_b = 32$ blocks.
Researcher Affiliation | Collaboration | Quoc Tran-Dinh, Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, quoctd@email.unc.edu; Trang H. Tran, School of OR and Information Engineering, Cornell University, Ithaca, NY, htt27@cornell.edu; Lam M. Nguyen, IBM Research, Thomas J. Watson Research Center, Yorktown Heights, NY, LamNguyen.MLTD@ibm.com
Pseudocode | Yes | Algorithm 1 (Shuffling Proximal Gradient-Based Algorithm for Solving (10))
Open Source Code | Yes | Our data is available online from LIBSVM. The code is implemented in Python. The code for all experiments is also provided with instructions.
Open Datasets | Yes | We test these algorithms on two datasets from LIBSVM [6].
Dataset Splits | No | The paper states that full details are in Supp. Doc. D, but the main text does not specify exact training/validation/test dataset splits (e.g., percentages or counts).
Hardware Specification | Yes | Our experiments were run on a MacBook Pro (2.8 GHz quad-core Intel Core i7, 16 GB memory), as specified at the beginning of Supp. Doc. D.
Software Dependencies | No | The paper states that "The code is implemented in Python." but does not specify version numbers for Python or for any libraries or solvers used.
Experiment Setup | Yes | We set $\lambda := 10^{-4}$ and update the smoothing parameter $\gamma_t$ as $\gamma_t := \frac{1}{2(t+1)^{1/3}}$. The learning rate for all algorithms is finely tuned from {100, 50, 10, 5, 1, 0.5, 0.1, 0.05, 0.01, 0.001, 0.0001}, and the results are shown in Figure 1 for w8a and rcv1 datasets using $k_b = 32$ blocks.
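The quantities quoted in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the seed, the sample count, and the choice to start the epoch counter t at 0 are all assumptions, and the block partition is only one plausible way to split shuffled indices into 32 blocks.

```python
import random

# Learning-rate candidates quoted in the Experiment Setup row.
LEARNING_RATES = [100, 50, 10, 5, 1, 0.5, 0.1, 0.05, 0.01, 0.001, 0.0001]
NUM_BLOCKS = 32  # the number of blocks (k_b) quoted in the summary


def smoothing_parameter(t: int) -> float:
    """Return gamma_t = 1 / (2 * (t + 1)^(1/3)).

    Assumption: the epoch index t starts at 0.
    """
    return 1.0 / (2.0 * (t + 1) ** (1.0 / 3.0))


def shuffled_blocks(n_samples: int, num_blocks: int, seed: int) -> list[list[int]]:
    """Hypothetical helper: permute sample indices, then split them into blocks."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Stride slicing yields blocks whose sizes differ by at most one.
    return [indices[b::num_blocks] for b in range(num_blocks)]


if __name__ == "__main__":
    for t in range(3):
        print(t, smoothing_parameter(t))
    blocks = shuffled_blocks(n_samples=1000, num_blocks=NUM_BLOCKS, seed=0)
    print(len(blocks), sum(len(b) for b in blocks))
```

Under these assumptions the schedule decays slowly, at rate (t+1)^(-1/3), and a learning-rate search would loop over LEARNING_RATES and keep the best-performing value for each algorithm.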