Shuffling Gradient-Based Methods for Nonconvex-Concave Minimax Optimization
Authors: Quoc Tran-Dinh, Trang H. Tran, Lam M. Nguyen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform some experiments to illustrate Algorithm 1 and compare it with two existing and related algorithms. ... We test these algorithms on two datasets from LIBSVM [6]. ... The results are shown in Figure 1 for w8a and rcv1 datasets using kb = 32 blocks. |
| Researcher Affiliation | Collaboration | Quoc Tran-Dinh, Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill (quoctd@email.unc.edu); Trang H. Tran, School of OR and Information Engineering, Cornell University, Ithaca, NY (htt27@cornell.edu); Lam M. Nguyen, IBM Research, Thomas J. Watson Research Center, Yorktown Heights, NY (LamNguyen.MLTD@ibm.com) |
| Pseudocode | Yes | Algorithm 1 (Shuffling Proximal Gradient-Based Algorithm for Solving (10)) |
| Open Source Code | Yes | Our data is available online from LIBSVM. The code is implemented in Python. The code for all experiments is also provided with instruction. |
| Open Datasets | Yes | We test these algorithms on two datasets from LIBSVM [6]. |
| Dataset Splits | No | The paper states that full details are in Supp. Doc. D, but the main text does not specify exact training/validation/test dataset splits (e.g., percentages or counts). |
| Hardware Specification | Yes | Our experiments were run on a MacBook Pro (2.8GHz Quad-Core Intel Core i7, 16GB memory), specified at the beginning of Supp. Doc. D. |
| Software Dependencies | No | The paper states that “The code is implemented in Python,” but does not specify version numbers for Python or for any specific libraries or solvers used. |
| Experiment Setup | Yes | We set λ := 10^{-4} and update the smoothing parameter γ_t as γ_t := 1/(2(t+1)^{1/3}). The learning rate for all algorithms is finely tuned from {100, 50, 10, 5, 1, 0.5, 0.1, 0.05, 0.01, 0.001, 0.0001}, and the results are shown in Figure 1 for w8a and rcv1 datasets using kb = 32 blocks. |
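The hyperparameters quoted in the Experiment Setup row can be sketched as a short Python snippet. This is an illustrative transcription of the reported values only, not the paper's released code; the names `gamma`, `LAMBDA`, `LEARNING_RATE_GRID`, and `NUM_BLOCKS` are ours, and the sign of the exponent in λ := 10^{-4} is an assumption based on its role as a regularization parameter.

```python
def gamma(t: int) -> float:
    """Smoothing-parameter schedule gamma_t := 1 / (2 * (t + 1)^(1/3)),
    as quoted in the Experiment Setup row."""
    return 1.0 / (2.0 * (t + 1) ** (1.0 / 3.0))

# Assumed regularization parameter lambda := 10^{-4} (superscript sign
# inferred; the extracted text reads "10 4").
LAMBDA = 1e-4

# Learning-rate grid the paper reports tuning over for all algorithms.
LEARNING_RATE_GRID = [100, 50, 10, 5, 1, 0.5, 0.1, 0.05, 0.01, 0.001, 0.0001]

# Number of blocks used for the w8a and rcv1 runs shown in Figure 1.
NUM_BLOCKS = 32
```

For example, the schedule starts at γ_0 = 1/2 and decays slowly: γ_7 = 1/(2·8^{1/3}) = 1/4.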