A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems

Authors: Jiawei Zhang, Peijun Xiao, Ruoyu Sun, Zhi-Quan Luo

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We illustrate the practical efficiency of the stabilized GDA algorithm on robust training." (Section 5: Numerical Results on Robust Neural Network Training; Table 2: test accuracies under FGSM and PGD attacks; Figure 1: convergence speed of Smoothed-GDA and the algorithm in [20].)
Researcher Affiliation | Academia | Jiawei Zhang (216019001@link.cuhk.edu.cn), Peijun Xiao (peijunx2@illinois.edu), Ruoyu Sun (ruoyus@illinois.edu), Zhi-Quan Luo (luozq@cuhk.edu.cn). Shenzhen Research Institute of Big Data, School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China; Coordinated Science Laboratory, Department of ISE, University of Illinois at Urbana-Champaign, Urbana, IL.
Pseudocode | Yes | The paper provides pseudocode for three algorithms, reproduced below (a runnable sketch of Algorithm 2 is given after this table).

Algorithm 1 (GDA)
1: Initialize x^0, y^0;
2: Choose c, α > 0;
3: for t = 0, 1, 2, ... do
4:   x^{t+1} = P_X(x^t − c ∇_x f(x^t, y^t));
5:   y^{t+1} = P_Y(y^t + α ∇_y f(x^{t+1}, y^t));
6: end for

Algorithm 2 (Smoothed-GDA)
1: Initialize x^0, z^0, y^0 and 0 < β ≤ 1.
2: for t = 0, 1, 2, ... do
3:   x^{t+1} = P_X(x^t − c ∇_x K(x^t, z^t; y^t));
4:   y^{t+1} = P_Y(y^t + α ∇_y K(x^{t+1}, z^t; y^t));
5:   z^{t+1} = z^t + β(x^{t+1} − z^t);
6: end for

Algorithm 3 (Smoothed Block Gradient Descent-Ascent, Smoothed-BGDA)
1: Initialize x^0, z^0, y^0;
2: for t = 0, 1, 2, ... do
3:   for i = 1, 2, ..., N do
4:     x_i^{t+1} = P_{X_i}(x_i^t − c ∇_{x_i} K(x_1^{t+1}, ..., x_{i−1}^{t+1}, x_i^t, ..., x_N^t, z^t; y^t));
5:   end for
6:   y^{t+1} = P_Y(y^t + α ∇_y K(x^{t+1}, z^t; y^t));
7:   z^{t+1} = z^t + β(x^{t+1} − z^t), where 0 < β ≤ 1;
8: end for
Open Source Code | No | No explicit statement regarding the release of source code for the described methodology, or a link to a code repository, was found.
Open Datasets | Yes | "In this section, we apply the Smoothed-GDA algorithm to train a robust neural network on MNIST data set against adversarial attacks [3, 31, 32]."
Dataset Splits | No | The paper does not explicitly state the specific training, validation, and test splits used for the MNIST dataset. It refers to standard adversarial training but does not provide details on data partitioning.
Hardware Specification | No | No specific hardware details (e.g., GPU model, CPU model, memory, or cloud instance types) used for running the experiments were mentioned.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9) were mentioned in the paper.
Experiment Setup | No | The paper states, "The details of this formulation and the structure of the network in experiments are provided in the appendix." However, the main text itself does not provide specific hyperparameters or system-level training settings.
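
For reference, here is a minimal NumPy sketch of the Smoothed-GDA updates (Algorithm 2 above). It assumes the smoothed auxiliary function K(x, z; y) = f(x, y) + (p/2)·||x − z||^2, in line with the paper's smoothing idea; the toy bilinear objective, projections, step sizes (c, alpha, beta, p), and iteration count below are illustrative choices, not the paper's experimental settings.

import numpy as np

def smoothed_gda(grad_x_f, grad_y_f, proj_X, proj_Y, x0, y0,
                 c=0.05, alpha=0.05, beta=0.1, p=1.0, iters=5000):
    """Sketch of Smoothed-GDA (Algorithm 2).

    Assumes K(x, z; y) = f(x, y) + (p/2) * ||x - z||^2 as the smoothed
    surrogate; all step sizes here are illustrative, not tuned values.
    """
    x, y, z = x0.copy(), y0.copy(), x0.copy()
    for _ in range(iters):
        # Primal (descent) step on the smoothed function K.
        x = proj_X(x - c * (grad_x_f(x, y) + p * (x - z)))
        # Dual (ascent) step; K and f share the same gradient in y.
        y = proj_Y(y + alpha * grad_y_f(x, y))
        # Slow averaging of the auxiliary variable z toward x.
        z = z + beta * (x - z)
    return x, y

# Toy convex-concave instance (illustration only):
# f(x, y) = y^T (A x - b) - 0.5 * ||y||^2, with X = R^2 and Y the unit ball.
A = np.array([[2.0, 1.0], [0.5, 1.5]])
b = np.array([1.0, -0.5])
x_hat, y_hat = smoothed_gda(
    grad_x_f=lambda x, y: A.T @ y,
    grad_y_f=lambda x, y: A @ x - b - y,
    proj_X=lambda x: x,                                # unconstrained x
    proj_Y=lambda y: y / max(1.0, np.linalg.norm(y)),  # project onto unit ball
    x0=np.zeros(2), y0=np.zeros(2),
)
print("approximate saddle point:", x_hat, y_hat)  # x_hat should approach A^{-1} b

On this toy instance the saddle point is x* = A^{-1} b with y* = 0, so the printed x_hat should drift toward roughly (0.8, -0.6); the sketch is meant only to show the three-step structure (x-descent on K, y-ascent, slow z-averaging), not to reproduce the paper's robust-training experiments.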