On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems
Authors: Tianyi Lin, Chi Jin, Michael Jordan
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present several empirical results to show that two-time-scale GDA outperforms GDmax. The task is to train the empirical Wasserstein robustness model (WRM) (Sinha et al., 2018) over a collection of data samples {ξ_i}_{i=1}^N with ℓ2-norm attack and a penalty parameter γ > 0. |
| Researcher Affiliation | Academia | (1) Department of Industrial Engineering and Operations Research, UC Berkeley; (2) Department of Electrical Engineering, Princeton University; (3) Department of Statistics and Electrical Engineering and Computer Science, UC Berkeley. |
| Pseudocode | Yes | Algorithm 1 Two-Time-Scale GDA |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the methodology described in this paper is publicly available. |
| Open Datasets | Yes | We mainly follow the setting of Sinha et al. (2018) and consider training a neural network classifier on three datasets¹: MNIST, Fashion-MNIST, and CIFAR-10, with the default cross validation. (¹ https://keras.io/datasets/) |
| Dataset Splits | No | While the paper mentions "with the default cross validation", it does not provide specific details on the dataset splits (e.g., exact percentages or sample counts for training, validation, and test sets). |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies or their version numbers, such as programming language versions, deep learning framework versions, or library versions used for the experiments. |
| Experiment Setup | Yes | Small and large adversarial perturbations are set with γ ∈ {0.4, 1.3}, the same as in Sinha et al. (2018). The baseline approach is denoted GDmax, in which ηx = ηy = 10⁻³ and each inner loop contains 20 gradient ascent steps. Two-time-scale GDA is denoted GDA, in which ηx = 5×10⁻⁵ and ηy = 10⁻³. (A minimal sketch of this update rule with these step sizes appears below the table.) |
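
For context on the Research Type row: the WRM task quoted there is, in the penalized form of Sinha et al. (2018), a nonconvex-concave minimax problem over the training samples {ξ_i}_{i=1}^N. A paraphrased version of that objective (not the paper's exact display; ℓ(θ; ·) denotes the classifier's loss) is:

```latex
\min_{\theta}\ \frac{1}{N}\sum_{i=1}^{N}\ \max_{\xi_i'}\ \Big\{ \ell(\theta;\,\xi_i') \;-\; \gamma\,\lVert \xi_i' - \xi_i \rVert_2^2 \Big\}
```

The outer minimization over the network parameters θ is nonconvex, while for γ sufficiently large relative to the smoothness of ℓ in ξ' the inner maximization over each adversarial sample ξ_i' is strongly concave, which is the structure the paper's GDA analysis targets.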
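The Pseudocode and Experiment Setup rows name Algorithm 1 (two-time-scale GDA): simultaneous gradient descent on the minimization variable and gradient ascent on the maximization variable, with a much smaller descent step size. Below is a minimal NumPy sketch of that update rule on a toy nonconvex-concave objective; the objective, dimensions, and iteration count are illustrative assumptions, and only the step sizes ηx = 5×10⁻⁵ and ηy = 10⁻³ are taken from the Experiment Setup row. It is not a reproduction of the paper's WRM experiments.

```python
import numpy as np

# Toy nonconvex-(strongly-)concave objective, chosen only for illustration:
#   f(x, y) = sum(cos(x)) + y^T (A x - b) - (gamma / 2) * ||y||^2
# Nonconvex in x (cosine term), strongly concave in y (quadratic penalty).
rng = np.random.default_rng(0)
d_x, d_y = 5, 3
A = rng.standard_normal((d_y, d_x))
b = rng.standard_normal(d_y)
gamma = 1.0

def grad_x(x, y):
    # d f / d x = -sin(x) + A^T y
    return -np.sin(x) + A.T @ y

def grad_y(x, y):
    # d f / d y = A x - b - gamma * y
    return A @ x - b - gamma * y

# Two-time-scale GDA: one simultaneous descent/ascent step per iteration,
# with eta_x much smaller than eta_y (values from the Experiment Setup row).
eta_x, eta_y = 5e-5, 1e-3
x = rng.standard_normal(d_x)
y = np.zeros(d_y)

for _ in range(50_000):
    gx, gy = grad_x(x, y), grad_y(x, y)      # gradients evaluated at the same point
    x, y = x - eta_x * gx, y + eta_y * gy    # simultaneous descent/ascent update

print("||grad_x|| =", np.linalg.norm(grad_x(x, y)))
print("||grad_y|| =", np.linalg.norm(grad_y(x, y)))
```

For contrast, the GDmax baseline described in the Experiment Setup row would replace the single ascent step with an inner loop of 20 gradient ascent steps on y (at ηy = 10⁻³) before each descent step on x.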