Minimax Optimization with Smooth Algorithmic Adversaries

Authors: Tanner Fiez, Chi Jin, Praneeth Netrapalli, Lillian J. Ratliff

ICLR 2022

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | This section presents empirical results evaluating our SGD algorithm (Algorithm 1) for generative adversarial networks (Goodfellow et al., 2014) and adversarial training (Madry et al., 2018). Our results demonstrate that our framework results in stable monotonic improvement during training and converges to desirable solutions in both GAN and adversarial training problems.

Researcher Affiliation | Collaboration | Tanner Fiez, Lillian J. Ratliff (University of Washington, Seattle; {fiezt, ratliffl}@uw.edu); Chi Jin (Princeton University; chij@princeton.edu); Praneeth Netrapalli (Google Research, India; pnetrapalli@google.com)

Pseudocode | Yes | Algorithm 1: Stochastic subgradient descent (SGD)

Open Source Code | Yes | The code for the experiments is included in the supplementary material with instructions on how to run.

Open Datasets | Yes | We run an adversarial training experiment with the MNIST dataset.

Dataset Splits | No | The paper mentions using a 'training set' and evaluating 'test classification accuracy' but does not provide specific percentages or sample counts for training, validation, or test splits. It refers to 'standard adversarial training' but does not specify the splits used.

Hardware Specification | Yes | For the experiments with neural network models we used two Nvidia GeForce GTX 1080 Ti GPUs and the PyTorch higher library (Deleu et al., 2019) to compute f(θ, A(θ)).

Software Dependencies | No | The paper mentions the 'PyTorch higher library (Deleu et al., 2019)' but does not provide specific version numbers for PyTorch or the higher library.

Experiment Setup | Yes | The learning rates for both the generator and the discriminator are η = 0.01. The minimization procedure has a fixed learning rate of η1 = 0.0001 and the maximization procedure runs for T = 10 steps with a fixed learning rate of η2 = 4. We compare Algorithm 1 with usual adversarial training (Madry et al., 2018), which descends ∇θ f(θ, A(θ)) instead of ∇f(θ, A(θ)), and a baseline of standard training without adversarial training. For each algorithm, we train for 100 passes over the training set using a batch size of 50.
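The Pseudocode and Experiment Setup rows describe the core of Algorithm 1: descend f(θ, A(θ)), where A(θ) is T steps of gradient ascent on the inner variable, differentiating through those ascent steps (the role the higher library plays in the paper's experiments). A minimal sketch of that idea in plain PyTorch autograd, not the authors' code, is below; it reuses the quoted step sizes η1 = 0.0001, η2 = 4, T = 10, but the toy objective f and the concavity constant c are assumptions chosen so the inner ascent is stable at η2 = 4.

```python
# Minimal sketch (not the authors' code) of Algorithm 1's core loop:
# descend f(theta, A(theta)), where A(theta) is T steps of gradient
# ascent on the inner variable, differentiating THROUGH those steps.
import torch

eta1, eta2, T = 1e-4, 4.0, 10  # step sizes and inner steps quoted in the setup row
c = 0.25                       # toy concavity constant (assumption): keeps the
                               # inner ascent stable at eta2 = 4, since |1 - eta2*c| < 1

def f(theta, alpha):
    # Placeholder smooth objective, strongly concave in alpha (assumption);
    # stands in for the GAN / adversarial-training loss.
    return (theta * alpha).sum() - 0.5 * c * (alpha ** 2).sum()

theta = torch.tensor([1.0, -1.0], requires_grad=True)
theta0 = theta.detach().clone()

for _ in range(100):
    # A(theta): T gradient-ascent steps kept on the autograd graph
    alpha = torch.zeros_like(theta).requires_grad_()
    for _ in range(T):
        g = torch.autograd.grad(f(theta, alpha), alpha, create_graph=True)[0]
        alpha = alpha + eta2 * g
    # Total derivative of f(theta, A(theta)) w.r.t. theta: the gradient
    # flows back through the unrolled ascent steps as well.
    grad_theta = torch.autograd.grad(f(theta, alpha), theta)[0]
    with torch.no_grad():
        theta -= eta1 * grad_theta
```

The "usual adversarial training" baseline in the comparison would instead take ∇θ f(θ, A(θ)) with A(θ) held fixed, i.e. compute the outer gradient against a detached `alpha`.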