Stabilizing Adversarial Nets with Prediction Methods

Authors: Abhay Yadav, Sohil Shah, Zheng Xu, David Jacobs, Tom Goldstein

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show, both in theory and practice, that the proposed method reliably converges to saddle points, and is stable with a wider range of training parameters than a non-prediction method. This makes adversarial networks less likely to collapse, and enables faster training with larger learning rates. We present a wide range of experiments to demonstrate the benefits of the proposed prediction step for adversarial nets. We consider a saddle point problem on a toy dataset constructed using MNIST images, and then move on to consider state-of-the-art models for three tasks: GANs, domain adaptation, and learning of fair classifiers. Additional results, and additional experiments involving mixtures of Gaussians, are presented in the Appendix.
Researcher Affiliation | Academia | Abhay Yadav, Sohil Shah, Zheng Xu, David Jacobs, & Tom Goldstein. University of Maryland, College Park, MD 20740, USA. {jaiabhay, xuzh, tomg}@cs.umd.edu, sohilas@umd.edu, djacobs@umiacs.umd.edu
Pseudocode | Yes | Prediction Method:
u_{k+1} = u_k − α_k ∇_u L(u_k, v_k)   (gradient descent in u, starting at (u_k, v_k))
ū_{k+1} = u_{k+1} + (u_{k+1} − u_k)   (predict future value of u)
v_{k+1} = v_k + β_k ∇_v L(ū_{k+1}, v_k)   (gradient ascent in v, starting at (ū_{k+1}, v_k))
(A runnable sketch of these updates appears after the table.)
Open Source Code | Yes | The code is available at https://github.com/jaiabhayk/stableGAN.
Open Datasets | Yes | We consider a saddle point problem on a toy dataset constructed using MNIST images, and then move on to consider state-of-the-art models for three tasks: GANs, domain adaptation, and learning of fair classifiers. ... CIFAR10 (Krizhevsky, 2009) ... OFFICE dataset (Saenko et al., 2010) ... The Adult dataset from the UCI machine learning repository is used.
Dataset Splits | Yes | We randomly split data into 35,000 samples for training, 5000 for validation and 5000 for testing. (A split sketch appears after the table.)
Hardware Specification | Yes | Using a single Titan X Pascal, a training epoch of DCGAN takes 35 secs.
Software Dependencies | No | The paper mentions software like "Caffe (Jia et al., 2014)", "Adam optimizer", and "RMSProp", but it does not specify version numbers for these or any other key software components used in the experiments.
Experiment Setup | Yes | All the approaches were trained for five random seeds and 100 epochs each. ... using the default solver for DCGAN (the Adam optimizer) with learning rate = 0.0002 and β1 = 0.5. ... trained with a 5× higher learning rate (0.001) (the default for the Adam solver). ... Adam solver with its default parameters (i.e., learning rate = 0.001, β1 = 0.9, β2 = 0.999) and with an input batch size of 512. ... two different input batch sizes, a small (64) and a large batch (6144) setting. (An optimizer-configuration sketch appears after the table.)
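
To make the prediction step quoted in the Pseudocode row concrete, here is a minimal runnable sketch. It is not the authors' released code; it applies the three updates to the toy bilinear objective L(u, v) = u·v (whose saddle point is at the origin), and the step sizes and iteration count are illustrative assumptions.

```python
# Minimal sketch of the prediction method on L(u, v) = u * v (saddle point at (0, 0)).
# Step sizes alpha, beta and the iteration count are illustrative choices.

def grad_u(u, v):
    return v                                  # dL/du for L = u * v

def grad_v(u, v):
    return u                                  # dL/dv for L = u * v

def prediction_step(u, v, alpha=0.5, beta=0.5):
    u_next = u - alpha * grad_u(u, v)         # gradient descent in u, starting at (u_k, v_k)
    u_bar = u_next + (u_next - u)             # predict the future value of u
    v_next = v + beta * grad_v(u_bar, v)      # gradient ascent in v, starting at (u_bar, v_k)
    return u_next, v_next

u, v = 1.0, 1.0
for _ in range(200):
    u, v = prediction_step(u, v)
print(u, v)  # both tend to 0 here; the plain simultaneous descent/ascent iteration spirals outward
```

The extrapolated ū_{k+1} lets the v update anticipate where u is heading, which is the stabilizing mechanism the paper analyzes.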
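
The 35,000 / 5,000 / 5,000 split reported in the Dataset Splits row could be reproduced roughly as follows; this is a hedged sketch in which the random seed and index-based splitting are assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)       # fixed seed is an assumption, for reproducibility
idx = rng.permutation(45_000)        # 45,000 samples in total
train_idx = idx[:35_000]             # 35,000 for training
val_idx = idx[35_000:40_000]         # 5,000 for validation
test_idx = idx[40_000:]              # 5,000 for testing
```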
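
The optimizer settings quoted in the Experiment Setup row might be configured as in the sketch below (PyTorch is assumed here; the paper itself mentions Caffe). `generator` and `discriminator` are hypothetical placeholder modules, not the DCGAN architectures used in the paper.

```python
import torch

generator = torch.nn.Linear(100, 784)         # placeholder standing in for the DCGAN generator
discriminator = torch.nn.Linear(784, 1)       # placeholder standing in for the DCGAN discriminator

# Default DCGAN setting reported: Adam with learning rate 0.0002 and beta1 = 0.5
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))

# "5x higher" learning-rate setting (0.001, the Adam default) used to stress-test stability
opt_g_fast = torch.optim.Adam(generator.parameters(), lr=1e-3, betas=(0.5, 0.999))

# Adam with its default parameters (lr = 0.001, beta1 = 0.9, beta2 = 0.999), batch size 512
opt_default = torch.optim.Adam(generator.parameters(), lr=1e-3, betas=(0.9, 0.999))
batch_size = 512
```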