Dualing GANs

Authors: Yujia Li, Alexander Schwing, Kuan-Chieh Wang, Richard Zemel

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we empirically study the proposed dual GAN algorithms. In particular, we show the stable and monotonic training for linear discriminators and study its properties. For nonlinear GANs we show good quality samples and compare it with standard GAN training methods. Overall the results show that our proposed approaches work across a range of problems and provide good alternatives to the standard GAN training method. We explore the dual GAN with linear discriminator on a synthetic 2D dataset generated by sampling points from a mixture of 5 2D Gaussians, as well as the MNIST [12] dataset.
Researcher Affiliation | Collaboration | Yujia Li¹, Alexander Schwing³, Kuan-Chieh Wang¹,², Richard Zemel¹,²; ¹Department of Computer Science, University of Toronto; ²Vector Institute; ³Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign; {yujiali, wangkua1, zemel}@cs.toronto.edu; aschwing@illinois.edu. Now at DeepMind.
Pseudocode | Yes | Algorithm 1 (GAN optimization with model function): Initialize θ, w_0, k = 0 and iterate: 1. One or a few gradient ascent steps on f(θ, w_k) w.r.t. generator parameters θ. 2. Find step s by solving min_s m_{k,θ}(s) s.t. ½‖s‖₂² ≤ Δ_k. 3. Update w_{k+1} ← w_k + s. 4. k ← k + 1. (A code sketch of this update loop appears after the table.)
Open Source Code | No | The paper does not provide any explicit statement about making the source code available or include a link to a code repository.
Open Datasets | Yes | We explore the dual GAN with linear discriminator on a synthetic 2D dataset generated by sampling points from a mixture of 5 2D Gaussians, as well as the MNIST [12] dataset. Next we assess the applicability of our proposed technique for non-linear discriminators, and focus on training models on MNIST and CIFAR-10 [11]. (An illustrative sketch of the 5-Gaussian toy data appears after the table.)
Dataset Splits | No | The paper mentions using mini-batches for optimization and discusses the minibatch size in Table 1, but it does not specify explicit training/validation/test dataset splits (e.g., percentages or counts) for reproducibility.
Hardware Specification | No | The paper does not specify any details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using 'Adam [9]' and 'Ipopt [20]' but does not provide specific version numbers for these or any other software dependencies, which is required for reproducibility.
Experiment Setup | Yes | The dual GAN formulation has a single hyper-parameter C, but we found the algorithm not to be sensitive to it, and set it to 0.0001 in all experiments. We used Adam [9] with fixed learning rate and momentum to optimize the generator. Table 1: Ranges of hyperparameters for sensitivity experiment. randint[a,b] means samples were drawn from uniformly distributed integers in the closed interval [a,b]; similarly rand[a,b] for real numbers. enr([a,b]) is shorthand for exp(-randint[a,b]), which was used for hyperparameters commonly explored in log-scale. For generator architectures, for the 5-Gaussians dataset we tried two 3-layer fully-connected networks, with 20 and 40 hidden units. For MNIST, we tried two 3-layer fully-connected networks, with 256 and 1024 hidden units, and a DCGAN-like architecture with and without batch normalization. (A sketch of this sampling shorthand appears after the table.)
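
The Algorithm 1 loop quoted in the Pseudocode row alternates generator gradient steps with a trust-region style update of the discriminator weights. Below is a minimal sketch of that control flow only, assuming toy stand-ins for the objective f, the local model m_{k,θ}, and the trust-region radius; the paper solves the constrained step subproblem with an actual solver (it mentions Ipopt), which is not reproduced here.

```python
import numpy as np

# Sketch of the quoted Algorithm 1: alternate generator gradient steps with a
# trust-region style step on the discriminator weights w. All functions are
# hypothetical placeholders, not the paper's GAN objective or model function.

def grad_f_theta(theta, w):
    """Placeholder gradient of f(theta, w) w.r.t. the generator parameters."""
    return -theta + 0.1 * w            # toy expression, not the GAN gradient

def model_step(theta, w, radius):
    """Approximate min_s m_{k,theta}(s) s.t. 0.5 * ||s||^2 <= radius (toy model)."""
    g = w - theta                       # toy gradient of the local model at w_k
    s = -g                              # unconstrained minimizer of the toy model
    max_norm = np.sqrt(2.0 * radius)    # feasible ball implied by 0.5*||s||^2 <= radius
    norm = np.linalg.norm(s)
    if norm > max_norm:                 # project the step back into the trust region
        s *= max_norm / norm
    return s

theta = np.zeros(4)                     # generator parameters
w = np.ones(4)                          # discriminator parameters
radius = 0.5                            # trust-region radius (Delta_k in the quote)
lr = 0.01

for k in range(100):
    # 1. one or a few gradient ascent steps on f(theta, w_k) w.r.t. theta
    theta += lr * grad_f_theta(theta, w)
    # 2. find step s from the local model under the trust-region constraint
    s = model_step(theta, w, radius)
    # 3.-4. update the discriminator weights and advance the iteration counter
    w = w + s
```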
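For the Open Datasets row, the quoted synthetic benchmark is a mixture of 5 2D Gaussians. The quoted text does not give component means or covariances, so the generator below is only an illustrative sketch with assumed values (components evenly spaced on a circle with small isotropic noise).

```python
import numpy as np

# Illustrative "mixture of 5 2D Gaussians" toy dataset. Means, covariances, and
# sample counts are assumptions; the paper's quoted text does not specify them.

def sample_five_gaussians(n, radius=2.0, std=0.1, seed=0):
    rng = np.random.default_rng(seed)
    angles = 2.0 * np.pi * np.arange(5) / 5.0
    means = radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (5, 2)
    components = rng.integers(0, 5, size=n)        # pick one component per sample
    return means[components] + std * rng.normal(size=(n, 2))

data = sample_five_gaussians(1000)
print(data.shape)  # (1000, 2)
```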
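The Experiment Setup row quotes the Table 1 caption describing how hyperparameters were drawn for the sensitivity experiment. The helpers below sketch that shorthand (randint[a,b], rand[a,b], and enr([a,b]) = exp(-randint[a,b])); the ranges and hyperparameter names in the example config are placeholders, not the paper's actual search space.

```python
import numpy as np

# Sketch of the sampling shorthand from the quoted Table 1 caption:
#   randint[a, b] -> uniform integer in the closed interval [a, b]
#   rand[a, b]    -> uniform real in [a, b]
#   enr([a, b])   -> exp(-randint[a, b]), i.e. log-scale sampling
rng = np.random.default_rng(0)

def randint(a, b):
    return int(rng.integers(a, b + 1))    # closed interval [a, b]

def rand(a, b):
    return float(rng.uniform(a, b))

def enr(a, b):
    return float(np.exp(-randint(a, b)))  # e.g. enr(1, 9) spans roughly e^-1 .. e^-9

# Placeholder ranges, for illustration only.
config = {
    "learning_rate": enr(1, 9),
    "minibatch_size": randint(16, 256),
    "adam_beta1": rand(0.5, 0.9),
}
print(config)
```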