Unrolled Generative Adversarial Networks
Authors: Luke Metz, Ben Poole, David Pfau, Jascha Sohl-Dickstein
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we demonstrate improved mode coverage and stability by applying this technique to five datasets of increasing complexity. |
| Researcher Affiliation | Collaboration | Luke Metz, Google Brain, lmetz@google.com; Ben Poole, Stanford University, poole@cs.stanford.edu; David Pfau, Google DeepMind, pfau@google.com; Jascha Sohl-Dickstein, Google Brain, jaschasd@google.com |
| Pseudocode | No | The paper references 'Algorithm 2 in (Maclaurin et al., 2015)' for a clear description of differentiation through gradient descent, but it does not include its own pseudocode or algorithm block. (A hedged sketch of the unrolling computation appears after this table.) |
| Open Source Code | Yes | We provide a reference implementation of this technique at github.com/poolio/unrolled_gan. |
| Open Datasets | Yes | To illustrate the impact of discriminator unrolling, we train a simple GAN architecture on a 2D mixture of 8 Gaussians arranged in a circle. [...] In this experiment we try to generate MNIST samples using an LSTM [...] Here we test our technique on a more traditional convolutional GAN architecture and task, similar to those used in (Radford et al., 2015; Salimans et al., 2016). In the previous experiments we tested models where the standard GAN training algorithm would not converge. In this section we improve a standard model by reducing its tendency to engage in mode collapse. We ran 4 configurations of this model, varying the number of unrolling steps to be 0, 1, 5, or 10. Each configuration was run 5 times with different random seeds. For full training details see Appendix D. (A hedged sampler for the 8-Gaussian ring data appears after this table.) |
| Dataset Splits | No | The paper mentions training and testing on datasets like MNIST and CIFAR10, but it does not specify explicit validation splits (e.g., percentages or counts for training, validation, and test sets) or the use of cross-validation. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or cloud instance specifications). |
| Software Dependencies | No | The paper mentions optimizers like Adam and initialization methods like Xavier, but does not provide specific version numbers for any software dependencies or libraries used for the implementation. |
| Experiment Setup | Yes | For a detailed list of architecture and hyperparameters see Appendix A. [...] The generator network consists of a fully connected network with 2 hidden layers of size 128 with relu activations, followed by a linear projection to 2 dimensions. All weights are initialized to be orthogonal with scaling of 0.8. The discriminator network first scales its input down by a factor of 4 (to roughly scale to (-1, 1)), followed by a 1-layer fully connected network with relu activations and a linear layer of size 1 that acts as the logit. [...] Both networks are optimized using Adam (Kingma & Ba, 2014) with a learning rate of 1e-4 and β1 = 0.5. The network is trained by alternating updates of the generator and the discriminator; one step consists of either a G or a D update. (Appendix A) A hedged sketch of this setup appears after the table. |
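
Because the paper defers its algorithmic details to Algorithm 2 of Maclaurin et al. (2015) rather than giving its own pseudocode (see the Pseudocode row above), a minimal sketch of the unrolling computation may be useful. The sketch below assumes PyTorch and uses plain SGD for the surrogate discriminator updates (the paper's experiments use Adam); the names `unrolled_generator_loss`, `g_loss_fn`, and `d_loss_fn` are hypothetical placeholders, not identifiers from the paper or its reference implementation.

```python
import torch
from torch.func import functional_call

def unrolled_generator_loss(G, D, g_loss_fn, d_loss_fn, real_batch, noise,
                            unroll_steps=5, d_lr=1e-4):
    """Sketch: generator loss measured against a discriminator that has been
    unrolled `unroll_steps` gradient steps into the future."""
    # Start from the discriminator's current parameters.
    d_params = dict(D.named_parameters())

    for _ in range(unroll_steps):
        fake = G(noise)
        d_loss = d_loss_fn(
            functional_call(D, d_params, (real_batch,)),  # D(real)
            functional_call(D, d_params, (fake,)),        # D(G(z))
        )
        # create_graph=True keeps these surrogate updates differentiable,
        # so gradients w.r.t. the generator can flow through them.
        grads = torch.autograd.grad(d_loss, list(d_params.values()),
                                    create_graph=True)
        # Plain SGD surrogate update for simplicity; the paper's experiments use Adam.
        d_params = {name: p - d_lr * g
                    for (name, p), g in zip(d_params.items(), grads)}

    # Generator loss against the unrolled (surrogate) discriminator.
    return g_loss_fn(functional_call(D, d_params, (G(noise),)))
```

The generator is then updated by backpropagating this unrolled loss, while the real discriminator is updated with the ordinary (0-step) loss, matching the alternating G/D updates quoted in the Experiment Setup row.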
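
For the first experiment quoted in the Open Datasets row, the data itself is easy to regenerate. The sampler below is a hedged reconstruction: the 8 modes arranged in a circle match the paper's description, but the ring radius and per-mode standard deviation are illustrative values, not figures taken from the paper.

```python
import numpy as np
import torch

def sample_ring_of_gaussians(batch_size, n_modes=8, radius=2.0, std=0.02):
    """2D mixture of 8 Gaussians arranged in a circle.

    radius and std are assumed values for illustration; consult the paper's
    appendix for the exact mixture parameters.
    """
    ks = np.random.randint(0, n_modes, size=batch_size)
    angles = 2.0 * np.pi * ks / n_modes
    centers = np.stack([radius * np.cos(angles), radius * np.sin(angles)], axis=1)
    samples = centers + std * np.random.randn(batch_size, 2)
    return torch.as_tensor(samples, dtype=torch.float32)
```

Mode collapse on this toy task is easy to visualize by scatter-plotting generator samples against a batch from `sample_ring_of_gaussians`.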
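
The Experiment Setup row quotes the toy-task architecture from Appendix A; the sketch below is one plausible reading of that description, again in PyTorch. The noise dimension and the discriminator's hidden width are not stated in the quoted excerpt, so `noise_dim = 256` and the hidden width of 128 are assumptions.

```python
import torch
import torch.nn as nn

noise_dim = 256  # assumption: not stated in the quoted excerpt

# Generator: 2 relu hidden layers of size 128, then a linear projection to 2D.
G = nn.Sequential(
    nn.Linear(noise_dim, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 2),
)

# Discriminator: scale input down by 4, one relu layer, linear layer to a single logit.
class Discriminator(nn.Module):
    def __init__(self, hidden=128):  # hidden width is an assumption
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, x):
        return self.net(x / 4.0)

D = Discriminator()

# "All weights are initialized to be orthogonal with scaling of 0.8."
def init_orthogonal(module, gain=0.8):
    if isinstance(module, nn.Linear):
        nn.init.orthogonal_(module.weight, gain=gain)
        nn.init.zeros_(module.bias)

G.apply(init_orthogonal)
D.apply(init_orthogonal)

# Adam with learning rate 1e-4 and beta1 = 0.5, as quoted above.
g_opt = torch.optim.Adam(G.parameters(), lr=1e-4, betas=(0.5, 0.999))
d_opt = torch.optim.Adam(D.parameters(), lr=1e-4, betas=(0.5, 0.999))
```

Training would then alternate single D and G steps, with the G step differentiating through the unrolled surrogate updates shown in the first sketch.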