Improved Techniques for Training GANs
Authors: Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using our new techniques, we achieve state-of-the-art results in semi-supervised classification on MNIST, CIFAR-10 and SVHN. The generated images are of high quality as confirmed by a visual Turing test: our model generates MNIST samples that humans cannot distinguish from real data, and CIFAR-10 samples that yield a human error rate of 21.3%. We also present ImageNet samples with unprecedented resolution and show that our methods enable the model to learn recognizable features of ImageNet classes. |
| Researcher Affiliation | Industry | Tim Salimans (tim@openai.com), Ian Goodfellow (ian@openai.com), Wojciech Zaremba (woj@openai.com), Vicki Cheung (vicki@openai.com), Alec Radford (alec@openai.com), Xi Chen (peter@openai.com) |
| Pseudocode | No | The paper describes methods and processes verbally and with diagrams, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | All code and hyperparameters may be found at https://github.com/openai/improved-gan. |
| Open Datasets | Yes | We performed semi-supervised experiments on MNIST, CIFAR-10 and SVHN, and sample generation experiments on MNIST, CIFAR-10, SVHN and ImageNet. |
| Dataset Splits | Yes | The MNIST dataset contains 60,000 labeled images of digits. We perform semi-supervised training with a small randomly picked fraction of these, considering setups with 20, 50, 100, and 200 labeled examples. Results are averaged over 10 random subsets of labeled data, each chosen to have a balanced number of examples from each class. The remaining training images are provided without labels. (See the sampling sketch below the table.) |
| Hardware Specification | No | We extensively modified a publicly available implementation of DCGANs using TensorFlow [28] to achieve high performance, using a multi-GPU implementation. However, no specific GPU model or other hardware specs are given. |
| Software Dependencies | No | We extensively modified a publicly available implementation of DCGANs using TensorFlow [28] to achieve high performance, using a multi-GPU implementation. No version numbers for TensorFlow or any other specific library/solver are provided. |
| Experiment Setup | Yes | MNIST: "Our networks have 5 hidden layers each. We use weight normalization [21] and add Gaussian noise to the output of each layer of the discriminator." CIFAR-10: "For the discriminator in our GAN we use a 9 layer deep convolutional network with dropout and weight normalization. The generator is a 4 layer deep CNN with batch normalization." (A model sketch follows the table.) |
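
For the Dataset Splits row, here is a minimal sketch of the balanced labeled-subset sampling the paper describes (an equal number of examples per class, with results averaged over 10 random subsets). The function name `sample_balanced_subset` and the stand-in `y_train` are illustrative assumptions, not names from the paper or its released code.

```python
import numpy as np

def sample_balanced_subset(labels, n_labeled, n_classes=10, seed=0):
    # Pick n_labeled indices with an equal count per class, as in the
    # paper's MNIST semi-supervised setups (20/50/100/200 labeled examples).
    rng = np.random.RandomState(seed)
    per_class = n_labeled // n_classes
    chosen = [rng.choice(np.flatnonzero(labels == c), per_class, replace=False)
              for c in range(n_classes)]
    return np.concatenate(chosen)

# Stand-in for the 60,000 MNIST training labels (illustrative only).
y_train = np.random.randint(0, 10, size=60000)

# Ten random balanced 100-example splits; the paper averages results over these.
splits = [sample_balanced_subset(y_train, 100, seed=s) for s in range(10)]
```

The remaining indices would be treated as the unlabeled pool for semi-supervised training.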
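For the Experiment Setup row, a minimal PyTorch sketch of the MNIST discriminator idea: weight-normalized layers with Gaussian noise added to each layer's output. The hidden width, noise scale, and ReLU nonlinearity are assumptions for illustration; the released TensorFlow code at https://github.com/openai/improved-gan is the authoritative configuration.

```python
import torch
import torch.nn as nn

class GaussianNoise(nn.Module):
    """Adds zero-mean Gaussian noise to a layer's output at train time."""
    def __init__(self, sigma=0.3):  # sigma is an illustrative value
        super().__init__()
        self.sigma = sigma

    def forward(self, x):
        if self.training:
            return x + self.sigma * torch.randn_like(x)
        return x

def mnist_discriminator(in_dim=784, hidden=1000, sigma=0.3):
    # Five weight-normalized hidden layers, each followed by Gaussian
    # noise, approximating "5 hidden layers ... weight normalization
    # ... Gaussian noise to the output of each layer".
    layers = []
    dims = [in_dim] + [hidden] * 5
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        layers += [nn.utils.weight_norm(nn.Linear(d_in, d_out)),
                   nn.ReLU(),
                   GaussianNoise(sigma)]
    layers.append(nn.utils.weight_norm(nn.Linear(hidden, 1)))  # real/fake logit
    return nn.Sequential(*layers)

D = mnist_discriminator()
logits = D(torch.randn(8, 784))  # batch of 8 flattened 28x28 images
```

Weight normalization here uses `torch.nn.utils.weight_norm` as a stand-in for the paper's own weight-normalization layers [21].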