Improving Generative Adversarial Networks with Denoising Feature Matching
Authors: David Warde-Farley, Yoshua Bengio
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the hybrid criterion on the task of unsupervised image synthesis from datasets comprising a diverse set of visual categories, noting a qualitative and quantitative improvement in the objectness of the resulting samples. We show that this yields generators which consistently produce recognizable objects on the CIFAR-10 dataset without the use of label information as in Salimans et al. (2016). We further investigate the criterion's performance on two larger and more diverse collections of images, and validate our qualitative observations quantitatively with the Inception score proposed in Salimans et al. (2016). 5 EXPERIMENTS We evaluate denoising feature matching on learning synthesis models from three datasets of increasing diversity and size: CIFAR-10, STL-10, and ImageNet. Inception scores for our model and a baseline, consisting of the same architecture trained without denoising feature matching (both trained for 50 epochs), are shown in Table 2. (A minimal sketch of the Inception score computation is given after the table.) |
| Researcher Affiliation | Academia | David Warde-Farley & Yoshua Bengio, Montreal Institute for Learning Algorithms, CIFAR Senior Fellow, Université de Montréal, Montreal, Quebec, Canada {david.warde-farley,yoshua.bengio}@umontreal.ca |
| Pseudocode | No | The paper describes mathematical formulations and processes but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper thanks other researchers for making their code and model parameters available, but does not state that the authors themselves are releasing code for the work described in this paper. |
| Open Datasets | Yes | We evaluate denoising feature matching on learning synthesis models from three datasets of increasing diversity and size: CIFAR-10, STL-10, and ImageNet. CIFAR-10 (Krizhevsky & Hinton, 2009) is a small, well-studied dataset consisting of 50,000 32×32 pixel RGB training images and 10,000 test images from 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. STL-10 (Coates et al., 2011) is a dataset consisting of a small labeled set and a larger (100,000) unlabeled set of 96×96 RGB images. The unlabeled set is a subset of ImageNet that is more diverse than CIFAR-10 (or the labeled set of STL-10), but less diverse than full ImageNet. The ImageNet database (Russakovsky et al., 2014) is a large-scale database of natural images. We train on the designated training set of the most widely used release, the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC2012), consisting of a highly unbalanced split among 1,000 object classes. |
| Dataset Splits | Yes | CIFAR-10 (Krizhevsky & Hinton, 2009) is a small, well-studied dataset consisting of 50,000 32×32 pixel RGB training images and 10,000 test images from 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. |
| Hardware Specification | No | The paper mentions "computational resources" provided by the University of Montreal and Compute Canada, but does not specify any particular hardware components like CPU or GPU models. |
| Software Dependencies | Yes | We thank the University of Montreal and Compute Canada for the computational resources used for this investigation, as well as the authors of Theano (Al-Rfou et al., 2016), Blocks and Fuel (van Merriënboer et al., 2015). |
| Experiment Setup | Yes | In all experiments, we employ isotropic Gaussian corruption noise with σ = 1. Our generator and discriminator architectures follow the methods outlined in Radford et al. (2015). Accordingly, batch normalization (Ioffe & Szegedy, 2015) was used in the generator and discriminator in the same manner as Radford et al. (2015), and in all layers of the denoiser except the output layer. We calculate updates with respect to all losses with the parameters of all three networks fixed, and update all parameters simultaneously. All networks were trained with the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 10⁻⁴ and β₁ = 0.5. In our experiments, we set λ_denoise to 0.03/n_h, where n_h is the number of discriminator hidden units fed as input to the denoiser; this division decouples the scale of the first term of (4) from the dimensionality of the representation used, reducing the need to adjust this hyperparameter simply because we altered the architecture of the discriminator. (A hedged code sketch of this configuration follows the table.) |
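
The configuration quoted in the Experiment Setup row can be wired up as follows. This is a minimal sketch under stated assumptions, not the authors' Theano/Blocks implementation: it swaps in PyTorch, the `generator`, `disc_features`, `disc_head`, and `denoiser` modules are toy placeholders for the DCGAN-style architectures of Radford et al. (2015), and the generator objective only approximates equation (4) of the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder networks: the real architectures follow Radford et al. (2015);
# these tiny MLPs exist only to make the hyperparameter wiring runnable.
n_h = 512                                        # discriminator features fed to the denoiser
generator = nn.Sequential(nn.Linear(100, 3 * 32 * 32), nn.Tanh())
disc_features = nn.Sequential(nn.Linear(3 * 32 * 32, n_h), nn.LeakyReLU(0.2))
disc_head = nn.Linear(n_h, 1)
denoiser = nn.Sequential(nn.Linear(n_h, n_h), nn.ReLU(), nn.Linear(n_h, n_h))

# Settings quoted in the Experiment Setup row.
sigma = 1.0                                      # isotropic Gaussian corruption noise
lambda_denoise = 0.03 / n_h                      # decouples loss scale from feature dimension
adam_kwargs = dict(lr=1e-4, betas=(0.5, 0.999))  # learning rate 10^-4, beta_1 = 0.5

opt_g = torch.optim.Adam(generator.parameters(), **adam_kwargs)
opt_d = torch.optim.Adam(
    list(disc_features.parameters()) + list(disc_head.parameters()), **adam_kwargs)
opt_r = torch.optim.Adam(denoiser.parameters(), **adam_kwargs)

# Denoiser step (sketch): reconstruct clean discriminator features of data
# from features corrupted with N(0, sigma^2) noise.
x_real = torch.rand(64, 3 * 32 * 32)             # stand-in minibatch of real images
h_real = disc_features(x_real).detach()
h_noisy = h_real + sigma * torch.randn_like(h_real)
denoise_loss = ((denoiser(h_noisy) - h_real) ** 2).sum(dim=1).mean()

# Generator step (sketch): denoising feature matching term weighted by
# lambda_denoise plus a standard adversarial term, approximating eq. (4).
z = torch.randn(64, 100)
h_fake = disc_features(generator(z))
feat_match = ((denoiser(h_fake).detach() - h_fake) ** 2).sum(dim=1).mean()
gen_loss = lambda_denoise * feat_match + F.softplus(-disc_head(h_fake)).mean()
```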
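
The Research Type row also cites the Inception score of Salimans et al. (2016) as the quantitative metric. As a point of reference, a NumPy sketch of that score is below; `probs` is assumed to hold softmax outputs of a pretrained Inception network on the generated samples, and the ten-way split follows common practice rather than anything stated in the quoted excerpt.

```python
import numpy as np

def inception_score(probs, n_splits=10):
    """exp(E_x[KL(p(y|x) || p(y))]), as proposed by Salimans et al. (2016).

    probs: (N, num_classes) array of Inception-network softmax outputs
    p(y|x) for N generated images (array name is illustrative).
    Returns the mean and standard deviation over n_splits disjoint chunks.
    """
    scores = []
    for chunk in np.array_split(probs, n_splits):
        p_y = chunk.mean(axis=0, keepdims=True)   # marginal class distribution p(y)
        kl = (chunk * (np.log(chunk + 1e-12) - np.log(p_y + 1e-12))).sum(axis=1)
        scores.append(np.exp(kl.mean()))
    return float(np.mean(scores)), float(np.std(scores))
```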