Spectral Normalization for Generative Adversarial Networks

Authors: Takeru Miyato, Toshiki Kataoka, Masanori Koyama, Yuichi Yoshida

ICLR 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We tested the efficacy of spectral normalization on the CIFAR10, STL-10, and ILSVRC2012 datasets, and we experimentally confirmed that spectrally normalized GANs (SN-GANs) are capable of generating images of better or equal quality relative to previous training stabilization techniques. |
| Researcher Affiliation | Collaboration | ¹Preferred Networks, Inc.; ²Ritsumeikan University; ³National Institute of Informatics |
| Pseudocode | Yes | Algorithm 1: SGD with spectral normalization |
| Open Source Code | Yes | The code with Chainer (Tokui et al., 2015), generated images, and pretrained models are available at https://github.com/pfnet-research/sngan_projection. |
| Open Datasets | Yes | We tested the efficacy of spectral normalization on CIFAR10 (Torralba et al., 2008), STL-10 (Coates et al., 2011), and the ILSVRC2012 dataset (Russakovsky et al., 2015). |
| Dataset Splits | No | The paper mentions train and validation sets in Figure 15a and its caption ("Learning curves of (a) critic loss and (b) inception score on different reparametrization methods on CIFAR-10"), implying their use. However, it does not specify split percentages, sample counts, or the splitting methodology for training, validation, and test data. |
| Hardware Specification | No | The paper does not report the hardware used for its experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper names Chainer (Tokui et al., 2015) as the framework, but it gives no version number for Chainer or any other software dependency. |
| Experiment Setup | Yes | For optimization, we used the Adam optimizer (Kingma & Ba, 2015) in all of our experiments. We tested 6 settings for (1) n_dis, the number of discriminator updates per generator update, and (2) the learning rate α and the first- and second-order momentum parameters (β1, β2) of Adam. We list the details of these settings in Table 1 in the appendix. We set dz to 128 for all of our experiments. For weight clipping, we followed the original work (Arjovsky et al., 2017) and set the clipping constant c to 0.01 for the convolutional weight of each layer. For gradient penalty, we set λ to 10, as suggested in Gulrajani et al. (2017). We trained the networks for 450K generator updates, applying linear decay to the learning rate after 400K iterations so that it reached 0 at the end. |
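To make the "Pseudocode: Algorithm 1" row concrete, here is a minimal NumPy sketch of the core operation of spectral normalization: a power-iteration estimate of a weight matrix's largest singular value, which the layer weight is then divided by. This is an illustrative reconstruction, not the authors' Chainer implementation; the function name `spectral_norm`, the seeds, and the iteration counts are assumptions for the example.

```python
import numpy as np

def spectral_norm(W, u=None, n_iters=1):
    """Estimate sigma(W), the largest singular value of W, by power iteration.

    In SGD training, the left singular vector estimate `u` is carried over
    between parameter updates, so a single iteration per step suffices;
    here we run more iterations to converge from a cold start.
    """
    if u is None:
        u = np.random.default_rng(0).normal(size=W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v) + 1e-12   # right singular vector estimate
        u = W @ v
        u /= np.linalg.norm(u) + 1e-12   # left singular vector estimate
    sigma = u @ W @ v                    # Rayleigh-quotient estimate of sigma(W)
    return sigma, u

# Dividing W by the estimated spectral norm constrains the layer's
# Lipschitz constant to approximately 1.
W = np.random.default_rng(1).normal(size=(64, 128))
sigma, u = spectral_norm(W, n_iters=100)
W_sn = W / sigma
print(f"power-iteration sigma: {sigma:.4f}, "
      f"exact sigma: {np.linalg.svd(W, compute_uv=False)[0]:.4f}")
```

Reusing `u` across SGD steps is what makes the per-update cost negligible compared to a full SVD, which is the practical point of Algorithm 1.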