First Order Generative Adversarial Networks

Authors: Calvin Seward, Thomas Unterthiner, Urs Bergmann, Nikolay Jetchev, Sepp Hochreiter

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We verify our method, the First Order GAN, with image generation on CelebA, LSUN and CIFAR-10 and set a new state of the art on the One Billion Word language generation task." "Finally in Section 7, the effectiveness of our method is demonstrated by generating images and texts." (Section 7, Experimental Results)
Researcher Affiliation | Collaboration | "Zalando Research, Mühlenstraße 25, 10243 Berlin, Germany" and "LIT AI Lab & Institute of Bioinformatics, Johannes Kepler University Linz, Austria"
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "See Table 2, Appendix B.1 and released code: https://github.com/zalandoresearch/first_order_gan"
Open Datasets | Yes | "We verify our method, the First Order GAN, with image generation on CelebA, LSUN and CIFAR-10 and set a new state of the art on the One Billion Word language generation task."
Dataset Splits | No | The paper names the datasets used but does not give train/validation/test split percentages, sample counts, or a methodology for partitioning the data.
Hardware Specification | No | The paper does not specify the hardware used for the experiments, such as GPU/CPU models, processor types, or memory.
Software Dependencies | No | The paper does not give version numbers for the software libraries needed to replicate the experiments.
Experiment Setup | Yes | "First, we train to minimize our divergence from Definition 6 with parameters λ = 0.1 and µ = 1.0 instead of the WGAN-GP divergence. Second, we use batch normalization in the generator, both for training our FOGAN method and the benchmark WGAN-GP; we do this because batch normalization improved performance and stability of both models." "For our experiments we trained both models for 500,000 iterations in 5 independent runs, estimating the JSD between 6-grams of generated and real world data every 2000 training steps, see Figure 2."
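As a reference point for the evaluation protocol quoted above (estimating the JSD between 6-grams of generated and real-world data every 2000 steps), the sketch below shows one plausible way to compute the Jensen-Shannon divergence between 6-gram distributions of two text corpora. It is a minimal illustration, not the authors' released evaluation code: the character-level n-gram choice, function names, and variables are assumptions, and the repository linked above should be consulted for the exact protocol.

```python
# Illustrative sketch (assumed implementation, not the authors' released code):
# Jensen-Shannon divergence between character-level 6-gram distributions of
# real and generated text samples.
from collections import Counter
import math


def ngram_counts(lines, n=6):
    """Count character-level n-grams over a list of text samples."""
    counts = Counter()
    for line in lines:
        for i in range(len(line) - n + 1):
            counts[line[i:i + n]] += 1
    return counts


def js_divergence(real_lines, generated_lines, n=6):
    """JSD (natural log) between the n-gram distributions of two corpora."""
    p_counts = ngram_counts(real_lines, n)
    q_counts = ngram_counts(generated_lines, n)
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    jsd = 0.0
    for gram in set(p_counts) | set(q_counts):
        p = p_counts.get(gram, 0) / p_total
        q = q_counts.get(gram, 0) / q_total
        m = 0.5 * (p + q)
        if p > 0:
            jsd += 0.5 * p * math.log(p / m)
        if q > 0:
            jsd += 0.5 * q * math.log(q / m)
    return jsd


# Example use: score = js_divergence(real_samples, model_samples, n=6),
# where real_samples and model_samples are lists of text strings.
```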