First Order Generative Adversarial Networks
Authors: Calvin Seward, Thomas Unterthiner, Urs Bergmann, Nikolay Jetchev, Sepp Hochreiter
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify our method, the First Order GAN, with image generation on CelebA, LSUN and CIFAR-10 and set a new state of the art on the One Billion Word language generation task. Finally in Section 7, the effectiveness of our method is demonstrated by generating images and texts. (Section 7: Experimental Results) |
| Researcher Affiliation | Collaboration | ¹Zalando Research, Mühlenstraße 25, 10243 Berlin, Germany; ²LIT AI Lab & Institute of Bioinformatics, Johannes Kepler University Linz, Austria. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | See Table 2, Appendix B.1, and the released code: https://github.com/zalandoresearch/first_order_gan |
| Open Datasets | Yes | We verify our method, the First Order GAN, with image generation on CelebA, LSUN and CIFAR-10 and set a new state of the art on the One Billion Word language generation task. |
| Dataset Splits | No | The paper mentions datasets used but does not provide specific train/validation/test split percentages, sample counts, or detailed methodology for data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for ancillary software components or libraries needed to replicate the experiment. |
| Experiment Setup | Yes | First, we train to minimize our divergence from Definition 6 with parameters λ = 0.1 and µ = 1.0 instead of the WGAN-GP divergence. Second, we use batch normalization in the generator, both for training our FOGAN method and the benchmark WGAN-GP; we do this because batch normalization improved performance and stability of both models. For our experiments we trained both models for 500,000 iterations in 5 independent runs, estimating the JSD between 6-grams of generated and real world data every 2000 training steps, see Figure 2. |
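
The experiment setup row above reports that model quality was tracked by estimating the Jensen-Shannon divergence (JSD) between 6-grams of generated and real-world text every 2,000 training steps. The excerpt does not include code for this metric, so the following is a minimal, self-contained sketch of how such a 6-gram JSD estimate could be computed. The function names (`ngram_distribution`, `jensen_shannon_divergence`) and the character-level n-gram choice are illustrative assumptions, not taken from the released implementation.

```python
from collections import Counter
from math import log2


def ngram_distribution(lines, n=6):
    """Character-level n-gram frequency distribution over a corpus (assumed granularity)."""
    counts = Counter()
    for line in lines:
        counts.update(line[i:i + n] for i in range(len(line) - n + 1))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()}


def jensen_shannon_divergence(p, q):
    """JSD (base-2) between two discrete distributions given as dicts of probabilities."""
    def kl(a, m):
        # KL(a || m); m has full support over a wherever a is nonzero.
        return sum(pa * log2(pa / m[g]) for g, pa in a.items() if pa > 0)

    support = set(p) | set(q)
    mixture = {g: 0.5 * (p.get(g, 0.0) + q.get(g, 0.0)) for g in support}
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


# Hypothetical usage: compare generated samples against held-out real text.
real_lines = ["the quick brown fox jumps over the lazy dog"]
fake_lines = ["the quick brown dog jumps over the lazy fox"]
jsd = jensen_shannon_divergence(
    ngram_distribution(real_lines, n=6),
    ngram_distribution(fake_lines, n=6),
)
print(f"6-gram JSD: {jsd:.4f}")
```

In the reported protocol, an evaluation like this would run every 2,000 of the 500,000 training iterations, with the curve averaged over the 5 independent runs shown in the paper's Figure 2.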