Which Training Methods for GANs do actually Converge?
Authors: Lars Mescheder, Andreas Geiger, Sebastian Nowozin
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To test how well the gradient penalties from Section 4.1 perform on more complicated tasks, we train convolutional GANs on a variety of datasets, including a generative model for all 1000 Imagenet classes and a generative model for the celebA-HQ dataset (Karras et al., 2017) at resolution 1024×1024. While we find that unregularized GAN training quickly leads to mode-collapse for these problems, our simple R1-regularizer enables stable training. Random samples from the models and more details on the experimental setup can be found in the supplementary material. (A hedged code sketch of the R1 regularizer follows the table.) |
| Researcher Affiliation | Collaboration | MPI Tübingen, Germany; ETH Zürich, Switzerland; Microsoft Research, Cambridge, UK. |
| Pseudocode | No | The paper discusses algorithms and mathematical formulations but does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the methodology described is publicly available. |
| Open Datasets | Yes | To test how well the gradient penalties from Section 4.1 perform on more complicated tasks, we train convolutional GANs on a variety of datasets, including a generative model for all 1000 Imagenet classes and a generative model for the celebA-HQ dataset (Karras et al., 2017) at resolution 1024×1024. |
| Dataset Splits | No | The paper describes experiments on 2D problems and images, but it does not explicitly state the specific training, validation, or test data splits (e.g., percentages, sample counts, or references to standard splits with citations) needed for reproduction. |
| Hardware Specification | No | The acknowledgements section mentions "NVIDIA for donating the GPUs for the experiments presented in the supplementary material" but does not specify the exact GPU models or any other detailed hardware specifications. |
| Software Dependencies | No | The paper mentions "TensorFlow" in the bibliography but does not specify its version or any other software dependencies with their respective version numbers used in the experiments. |
| Experiment Setup | No | For the 2D problems, the paper states that the authors "try both stochastic gradient descent and RMS-Prop with 4 different learning rates" and "3 different regularization parameters", and "train all methods for 50k iterations"; however, it does not list the specific values of these hyperparameters or other system-level training settings, making the exact setup difficult to reproduce. (A sketch of the sweep structure follows the table.) |
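
The R1 regularizer quoted in the table penalizes the squared gradient norm of the discriminator on real samples only, R1(ψ) = (γ/2) · E_{x∼p_D}[‖∇ D_ψ(x)‖²]. Since the paper releases no code, the following is a minimal PyTorch sketch of that penalty; the function name `r1_penalty` and the default `gamma` value are illustrative assumptions, not the authors' implementation.

```python
import torch

def r1_penalty(discriminator, real_x, gamma=10.0):
    """R1 gradient penalty: (gamma / 2) * E[ ||grad_x D(x)||^2 ],
    computed on real samples only (Mescheder et al., 2018)."""
    real_x = real_x.detach().requires_grad_(True)
    d_out = discriminator(real_x)
    # Summing over the batch yields per-sample input gradients in one call.
    grad, = torch.autograd.grad(
        outputs=d_out.sum(), inputs=real_x, create_graph=True)
    # Squared L2 norm per sample, averaged over the batch.
    grad_sq = grad.pow(2).reshape(grad.size(0), -1).sum(dim=1)
    return 0.5 * gamma * grad_sq.mean()
```

In the paper's alternating scheme the penalty is added to the discriminator loss on real batches; the generator update is unchanged. `create_graph=True` is needed so the penalty itself can be backpropagated through.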
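
Similarly, for the 2D experiments only the shape of the hyperparameter sweep is known: two optimizers, four learning rates, three regularization weights, 50k iterations. The grid below mirrors that description; every numeric value and the `train_gan` helper are placeholders, since the paper does not report the actual settings — which is precisely the reproducibility gap flagged in the table.

```python
import itertools

def train_gan(optimizer, lr, reg_weight, iterations):
    """Hypothetical stand-in for the unpublished 2D training loop."""
    print(f"{optimizer} lr={lr} gamma={reg_weight} iters={iterations}")

# Sweep structure from the quoted description; the numeric values
# are PLACEHOLDERS -- the paper does not report the actual grid.
optimizers = ["sgd", "rmsprop"]
learning_rates = [1e-1, 1e-2, 1e-3, 1e-4]  # placeholder values
reg_weights = [0.1, 1.0, 10.0]             # placeholder values

for opt, lr, gamma in itertools.product(optimizers, learning_rates, reg_weights):
    train_gan(opt, lr, gamma, iterations=50_000)
```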