The Numerics of GANs

Authors: Lars Mescheder, Sebastian Nowozin, Andreas Geiger

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we demonstrate its superiority on training common GAN architectures and show convergence on GAN architectures that are known to be notoriously hard to train.
Researcher Affiliation | Collaboration | Lars Mescheder (Autonomous Vision Group, MPI Tübingen, lars.mescheder@tuebingen.mpg.de); Sebastian Nowozin (Machine Intelligence and Perception Group, Microsoft Research, sebastian.nowozin@microsoft.com); Andreas Geiger (Autonomous Vision Group, MPI Tübingen, andreas.geiger@tuebingen.mpg.de)
Pseudocode | Yes | Algorithm 1 (Simultaneous Gradient Ascent, SimGA) and Algorithm 2 (Consensus optimization); see the code sketch after the table.
Open Source Code | Yes | The code for all experiments in this paper is available under https://github.com/LMescheder/TheNumericsOfGANs.
Open Datasets | Yes | CIFAR-10 and CelebA: "In our second experiment, we apply our method to the cifar-10 and celebA datasets"
Dataset Splits | No | The paper references datasets and training procedures but does not give specific training/validation/test splits (e.g., percentages or counts).
Hardware Specification | No | The paper does not report the hardware (e.g., GPU/CPU models, memory) used for its experiments; it only notes that the code for the experiments is available.
Software Dependencies | No | The paper mentions the deep learning frameworks TensorFlow [1] and PyTorch [19] but does not specify the versions used in the experiments.
Experiment Setup | Yes | For both the generator and critic we use fully connected neural networks with 4 hidden layers and 16 hidden units in each layer. For all layers, we use ReLU nonlinearities. We use a 16-dimensional Gaussian prior for the latent code z and set up the game between the generator and critic using the utility functions as in [10]. To test our method, we run both SimGA and our method with RMSProp and a learning rate of 10⁻⁴ for 20000 steps. For our method, we use a regularization parameter of γ = 10. [...] using a DCGAN-like architecture [21] without batch normalization in the generator or the discriminator. For CelebA, we additionally use a constant number of filters in each layer and add additional ResNet layers.
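
The "Pseudocode" row refers to Algorithm 2, consensus optimization: each player's gradient is augmented with the gradient of the regularizer L = ½‖v‖², where v stacks the gradients of both players' objectives, so that both networks are pulled toward a common stationary point of the game. The paper's released code is in TensorFlow; the sketch below is a minimal PyTorch rendering, and the function name consensus_grads and its signature are illustrative, not taken from the repository.

```python
import torch

def consensus_grads(gen_loss, disc_loss, gen_params, disc_params, gamma=10.0):
    """Return consensus-regularized gradients (descent form) for both players."""
    gen_params, disc_params = list(gen_params), list(disc_params)
    params = gen_params + disc_params
    # Gradient vector field v: each player's gradient of its own loss.
    # create_graph=True lets us differentiate through these gradients again.
    grads_g = torch.autograd.grad(gen_loss, gen_params, create_graph=True)
    grads_d = torch.autograd.grad(disc_loss, disc_params, create_graph=True)
    v = list(grads_g) + list(grads_d)
    # Consensus regularizer L = 1/2 * ||v||^2.
    reg = 0.5 * sum(g.pow(2).sum() for g in v)
    reg_grads = torch.autograd.grad(reg, params)
    # Modified update direction: grad(own loss) + gamma * grad(L).
    return [g.detach() + gamma * r for g, r in zip(v, reg_grads)]
```

The returned tensors can be assigned to each parameter's .grad field before stepping any first-order optimizer (the paper uses RMSProp), which corresponds to the descent form of the update w ← w − h(v(w) + γ∇L(w)).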
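
The "Experiment Setup" row can likewise be made concrete. The quoted passage fixes the toy architecture (fully connected, 4 hidden layers of 16 units, ReLU), the 16-dimensional Gaussian prior, RMSProp with learning rate 10⁻⁴, and γ = 10; everything else in the sketch below is an assumption, including the 2-D data dimension (DATA_DIM), the helper name mlp, and the choice of PyTorch. The utility functions from [10] and the DCGAN/ResNet architectures for CIFAR-10 and CelebA are not reproduced here.

```python
import torch
import torch.nn as nn

LATENT_DIM = 16   # 16-dimensional Gaussian prior for the latent code z
DATA_DIM = 2      # assumption: 2-D toy data; the quoted setup does not state this

def mlp(in_dim, out_dim, hidden=16, depth=4):
    """Fully connected network: 4 hidden layers with 16 ReLU units each."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

generator = mlp(LATENT_DIM, DATA_DIM)   # maps z ~ N(0, I_16) to data space
critic = mlp(DATA_DIM, 1)               # scalar critic output

# RMSProp with learning rate 1e-4, run for 20000 steps in the paper;
# gamma = 10 is the consensus regularization parameter.
opt_g = torch.optim.RMSprop(generator.parameters(), lr=1e-4)
opt_d = torch.optim.RMSprop(critic.parameters(), lr=1e-4)
```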