Improving GANs Using Optimal Transport

Authors: Tim Salimans, Han Zhang, Alec Radford, Dimitris Metaxas

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimentally we show OT-GAN to be highly stable when trained with large mini-batches, and we present state-of-the-art results on several popular benchmark problems for image generation." ... "In this section, we demonstrate the improved stability and consistency of the proposed method on five different datasets with increasing complexity."
Researcher Affiliation | Collaboration | Tim Salimans (OpenAI, tim@openai.com), Han Zhang (Rutgers University, han.zhang@cs.rutgers.edu), Alec Radford (OpenAI, alec@openai.com), Dimitris Metaxas (Rutgers University, dnm@cs.rutgers.edu)
Pseudocode | Yes | "Algorithm 1 Optimal Transport GAN (OT-GAN) training algorithm with step size α, using minibatch SGD for simplicity" ... "Algorithm 2 Conditional Optimal Transport GAN (OT-GAN) training algorithm with step size α, using minibatch SGD for simplicity" (see the training-step sketch after the table)
Open Source Code | No | The paper does not provide a direct link to open-source code or explicitly state that the code for the methodology is released.
Open Datasets | Yes | "CIFAR-10 is a well-studied dataset of 32 × 32 color images for generative models (Krizhevsky, 2009)." ... "train OT-GAN to generate 128 × 128 images on the dog subset of ImageNet (Russakovsky et al., 2015)."
Dataset Splits | No | The paper discusses training and evaluating on datasets such as CIFAR-10 and ImageNet but does not explicitly state train/validation/test split ratios or sample counts for these datasets.
Hardware Specification | No | "To reach the large batch sizes needed for optimal performance we make use of multi GPU training. In this work we only use up to 8 GPUs per experiment..." ... "All experiments in this paper, except for the mixture of Gaussians toy example, were performed using 8 GPUs and trained for several days." (The paper states the number of GPUs but not the specific GPU models or other hardware details.)
Software Dependencies | No | "We train the model using Adam with a learning rate of 3 × 10⁻⁴, β1 = 0.5, β2 = 0.999." (The paper mentions the optimizer, activation functions, and some hyperparameters, but does not specify software library versions such as PyTorch, TensorFlow, or Python.)
Experiment Setup | Yes | "We train the model using Adam with a learning rate of 3 × 10⁻⁴, β1 = 0.5, β2 = 0.999. We update the generator 3 times for every critic update." ... "Our model and the other reported results are trained in an unsupervised manner." ... "As shown in Figure 3, training is not very stable when the batch size is small (i.e. 200). As batch size increases, training becomes more stable and the inception score of samples increases." ... "Our algorithm for training generative models can be generalized to include conditional generation of images given some side information s, such as a text-description of the image or a label." ... "A smaller batch size of 2048 is used due to GPU memory constraints." (see the update-schedule sketch after the table)
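
The Pseudocode row refers to Algorithm 1, which trains the generator and critic by minibatch SGD on a Sinkhorn-based minibatch energy distance over critic features. The sketch below is a minimal, hypothetical illustration of that kind of loss, not the authors' implementation (no code is released, per the Open Source Code row); the cosine cost, uniform marginals, regularization value, and iteration count are assumptions made here for illustration.

```python
# Hypothetical sketch of a Sinkhorn-based minibatch energy distance, in the spirit of
# the paper's Algorithm 1. Values of `reg` and `n_iters` are illustrative assumptions.
import torch
import torch.nn.functional as F

def cosine_cost(a, b):
    # Pairwise cosine distance between two batches of critic feature vectors.
    a = F.normalize(a, dim=1)
    b = F.normalize(b, dim=1)
    return 1.0 - a @ b.t()

def sinkhorn(cost, reg=0.1, n_iters=100):
    # Entropy-regularized optimal transport with uniform marginals (plain Sinkhorn iterations).
    n, m = cost.shape
    mu = torch.full((n,), 1.0 / n, device=cost.device)
    nu = torch.full((m,), 1.0 / m, device=cost.device)
    K = torch.exp(-cost / reg)
    v = torch.ones_like(nu)
    for _ in range(n_iters):
        u = mu / (K @ v + 1e-8)
        v = nu / (K.t() @ u + 1e-8)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)   # approximate transport plan
    return (plan * cost).sum()                   # transport cost under that plan

def minibatch_energy_distance(cx, cx2, cy, cy2):
    # cx, cx2: critic features of two independent real minibatches;
    # cy, cy2: critic features of two independent generated minibatches.
    w = lambda a, b: sinkhorn(cosine_cost(a, b))
    return (w(cx, cy) + w(cx, cy2) + w(cx2, cy) + w(cx2, cy2)
            - 2.0 * w(cx, cx2) - 2.0 * w(cy, cy2))

# Toy usage with random tensors standing in for critic features:
feats = [torch.randn(16, 64) for _ in range(4)]
print(minibatch_energy_distance(*feats).item())
```

In this formulation the generator would be trained to minimize the distance and the critic to maximize it, which is where the large-minibatch requirement quoted in the Research Type and Experiment Setup rows comes from.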
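
The Experiment Setup row quotes Adam with learning rate 3 × 10⁻⁴, β1 = 0.5, β2 = 0.999 and three generator updates per critic update. A minimal schedule sketch under those settings follows; the tiny networks and the placeholder loss are hypothetical stand-ins, not the paper's architecture or a released implementation.

```python
# Hypothetical sketch of the quoted optimizer settings and the 3:1 generator/critic
# update schedule. `generator`, `critic`, and the placeholder loss are illustrative only.
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(100, 3 * 32 * 32), nn.Tanh())  # placeholder generator
critic = nn.Sequential(nn.Linear(3 * 32 * 32, 256))                # placeholder critic

gen_opt = torch.optim.Adam(generator.parameters(), lr=3e-4, betas=(0.5, 0.999))
critic_opt = torch.optim.Adam(critic.parameters(), lr=3e-4, betas=(0.5, 0.999))

def placeholder_loss(real, noise):
    # Stand-in for the minibatch energy distance sketched above.
    return ((critic(generator(noise)) - critic(real)) ** 2).mean()

for step in range(400):
    real = torch.randn(8, 3 * 32 * 32)   # stand-in for a minibatch of real images
    noise = torch.randn(8, 100)
    loss = placeholder_loss(real, noise)
    if step % 4 == 3:
        # One critic update (gradient ascent on the loss) ...
        critic_opt.zero_grad()
        (-loss).backward()
        critic_opt.step()
    else:
        # ... for every three generator updates (gradient descent on the loss).
        gen_opt.zero_grad()
        loss.backward()
        gen_opt.step()
```

The batch sizes quoted in the table (e.g. 2048 on ImageNet due to GPU memory constraints) would replace the toy batch size of 8 in an actual run.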