Training GANs with Optimism

Authors: Constantinos Daskalakis, Andrew Ilyas, Vasilis Syrgkanis, Haoyang Zeng

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply OMD WGAN training to a bioinformatics problem of generating DNA sequences. We observe that models trained with OMD achieve consistently smaller KL divergence with respect to the true underlying distribution than models trained with GD variants. Finally, we introduce a new algorithm, Optimistic Adam, which is an optimistic variant of Adam. We apply it to WGAN training on CIFAR10 and observe improved performance in terms of inception score as compared to Adam.
Researcher Affiliation | Collaboration | Constantinos Daskalakis (MIT, EECS) costis@mit.edu; Andrew Ilyas (MIT, EECS) ailyas@mit.edu; Vasilis Syrgkanis (Microsoft Research) vasy@microsoft.com; Haoyang Zeng (MIT, EECS) haoyangz@mit.edu
Pseudocode | Yes | Algorithm 1 Optimistic ADAM, proposed algorithm for training WGANs on images. (A hedged sketch of this update appears after the table.)
Open Source Code | Yes | Code for our models is available at https://github.com/vsyrgkanis/optimistic_GAN_training
Open Datasets | Yes | We apply optimism to training GANs for images and introduce the Optimistic Adam algorithm. We show that it achieves better performance than Adam, in terms of inception score, when trained on CIFAR10.
Dataset Splits | Yes | A random 10% of the sequences were held out as the validation set.
Hardware Specification | No | The paper does not provide specific details on the hardware used for experiments, such as GPU/CPU models, memory, or the computing environment.
Software Dependencies | No | The paper mentions software such as Adam and implies common libraries through its GAN architectures, but it does not specify version numbers for any software dependencies.
Experiment Setup | Yes | The same learning rate 0.0001 and betas (β1 = 0.5, β2 = 0.9) as in Appendix B of Gulrajani et al. (2017) were used for all the methods compared. We also matched other hyper-parameters such as gradient penalty coefficient λ and batch size. (An illustrative optimizer configuration follows the table.)
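
The optimism referenced throughout the table is an extra-gradient-style correction: optimistic mirror descent (OMD) replaces the plain step θ_(t+1) = θ_t - η ∇L(θ_t) with θ_(t+1) = θ_t - 2η ∇L(θ_t) + η ∇L(θ_(t-1)), and Optimistic Adam applies the same "twice the current step, add back the previous one" pattern to Adam's bias-corrected, variance-rescaled gradient. Below is a minimal NumPy sketch of that update; the function name, state tuple, and defaults are illustrative choices, not the authors' reference implementation (which is linked in the Open Source Code row).

    import numpy as np

    def optimistic_adam_step(theta, grad, state, lr=1e-4, beta1=0.5, beta2=0.9, eps=1e-8):
        # state = (m, v, prev_corr, t): first/second moment estimates, the previous
        # step's rescaled gradient, and the step counter. All names are illustrative.
        m, v, prev_corr, t = state
        t += 1
        m = beta1 * m + (1.0 - beta1) * grad           # Adam first-moment update
        v = beta2 * v + (1.0 - beta2) * grad ** 2      # Adam second-moment update
        m_hat = m / (1.0 - beta1 ** t)                 # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        corr = m_hat / (np.sqrt(v_hat) + eps)          # Adam's rescaled gradient direction
        # Optimistic update: take twice the current correction, then add back the previous one.
        theta = theta - 2.0 * lr * corr + lr * prev_corr
        return theta, (m, v, corr, t)

With prev_corr initialised to zero, the first call reduces to an ordinary Adam step with a doubled step size; from then on each update partially cancels the previous direction, which is the mechanism the paper credits for damping the cycling behaviour of plain gradient methods in GAN training.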
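
For the Experiment Setup row, the quoted hyper-parameters correspond to a standard Adam baseline configuration. Purely as an illustration (the paper does not prescribe a framework, and the placeholder networks here are ours), a PyTorch instantiation matching the stated values might look like:

    import torch
    import torch.nn as nn

    # Placeholder generator/discriminator; only the optimizer hyper-parameters
    # (lr = 0.0001, betas = (0.5, 0.9)) come from the paper's stated setup.
    generator = nn.Linear(128, 784)
    discriminator = nn.Linear(784, 1)

    opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4, betas=(0.5, 0.9))
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.5, 0.9))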