Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Training GANs with Optimism
Authors: Constantinos Daskalakis, Andrew Ilyas, Vasilis Syrgkanis, Haoyang Zeng
ICLR 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply OMD WGAN training to a bioinformatics problem of generating DNA sequences. We observe that models trained with OMD achieve consistently smaller KL divergence with respect to the true underlying distribution than models trained with GD variants. Finally, we introduce a new algorithm, Optimistic Adam, which is an optimistic variant of Adam. We apply it to WGAN training on CIFAR10 and observe improved performance in terms of inception score as compared to Adam. |
| Researcher Affiliation | Collaboration | Constantinos Daskalakis (MIT, EECS); Andrew Ilyas (MIT, EECS); Vasilis Syrgkanis (Microsoft Research); Haoyang Zeng (MIT, EECS) |
| Pseudocode | Yes | Algorithm 1 Optimistic ADAM, proposed algorithm for training WGANs on images. (A hedged sketch of this update rule appears below the table.) |
| Open Source Code | Yes | Code for our models is available at https://github.com/vsyrgkanis/optimistic_GAN_training |
| Open Datasets | Yes | We apply optimism to training GANs for images and introduce the Optimistic Adam algorithm. We show that it achieves better performance than Adam, in terms of inception score, when trained on CIFAR10. |
| Dataset Splits | Yes | A random 10% of the sequences were held out as the validation set. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for experiments, such as GPU/CPU models, memory, or specific computing environments. |
| Software Dependencies | No | The paper mentions software like Adam and refers to common libraries implicitly through GAN architectures, but it does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | The same learning rate 0.0001 and betas (β1 = 0.5, β2 = 0.9) as in Appendix B of Gulrajani et al. (2017) were used for all the methods compared. We also matched other hyper-parameters such as gradient penalty coefficient λ and batch size. |
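The Pseudocode and Experiment Setup rows point to Algorithm 1 (Optimistic Adam) and the Adam hyperparameters carried over from Gulrajani et al. (2017). As a rough illustration only, the sketch below shows what a single Optimistic Adam update looks like: a standard Adam step applied with twice the learning rate, with the previous step subtracted back out. The function name, the dictionary-based state, and the NumPy setting are illustrative assumptions, not the authors' code; the repository linked in the Open Source Code row is the authoritative implementation.

```python
# Hedged sketch of one Optimistic Adam update (illustrative only; see the
# linked repository for the authors' implementation). Hyperparameters follow
# the Experiment Setup row: lr = 1e-4, beta1 = 0.5, beta2 = 0.9.
import numpy as np

def optimistic_adam_step(theta, grad, state, lr=1e-4, beta1=0.5, beta2=0.9, eps=1e-8):
    """Apply one Optimistic Adam update to `theta` given the current gradient.

    `state` carries the running moment estimates, the previous bias-corrected
    step (so it can be "undone" optimistically), and the step counter.
    """
    t = state["t"] + 1

    # Standard Adam moment estimates with bias correction.
    m = beta1 * state["m"] + (1.0 - beta1) * grad
    v = beta2 * state["v"] + (1.0 - beta2) * grad ** 2
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    step = m_hat / (np.sqrt(v_hat) + eps)

    # Optimistic update: take twice the current Adam step, subtract the previous one.
    theta = theta - 2.0 * lr * step + lr * state["prev_step"]

    state.update(m=m, v=v, prev_step=step, t=t)
    return theta, state

# Usage: initialize the moments and the previous step at zero.
theta = np.zeros(10)
state = {"m": np.zeros(10), "v": np.zeros(10), "prev_step": np.zeros(10), "t": 0}
grad = np.random.randn(10)  # stand-in for a WGAN generator/discriminator gradient
theta, state = optimistic_adam_step(theta, grad, state)
```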