Mode Regularized Generative Adversarial Networks

Authors: Tong Che, Yanran Li, Athul Paul Jacob, Yoshua Bengio, Wenjie Li

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform two classes of experiments on MNIST. For the MNIST dataset, we can assume that the data generating distribution can be approximated with ten dominant modes, if we define the term mode here as a connected component of the data manifold. Table 2: Results for Compositional MNIST with 1000 modes. The proposed regularization (Reg-DCGAN) substantially reduces both the number of missed modes and the KL divergence that measures the plausibility of the generated samples (as in the Inception score). Table 3: Number of images on the missing modes on CelebA estimated by a third-party discriminator.
Researcher Affiliation | Academia | Montreal Institute for Learning Algorithms, Université de Montréal, Montréal, QC H3T 1J4, Canada; Department of Computing, The Hong Kong Polytechnic University, Hong Kong; David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada
Pseudocode | Yes | Appendix A ("Pseudo Code for MDGAN"): In this appendix, we give the detailed training procedure of an MDGAN example we discuss in Section 3.3. Figure 8: The detailed training procedure of an MDGAN example.
Open Source Code | No | The paper makes no explicit statement about releasing the source code for its proposed methods; it only references external code repositories for the comparison models.
Open Datasets | Yes | We perform two classes of experiments on MNIST. For the MNIST dataset, we can assume that the data generating distribution can be approximated with ten dominant modes, if we define the term mode here as a connected component of the data manifold. To test the effectiveness of our proposal on harder problems, we implement an encoder for the DCGAN algorithm and train our model with different hyper-parameters together with the DCGAN baseline on the CelebA dataset. However, in datasets without labels (LSUN)...
Dataset Splits | No | The paper mentions a grid search (Table 1) and "training data" for MNIST and CelebA, but provides no percentages or counts for training, validation, or test splits, and does not describe how the data was partitioned for validation in a reproducible manner.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., CPU or GPU models, memory).
Software Dependencies | No | The paper mentions using the Adam optimizer but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | Table 1: Grid Search for Hyperparameters. nLayerG ∈ [2, 3, 4]; nLayerD ∈ [2, 3, 4]; sizeG ∈ [400, 800, 1600, 3200]; sizeD ∈ [256, 512, 1024]; dropoutD ∈ [True, False]; optimG ∈ [SGD, Adam]; optimD ∈ [SGD, Adam]; lr ∈ [1e-2, 1e-3, 1e-4]. In the regularized version, we choose λ1 = λ2 = 0.005.
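The Table 1 grid quoted above is small enough to enumerate exhaustively. A minimal sketch of such a sweep (the hyperparameter names mirror Table 1; the training call itself is omitted, and the exhaustive enumeration strategy is an assumption, since the paper does not say how configurations were sampled):

```python
from itertools import product

# Search space transcribed from Table 1 of the paper.
GRID = {
    "nLayerG": [2, 3, 4],
    "nLayerD": [2, 3, 4],
    "sizeG": [400, 800, 1600, 3200],
    "sizeD": [256, 512, 1024],
    "dropoutD": [True, False],
    "optimG": ["SGD", "Adam"],
    "optimD": ["SGD", "Adam"],
    "lr": [1e-2, 1e-3, 1e-4],
}

def grid_configs(grid):
    """Yield every hyperparameter combination as a dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(grid_configs(GRID))
# 3 * 3 * 4 * 3 * 2 * 2 * 2 * 3 = 2592 candidate configurations
```

Each yielded dict would then parameterize one training run of the generator/discriminator pair.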
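The row above quotes λ1 = λ2 = 0.005 for the paper's two regularizer weights. As a hedged sketch (not the authors' code), the regularizer of Section 3 adds to the generator objective a reconstruction distance d(x, G(E(x))) plus a mode-regularizer term rewarding reconstructions that the discriminator scores as real; the squared-error choice of d and the sign convention below are assumptions:

```python
import numpy as np

def mode_regularizer(x, x_rec, d_rec, lam1=0.005, lam2=0.005):
    """Sketch of the regularizer added to the generator objective.

    x     : real samples, shape (n, dim)
    x_rec : G(E(x)), encoder-then-generator reconstructions of x
    d_rec : D(G(E(x))), discriminator outputs in (0, 1) for x_rec
    lam1, lam2 : weights; the paper's experiments use 0.005 for both

    Sign convention assumed here: the generator *minimizes* this value,
    so -log D(G(E(x))) pushes reconstructions toward regions D accepts.
    """
    # lam1 * d(x, G(E(x))): squared distance keeps reconstructions near x
    geometric = lam1 * np.mean(np.sum((x - x_rec) ** 2, axis=1))
    # lam2 term: penalizes reconstructions the discriminator rejects
    mode_term = -lam2 * np.mean(np.log(d_rec))
    return geometric + mode_term
```

With perfect reconstructions (x_rec equal to x) the geometric term vanishes and only the discriminator term remains, which is one way to see that the regularizer ties the generator to every region of the data manifold the encoder covers.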
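The missed-mode counts and KL divergence reported in Tables 2 and 3 (quoted in the Research Type row above) are computed from a third-party classifier's predictions on generated samples. A minimal sketch of those two metrics, assuming the predicted mode labels are already available (the classifier itself, and the uniform reference distribution over modes, are assumptions not spelled out in this summary):

```python
import numpy as np
from collections import Counter

def missed_modes(pred_labels, n_modes):
    """Number of modes to which the classifier assigns zero generated samples."""
    counts = Counter(pred_labels)
    return sum(1 for m in range(n_modes) if counts[m] == 0)

def label_kl(pred_labels, n_modes):
    """KL(p_gen || uniform) over predicted mode labels (uniform target assumed)."""
    p = np.bincount(pred_labels, minlength=n_modes) / len(pred_labels)
    q = 1.0 / n_modes  # uniform probability per mode
    nz = p > 0  # 0 * log(0) terms contribute nothing to the sum
    return float(np.sum(p[nz] * np.log(p[nz] / q)))
```

A generator that covers all modes evenly drives both metrics to zero; mode collapse inflates both.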