InfoGAN-CR and ModelCentrality: Self-supervised Model Training and Selection for Disentangling GANs

Authors: Zinan Lin, Kiran Thekumparampil, Giulia Fanti, Sewoong Oh

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For quantitative evaluation, we run experiments on synthetic datasets with pre-defined latent factors, including dSprites (Matthey et al., 2017) and 3DTeapots (Eastwood & Williams, 2018). We evaluate disentanglement using the popular metrics from (Kim & Mnih, 2018; Eastwood & Williams, 2018; Kumar et al., 2017; Ridgeway & Mozer, 2018; Chen et al., 2018; Higgins et al., 2016).
Researcher Affiliation | Academia | Carnegie Mellon University; University of Illinois at Urbana-Champaign; University of Washington.
Pseudocode | Yes | Algorithm 1 (ModelCentrality). Input: N pairs of generative models and latent-code encoders (G1, Q1), ..., (GN, QN), and a supervised disentanglement metric f : encoder × model → R. Output: the estimated best model G. (A sketch of this selection procedure follows the table.)
Open Source Code | Yes | The code for all experiments is available at https://github.com/fjxmlzn/InfoGAN-CR
Open Datasets | Yes | For quantitative evaluation, we run experiments on synthetic datasets with pre-defined latent factors, including dSprites (Matthey et al., 2017) and 3DTeapots (Eastwood & Williams, 2018). We train InfoGAN-CR on the CelebA dataset of 202,599 celebrity facial images.
Dataset Splits | No | The paper mentions hyperparameter tuning and the holdout datasets used in prior work, and it discusses different hyperparameters (λ, α) and batch counts for its own experiments, but it does not provide specific percentages or counts for training/validation/test splits needed for reproduction.
Hardware Specification | No | The paper mentions general resources such as the "Extreme Science and Engineering Discovery Environment (XSEDE)", the "Bridges system", and "AWS cloud computing credits", but it does not provide specific hardware details such as GPU models (e.g., NVIDIA V100), CPU types, or memory specifications.
Software Dependencies | No | The paper discusses various models and frameworks (e.g., GANs, VAEs, InfoGAN) but does not provide a reproducible description of ancillary software with specific version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA 10.x).
Experiment Setup | Yes | For non-negative scalars λ and α, this architecture is trained as min_{G,H,Q} max_D L_Adv(G, D) − λ L_Info(G, Q) − α L_c(G, H) (Eq. 4). For the progressive training curve, we use a contrastive gap of 1.9 for 120,000 batches, and then introduce a (more aggressive) gap of 0. For the no-progressive training curves, we use a gap size of 0 or 1.9 for all 230,400 batches. (A sketch of this objective and gap schedule follows the table.)
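
The ModelCentrality procedure quoted in the Pseudocode row can be sketched as follows. This is a minimal illustration, assuming each trained model in turn serves as a proxy ground truth against which every other model is scored with the supervised metric f; the function name model_centrality and the exact scoring and averaging details are assumptions, not the authors' implementation.

    # Hedged sketch of ModelCentrality (Algorithm 1): pick the model whose
    # average score under all other models' proxy ground truths is highest.
    import numpy as np

    def model_centrality(models, f):
        # models: list of (G_i, Q_i) pairs
        # f(ground_truth_model, candidate_model) -> float  (assumed signature)
        n = len(models)
        scores = np.zeros((n, n))
        for i in range(n):            # model i supplies the proxy ground-truth factors
            for j in range(n):
                if i != j:
                    scores[i, j] = f(models[i], models[j])
        centrality = scores.sum(axis=0) / (n - 1)   # mean score received by each model
        best = int(np.argmax(centrality))
        return models[best], centrality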
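
The combined objective (Eq. 4) and the progressive contrastive-gap schedule quoted in the Experiment Setup row can be sketched as follows. The loss values are placeholders and the helper names are hypothetical; the actual losses, architectures, and training loop are in the authors' repository.

    # Minimal sketch of the InfoGAN-CR objective and progressive gap schedule.
    def generator_objective(L_adv, L_info, L_c, lam, alpha):
        # min_{G,H,Q} max_D  L_Adv(G, D) - lambda * L_Info(G, Q) - alpha * L_c(G, H)
        # From the generator's side, the Info and contrastive terms are maximized,
        # hence subtracted from the adversarial loss.
        return L_adv - lam * L_info - alpha * L_c

    def contrastive_gap(batch_idx, warmup_batches=120_000, initial_gap=1.9, final_gap=0.0):
        # Progressive training: a gap of 1.9 for the first 120,000 batches,
        # then the more aggressive gap of 0 for the remaining batches.
        return initial_gap if batch_idx < warmup_batches else final_gap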