InfoGAN-CR and ModelCentrality: Self-supervised Model Training and Selection for Disentangling GANs
Authors: Zinan Lin, Kiran Thekumparampil, Giulia Fanti, Sewoong Oh
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For quantitative evaluation, we run experiments on synthetic datasets with pre-defined latent factors, including dSprites (Matthey et al., 2017) and 3DTeapots (Eastwood & Williams, 2018). We evaluate disentanglement using the popular metrics from (Kim & Mnih, 2018; Eastwood & Williams, 2018; Kumar et al., 2017; Ridgeway & Mozer, 2018; Chen et al., 2018; Higgins et al., 2016). |
| Researcher Affiliation | Academia | 1Carnegie Mellon University 2University of Illinois at Urbana-Champaign 3University of Washington. |
| Pseudocode | Yes | Algorithm 1 ModelCentrality. Input: N pairs of generative models and latent-code encoders (G1, Q1), ..., (GN, QN), and a supervised disentanglement metric f: generative model × encoder model → ℝ. Output: the estimated best model Ĝ. |
| Open Source Code | Yes | The code for all experiments is available at https://github.com/fjxmlzn/InfoGAN-CR |
| Open Datasets | Yes | For quantitative evaluation, we run experiments on synthetic datasets with pre-defined latent factors, including dSprites (Matthey et al., 2017) and 3DTeapots (Eastwood & Williams, 2018). We train InfoGAN-CR on the CelebA dataset of 202,599 celebrity facial images. |
| Dataset Splits | No | The paper mentions hyperparameter tuning and using holdout datasets for prior work, and discusses different hyperparameters (λ, α) and batch numbers for its own experiments, but it does not provide specific percentages or counts for training/validation/test splits needed for reproduction. |
| Hardware Specification | No | The paper mentions general resources like "Extreme Science and Engineering Discovery Environment (XSEDE)", the "Bridges system", and "AWS cloud computing credits", but it does not provide specific hardware details such as GPU models (e.g., NVIDIA V100), CPU types, or memory specifications. |
| Software Dependencies | No | The paper discusses various models and frameworks (e.g., GANs, VAEs, InfoGAN) but does not provide a reproducible description of ancillary software with specific version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA 10.x). |
| Experiment Setup | Yes | For non-negative scalars λ and α, this architecture is trained as min_{G,H,Q} max_D L_adv(G, D) − λ·L_info(G, Q) − α·L_c(G, H) (Eq. 4). For the progressive training curve, we use a contrastive gap of 1.9 for 120,000 batches, and then introduce a (more aggressive) gap of 0. For the no-progressive-training curves, we use a gap of 0 or 1.9 for all 230,400 batches. |
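The ModelCentrality pseudocode quoted in the table can be sketched as follows. This is a hedged illustration, not the authors' implementation: the `models` and `metric` signatures and the peer-averaging scheme are assumptions; only the input/output description (N generator-encoder pairs and a supervised metric f, returning an estimated best model) comes from the quoted algorithm.

```python
import numpy as np

def model_centrality(models, metric):
    """Select the most 'central' model from N (generator, encoder) pairs.

    models : list of (G, Q) pairs
    metric : callable f(G, Q) -> float, a supervised disentanglement
             metric evaluated on generator G with encoder Q acting as a
             surrogate labeler (a hypothetical interface)
    """
    n = len(models)
    scores = np.zeros((n, n))
    for i, (G_i, _) in enumerate(models):
        for j, (_, Q_j) in enumerate(models):
            if i != j:
                # Treat model j's encoder as a stand-in for ground-truth
                # factors when scoring model i's generator.
                scores[i, j] = metric(G_i, Q_j)
    centrality = scores.sum(axis=1) / (n - 1)  # average score over peers
    best = int(np.argmax(centrality))
    return models[best], centrality
```

With a toy metric that rewards agreement between a generator and an encoder, the pair most consistent with its peers is returned as the estimated best model.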
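The training objective and progressive contrastive-gap schedule from the experiment-setup row can be sketched in a few lines. This is a minimal sketch under assumptions: the function names and the scalar-loss composition are hypothetical; only the batch counts (120,000 and 230,400), the gap values (1.9 and 0), and the sign structure of Eq. 4 are taken from the quoted setup.

```python
def infogan_cr_objective(l_adv, l_info, l_c, lam, alpha):
    """Generator-side quantity of Eq. 4 (a sketch): G, Q, and the
    CR discriminator H minimize this while the GAN discriminator D
    maximizes l_adv. lam and alpha are non-negative scalars."""
    return l_adv - lam * l_info - alpha * l_c

def contrastive_gap(batch_idx, warmup_batches=120_000,
                    initial_gap=1.9, final_gap=0.0):
    """Progressive schedule: a contrastive gap of 1.9 for the first
    120,000 batches, then the more aggressive gap of 0 for the
    remaining batches (training runs for 230,400 batches total)."""
    return initial_gap if batch_idx < warmup_batches else final_gap
```

The no-progressive-training baselines correspond to calling `contrastive_gap` with `initial_gap == final_gap` (fixed at 0 or 1.9 throughout).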