Weakly Supervised Disentanglement with Guarantees
Authors: Rui Shu, Yining Chen, Abhishek Kumar, Stefano Ermon, Ben Poole
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To address this issue, we provide a theoretical framework to assist in analyzing the disentanglement guarantees (or lack thereof) conferred by weak supervision when coupled with learning algorithms based on distribution matching. We empirically verify the guarantees and limitations of several weak supervision methods (restricted labeling, match-pairing, and rank-pairing), demonstrating the predictive power and usefulness of our theoretical framework. |
| Researcher Affiliation | Collaboration | Rui Shu, Yining Chen, Abhishek Kumar, Stefano Ermon & Ben Poole Stanford University, Google Brain {ruishu,cynnjjs,ermon}@stanford.edu {abhishk,pooleb}@google.com |
| Pseudocode | No | The paper does not contain any blocks explicitly labeled as "Pseudocode" or "Algorithm". |
| Open Source Code | Yes | Code available at https://github.com/google-research/google-research/tree/master/weak_disentangle |
| Open Datasets | Yes | We conducted experiments on five prominent datasets in the disentanglement literature: Shapes3D (Kim & Mnih, 2018), dSprites (Higgins et al., 2017), Scream-dSprites (Locatello et al., 2019), SmallNORB (LeCun et al., 2004), and Cars3D (Reed et al., 2015). |
| Dataset Splits | No | The paper mentions using several datasets for experiments but does not explicitly provide details about training, validation, or test splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not specify any particular hardware components such as GPU models, CPU types, or cloud computing instance details used for running the experiments. |
| Software Dependencies | No | The paper mentions using "PyTorch" and "Keras" for initialization schemes but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For all models, we use the Adam optimizer with β1 = 0.5, β2 = 0.999 and set the generator learning rate to 1e-3. We use a batch size of 64 and set the leaky ReLU negative slope to 0.2. Our results are collected over a broad range of hyperparameter configurations (see Appendix H for details). |
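
The hyperparameters quoted in the Experiment Setup row correspond to a standard GAN-style training configuration. The sketch below is a minimal illustration in PyTorch; the generator architecture shown is a hypothetical placeholder, and only the Adam betas, learning rate, batch size, and leaky ReLU slope are taken from the paper (the authors' released code should be consulted for the actual model definitions).

```python
# Minimal sketch of the reported optimizer/activation settings.
# Only the hyperparameter values below come from the paper; the
# generator architecture is a hypothetical stand-in for illustration.
import torch
import torch.nn as nn

BATCH_SIZE = 64            # reported batch size
GEN_LR = 1e-3              # reported generator learning rate
ADAM_BETAS = (0.5, 0.999)  # reported Adam beta1 / beta2
LEAKY_SLOPE = 0.2          # reported leaky ReLU negative slope

# Hypothetical placeholder generator (not the paper's architecture).
generator = nn.Sequential(
    nn.Linear(64, 256),
    nn.LeakyReLU(LEAKY_SLOPE),
    nn.Linear(256, 64 * 64 * 3),
)

optimizer = torch.optim.Adam(
    generator.parameters(), lr=GEN_LR, betas=ADAM_BETAS
)
```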