Chroma-VAE: Mitigating Shortcut Learning with Generative Classifiers
Authors: Wanqian Yang, Polina Kirichenko, Micah Goldblum, Andrew G. Wilson
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of Chroma-VAE on several benchmark datasets (Section 5). In Section 5.1, we present results on the Colored MNIST benchmark... In Section 5.2, we apply Chroma-VAE to two large-scale benchmark datasets (CelebA and MNIST-FashionMNIST Dominoes), as well as a real-world problem involving pneumonia prediction using chest X-ray scans. |
| Researcher Affiliation | Academia | New York University {wanqian, pk1822, goldblum}@nyu.edu, andrewgw@cims.nyu.edu |
| Pseudocode | Yes | Algorithm 1 Chroma-VAE |
| Open Source Code | Yes | Our code is publicly available at https://github.com/Wanqianxn/chroma-vae-public. |
| Open Datasets | Yes | Setup. Following Arjovsky et al. [1], (i) first we binarize MNIST [26] labels... We consider two benchmark CelebA tasks:... We also consider the MF-Dominoes dataset [35]... Likewise, we make a training set (N = 20K) with roughly equal numbers of X-rays from the National Institutes of Health Clinical Center (NIH) dataset [48] and the CheXpert dataset [18]. |
| Dataset Splits | Yes | For Colored MNIST and MF-Dominoes, we follow the same experimental setup as [27]. We then perform a random split with 80% for training and 20% for testing. For CelebA, we use the splits provided by the dataset directly. For the Chest X-ray dataset, we combine NIH [48] and CheXpert [18] to make a training set (N = 20K) and validation set (N = 5K) for training, and a fixed test set (N = 1.6K) for testing. (An illustrative sketch of the 80/20 random split appears below the table.) |
| Hardware Specification | No | The paper states 'We note that the computational resources we used are typical of most research using the same models and datasets' but does not provide specific details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper mentions 'Adam' as an optimization method, citing Kingma and Ba [22], but does not specify any particular software, libraries, or their version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, Python 3.x) used for implementation. |
| Experiment Setup | Yes | Detailed experimental setups can be found in Appendix B. For example, in Section 5.1 (Colored MNIST setup), it mentions 'p_d = 0.25' and 'p_c = 0.1'. In Section 5.2, it discusses using the 'harder setting from Lee et al. [27]' for CelebA and MF-Dominoes. Algorithm 1 also outlines the training procedure, including 'initialize (θ, ϕ, φ) using Adam'. (A simplified sketch of such a joint Adam update appears below the table.) |
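
For concreteness, the 80/20 random split quoted in the Dataset Splits row can be pictured as follows. This is a minimal sketch assuming NumPy; the `random_split` helper, the flat index-array representation, and the fixed seed are illustrative placeholders, not the authors' code (their implementation is in the repository linked above).

```python
import numpy as np

def random_split(indices, train_frac=0.8, seed=0):
    """Shuffle example indices and cut them into train/test subsets."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(indices)
    cut = int(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# 80%/20% split over a hypothetical 20K-example pool, matching the
# proportions described in the Dataset Splits row above.
train_idx, test_idx = random_split(np.arange(20_000))
print(len(train_idx), len(test_idx))  # -> 16000 4000
```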
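
Likewise, the 'initialize (θ, ϕ, φ) using Adam' step quoted from Algorithm 1 amounts to one Adam optimizer spanning the encoder (ϕ), decoder (θ), and latent-subspace classifier (φ) parameter groups. The PyTorch sketch below is a heavily simplified illustration under that assumption: the linear modules, latent sizes, and 1:1 loss weighting are placeholders, and the variational machinery (reparameterization, KL term) is omitted, so it should not be read as the paper's model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder modules standing in for the encoder (phi), decoder (theta),
# and the classifier (varphi) attached to a subspace of the latent code.
encoder = nn.Linear(784, 16)    # phi:    x -> latent code z
decoder = nn.Linear(16, 784)    # theta:  z -> reconstruction of x
classifier = nn.Linear(8, 2)    # varphi: first 8 latent dims -> class logits

# A single Adam optimizer over all three parameter groups.
params = (list(encoder.parameters())
          + list(decoder.parameters())
          + list(classifier.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)

x = torch.rand(32, 784)          # dummy batch of flattened images
y = torch.randint(0, 2, (32,))   # dummy binary labels

# One joint update: reconstruction loss plus classification loss
# on the latent subspace, with placeholder 1:1 weighting.
z = encoder(x)
recon_loss = F.mse_loss(decoder(z), x)
cls_loss = F.cross_entropy(classifier(z[:, :8]), y)
loss = recon_loss + cls_loss

optimizer.zero_grad()
loss.backward()
optimizer.step()
```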