Learning Debiased Representation via Disentangled Feature Augmentation
Authors: Jungsoo Lee, Eungyeup Kim, Juyoung Lee, Jihyeon Lee, Jaegul Choo
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper first presents an empirical analysis revealing that training with diverse bias-conflicting samples beyond a given training set is crucial for both debiasing and generalization. Based on this observation, the authors propose a novel feature-level data augmentation technique to synthesize diverse bias-conflicting samples, achieving state-of-the-art performance against existing baselines on two synthetic datasets (Colored MNIST and Corrupted CIFAR-10) and one real-world dataset (Biased FFHQ). |
| Researcher Affiliation | Collaboration | ¹KAIST AI, ²Kakao Enterprise, South Korea |
| Pseudocode | Yes | Algorithm 1: Debiasing with disentangled feature augmentation (a hedged sketch of its core feature-swapping step appears after this table) |
| Open Source Code | No | The paper does not provide a direct link to open-source code or an explicit statement about its release in the main text. It mentions 'We include the remaining implementation details in Section D in the supplementary material', but this concerns implementation details rather than code availability. |
| Open Datasets | Yes | Colored MNIST is a modified MNIST dataset [13] with a color bias. Corrupted CIFAR-10 applies ten different types of texture bias to the CIFAR-10 [24] dataset... Biased FFHQ (BFFHQ) is curated from the FFHQ dataset [28] |
| Dataset Splits | Yes | By adjusting the number of bias-conflicting data samples in the training set, we obtain four different datasets with the ratio of bias-conflicting samples of 0.5%, 1%, 2%, and 5%. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For training, the batch size is set to 256 for both Colored MNIST and Corrupted CIFAR-10, and to 64 for BFFHQ. Bias-conflicting augmentation is scheduled to begin after 10K iterations for all datasets; this warm-up is mirrored in the usage sketch below. |
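
To make the Pseudocode row concrete, here is a minimal PyTorch sketch of the core step of Algorithm 1: within-batch swapping of bias features, which pairs each intrinsic feature with a bias feature from a different image to synthesize bias-conflicting samples at the feature level. The class and module names (`DisentangledDebiasing`, `enc_i`, `enc_b`, `cls_i`, `cls_b`) are illustrative assumptions, not the authors' released code, and the loss reweighting the full algorithm applies is omitted.

```python
import torch
import torch.nn as nn


class DisentangledDebiasing(nn.Module):
    """Two encoders disentangle intrinsic and bias attributes; each
    classifier predicts from the concatenated feature [z_i; z_b]."""

    def __init__(self, enc_i, enc_b, feat_dim, num_classes):
        super().__init__()
        self.enc_i = enc_i  # intended to capture intrinsic (label-relevant) attributes
        self.enc_b = enc_b  # intended to capture bias attributes
        self.cls_i = nn.Linear(2 * feat_dim, num_classes)
        self.cls_b = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, x, swap=False):
        z_i, z_b = self.enc_i(x), self.enc_b(x)
        if swap:
            # Randomly permute bias features within the batch so each intrinsic
            # feature is paired with a bias feature from a different image,
            # synthesizing diverse bias-conflicting samples at the feature level.
            z_b = z_b[torch.randperm(z_b.size(0), device=z_b.device)]
        z = torch.cat([z_i, z_b], dim=1)
        return self.cls_i(z), self.cls_b(z)
```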
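
Continuing from the sketch above, the following usage example wires in the hyperparameters reported in the Experiment Setup row (batch size 256 for Colored MNIST, swapping enabled only after 10K iterations). The toy MLP encoders and 3×28×28 input shape are assumptions standing in for the paper's actual backbones.

```python
# Toy MLP encoders standing in for the paper's backbones (hypothetical).
def make_encoder():
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 28 * 28, 64), nn.ReLU())


model = DisentangledDebiasing(make_encoder(), make_encoder(),
                              feat_dim=64, num_classes=10)
x = torch.randn(256, 3, 28, 28)  # batch size 256, as reported for Colored MNIST

step = 12_000
# Swapping is gated on the warm-up: bias-conflicting augmentation is
# applied only after 10K iterations, matching the reported schedule.
logits_i, logits_b = model(x, swap=(step >= 10_000))
print(logits_i.shape, logits_b.shape)  # torch.Size([256, 10]) twice
```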