Learning Debiased Classifier with Biased Committee

Authors: Nayeong Kim, Sehyun Hwang, Sungsoo Ahn, Jaesik Park, Suha Kwak

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On five real-world datasets, our method outperforms prior arts using no spurious attribute label like ours and even surpasses those relying on bias labels occasionally. Our code is available at https://github.com/nayeong-v-kim/LWBC.
Researcher Affiliation | Academia | Pohang University of Science and Technology (POSTECH), South Korea. {nayeong.kim, sehyun03, sungsoo.ahn, jaesik.park, suha.kwak}@postech.ac.kr
Pseudocode | Yes | Algorithm 1: Learning a debiased classifier with a biased committee
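The row above only names Algorithm 1. As a rough, hedged illustration of the committee idea behind LWBC (our reading of the title, not the paper's exact procedure or the released code): train m weak classifiers on random subsets, then upweight training samples that most committee members misclassify, since samples a biased committee agrees on getting wrong are likely bias-conflicting. The toy 1-D threshold classifier and every name below are our own invention for illustration.

```python
import random

# Hedged sketch of committee-based sample reweighting (our reading of the
# paper's "biased committee" idea, NOT its exact Algorithm 1): train m weak
# classifiers on random subsets, then weight each training sample by the
# fraction of committee members that misclassify it.

def train_threshold_classifier(samples):
    """Toy 1-D classifier: predict 1 if x exceeds the midpoint of class means."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    if not pos or not neg:  # degenerate subset: fall back to the overall mean
        thr = sum(x for x, _ in samples) / len(samples)
    else:
        thr = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: int(x > thr)

def committee_weights(data, m=30, subset_size=10, seed=0):
    """Weight each (x, y) pair by the fraction of committee members that get it wrong."""
    rng = random.Random(seed)
    committee = [train_threshold_classifier(rng.sample(data, subset_size))
                 for _ in range(m)]
    return [sum(clf(x) != y for clf in committee) / m for x, y in data]

# Toy data: two clean clusters plus two "bias-conflicting" samples at the end;
# the conflicting samples should receive larger weights than the clean ones.
data = ([(i * 0.1, 0) for i in range(10)]
        + [(2 + i * 0.1, 1) for i in range(10)]
        + [(2.5, 0), (0.5, 1)])
weights = committee_weights(data)
```

The defaults m=30 and subset_size=10 echo the committee size and subset size the paper reports for BAR and NICO; everything else here is a toy stand-in for the paper's neural classifiers.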
Open Source Code | Yes | Our code is available at https://github.com/nayeong-v-kim/LWBC.
Open Datasets | Yes | CelebA. CelebA [31] is a dataset for face recognition where each sample is labeled with 40 attributes. ImageNet-9. ImageNet-9 [20] is a subset of ImageNet [35] containing nine super-classes. ImageNet-A. ImageNet-A [17] contains real-world images misclassified by an ImageNet-trained ResNet-50 [15]. BAR. The Biased Action Recognition (BAR) dataset [32] is a real-world image dataset intentionally designed to exhibit spurious correlations between human action and place in its images. NICO. NICO [16] is a real-world dataset for simulating out-of-distribution image classification scenarios.
Dataset Splits | Yes | Following the setting adopted by Bahng et al. [3], we conduct experiments with 54,600 training images and 2,100 validation images. For BAR, we use 10% of the original BAR training set as validation and set the bias-conflicting ratio of the training set to 1%. For NICO, the validation and test sets consist of 7 seen context classes and 3 unseen context classes per object class.
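The BAR split quoted above (10% of the original training set held out as validation) is simple arithmetic; a minimal sketch, with the helper name our own rather than anything from the paper's code:

```python
def split_counts(n_train_orig, val_fraction=0.10):
    """Return (train, val) counts after holding out val_fraction for validation.

    Sketch of the 10% hold-out split described for BAR; the rounding rule is
    our assumption, since the paper does not specify one.
    """
    n_val = round(n_train_orig * val_fraction)
    return n_train_orig - n_val, n_val

# e.g. a 1,000-image training set yields a 900/100 train/val split
train_n, val_n = split_counts(1000)  # -> (900, 100)
```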
Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU model, CPU type) for running its experiments.
Software Dependencies | No | The paper does not provide specific version numbers for software components or libraries (e.g., PyTorch version, CUDA version) needed for replication.
Experiment Setup | Yes | We set the batch size to {64, 64, 128, 256}, learning rate to {1e-3, 1e-3, 1e-4, 6e-3}, the size of the committee m to {30, 30, 30, 40}, the size of subset S_l to {10, 10, 80, 300}, λ to {0.9, 0.6, 0.6, 0.6}, and τ to {1, 1, 1, 2.5}, respectively for {BAR, NICO, ImageNet-9, CelebA}, and α to 0.02. Note that we run LWBC on 3 random seeds and report the average and the standard deviation.
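The positionally aligned brace lists above are easy to misread, so they can be unpacked into one config table per dataset. A minimal sketch: the values are exactly those quoted, but the dict layout and field names (`lr`, `committee_m`, `subset_size` for |S_l|, `lambda_` for λ, `tau` for τ) are our own choices, not from the released code.

```python
# Per-dataset hyperparameters as quoted in the paper's experiment setup;
# field names are our own (lambda_ for λ, tau for τ, subset_size for |S_l|).
CONFIGS = {
    "BAR":        dict(batch_size=64,  lr=1e-3, committee_m=30, subset_size=10,  lambda_=0.9, tau=1.0),
    "NICO":       dict(batch_size=64,  lr=1e-3, committee_m=30, subset_size=10,  lambda_=0.6, tau=1.0),
    "ImageNet-9": dict(batch_size=128, lr=1e-4, committee_m=30, subset_size=80,  lambda_=0.6, tau=1.0),
    "CelebA":     dict(batch_size=256, lr=6e-3, committee_m=40, subset_size=300, lambda_=0.6, tau=2.5),
}
ALPHA = 0.02   # α, shared across all four datasets
NUM_SEEDS = 3  # results are averaged over 3 random seeds (mean ± std reported)
```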