reproducibilityindex.ai

Learning Debiased Classifier with Biased Committee

Authors: Nayeong Kim, Sehyun Hwang, Sungsoo Ahn, Jaesik Park, Suha Kwak

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	On five real-world datasets, our method outperforms prior arts using no spurious attribute label like ours and even surpasses those relying on bias labels occasionally. Our code is available at https://github.com/nayeong-v-kim/LWBC.
Researcher Affiliation	Academia	Pohang University of Science and Technology (POSTECH), South Korea {nayeong.kim, sehyun03, sungsoo.ahn, jaesik.park, suha.kwak}@postech.ac.kr
Pseudocode	Yes	Algorithm 1 Learning a debiased classifier with a biased committee
Open Source Code	Yes	Our code is available at https://github.com/nayeong-v-kim/LWBC.
Open Datasets	Yes	CelebA. CelebA [31] is a dataset for face recognition where each sample is labeled with 40 attributes. ImageNet-9. ImageNet-9 [20] is a subset of ImageNet [35] containing nine super-classes. ImageNet-A. ImageNet-A [17] contains real-world images misclassified by an ImageNet-trained ResNet50 [15]. BAR. The Biased Action Recognition (BAR) dataset [32] is a real-world image dataset intentionally designed to exhibit spurious correlations between human action and place on its images. NICO. NICO [16] is a real-world dataset for simulating out-of-distribution image classification scenarios.
Dataset Splits	Yes	Following the setting adopted by Bahng et al. [3], we conduct experiments with 54,600 training images and 2,100 validation images. In our setting, we use 10% of the original BAR training set as validation and set the bias-conflicting ratio of the training set to 1%. The validation and test sets consist of 7 seen context classes and 3 unseen context classes per object class.
Hardware Specification	No	The paper does not explicitly describe the specific hardware used (e.g., GPU model, CPU type) for running its experiments.
Software Dependencies	No	The paper does not provide specific version numbers for software components or libraries (e.g., PyTorch version, CUDA version) needed for replication.
Experiment Setup	Yes	We set the batch size to {64, 64, 128, 256}, learning rate to {1e-3, 1e-3, 1e-4, 6e-3}, the size of the committee m to {30, 30, 30, 40}, the size of subset Sl to {10, 10, 80, 300}, λ to {0.9, 0.6, 0.6, 0.6}, and τ to {1, 1, 1, 2.5}, respectively for {BAR, NICO, ImageNet-9, CelebA}, and α to 0.02. Note that we run LWBC on 3 random seeds and report the average and the standard deviation.
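
The positional sets in the Experiment Setup row are easier to check when unpacked per dataset. The following is a minimal sketch of one way to do that, not code from the LWBC repository: the names `LWBCConfig`, `CONFIGS`, and `summarize` are hypothetical, and the use of the sample standard deviation is an assumption, since the quoted text does not say which estimator the authors report.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)
class LWBCConfig:
    """Hypothetical container for the hyperparameters quoted in the table."""
    batch_size: int
    learning_rate: float
    committee_size: int   # m, the size of the committee
    subset_size: int      # size of subset Sl
    lam: float            # lambda
    tau: float            # tau
    alpha: float = 0.02   # alpha, shared across all four datasets


# Values are read off the quoted sets in the order
# {BAR, NICO, ImageNet-9, CelebA}.
CONFIGS = {
    "BAR":        LWBCConfig(64, 1e-3, 30, 10, 0.9, 1.0),
    "NICO":       LWBCConfig(64, 1e-3, 30, 10, 0.6, 1.0),
    "ImageNet-9": LWBCConfig(128, 1e-4, 30, 80, 0.6, 1.0),
    "CelebA":     LWBCConfig(256, 6e-3, 40, 300, 0.6, 2.5),
}


def summarize(scores):
    """Return (average, standard deviation) over per-seed scores.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return mean(scores), stdev(scores)


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        print(name, cfg)
```

Given three per-seed accuracies from a run, `summarize` would produce the "average and standard deviation" pair the row describes, under the estimator assumption stated above.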