Learning Robust Classifiers with Self-Guided Spurious Correlation Mitigation
Authors: Guangtao Zheng, Wenqian Ye, Aidong Zhang
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that training the classifier to distinguish different prediction behaviors reduces its reliance on spurious correlations without knowing them a priori and outperforms prior methods on five real-world datasets. We demonstrate that LBC achieves the best performance on five real-world datasets where spurious correlations are unknown or unavailable. 5 Experiment 5.1 Datasets Waterbirds [Sagawa et al., 2019] is a dataset for recognizing waterbirds and landbirds. ... CelebA [Liu et al., 2015] ... ImageNet-9 [Xiao et al., 2021] ... ImageNet-A [Hendrycks et al., 2021] ... NICO [He et al., 2021] ... Table 1: Worst-group and average accuracy (%) comparison with state-of-the-art methods on the CelebA and Waterbirds datasets. 5.6 Ablation Study We first analyzed the effectiveness of the four proposed components... |
| Researcher Affiliation | Academia | Guangtao Zheng , Wenqian Ye and Aidong Zhang University of Virginia {gz5hp, pvc7hs, aidong}@virginia.edu |
| Pseudocode | Yes | The whole procedure is listed in Algorithm 1 in Appendix. |
| Open Source Code | Yes | Our code is available at https://github.com/gtzheng/LBC. |
| Open Datasets | Yes | Waterbirds [Sagawa et al., 2019] is a dataset for recognizing waterbirds and landbirds... CelebA [Liu et al., 2015] is a large-scale image dataset of celebrity faces... ImageNet-9 [Xiao et al., 2021] comprises images with different background and foreground signals... ImageNet-A [Hendrycks et al., 2021] is a dataset of real-world images... NICO [He et al., 2021] is designed for non-independent and identically distributed and out-of-distribution image classification... |
| Dataset Splits | Yes | Given a validation set Dval without group labels, we develop a selection metric called pseudo unbiased validation accuracy Accpseudo unbiased to select the best model during training. |
| Hardware Specification | Yes | All experiments are conducted on NVIDIA RTX 8000 GPUs. |
| Software Dependencies | No | The paper mentions using "Spacy" and a "vision transformer" and "GPT-2 language model" but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We set the learning rate to 0.001, which decays following a cosine annealing scheduler, and use an SGD optimizer with 0.9 momentum and 10⁻⁴ weight decay. Then, we use the ERM-trained models as the initial models for our LBC training. Standard data augmentations are used in LBC to effectively mitigate spurious correlations that are not typically captured by VLMs, such as sizes and orientations. For all the datasets, we fix the learning rate to 0.0001 and the batch size to 128. We sample 20 batches per epoch and train for 50 epochs. The cluster size K is set to 3. |
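The reported setup (learning rate 0.001 with cosine annealing, SGD with momentum 0.9 and 10⁻⁴ weight decay) can be sketched in plain Python. This is a minimal, framework-free illustration of those hyperparameters, not the authors' implementation; the function names and the scalar-weight update are hypothetical stand-ins.

```python
import math

def cosine_annealed_lr(step, total_steps, lr_init=1e-3, lr_min=0.0):
    """Cosine annealing as reported: lr_init at step 0, decaying to lr_min."""
    return lr_min + 0.5 * (lr_init - lr_min) * (
        1.0 + math.cos(math.pi * step / total_steps)
    )

def sgd_step(w, grad, velocity, lr, momentum=0.9, weight_decay=1e-4):
    """One SGD update with momentum 0.9 and L2 weight decay 1e-4,
    matching the paper's stated optimizer settings (scalar toy version)."""
    g = grad + weight_decay * w   # L2 weight decay folded into the gradient
    v = momentum * velocity + g   # momentum accumulation
    return w - lr * v, v
```

With 50 epochs of 20 sampled batches each, the schedule would span 1000 steps, at which point the cosine schedule reaches its minimum learning rate.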