Unsupervised Concept Discovery Mitigates Spurious Correlations

Authors: Md Rifat Arefin, Yan Zhang, Aristide Baratin, Francesco Locatello, Irina Rish, Dianbo Liu, Kenji Kawaguchi

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evaluation across the benchmark datasets for sub-population shifts demonstrates superior or competitive performance compared to state-of-the-art baselines, without the need for group annotation. Code is available at https://github.com/rarefin/CoBalT
Researcher Affiliation | Collaboration | 1 Mila, University of Montreal, Canada; 2 Samsung SAIT AI Lab, Montreal, Canada; 3 Institute of Science and Technology Austria; 4 National University of Singapore. Correspondence to: Md Rifat Arefin <rifat.arefin@mila.quebec>.
Pseudocode | Yes | Algorithm 1: Batch Sampling Strategy
Open Source Code | Yes | Code is available at https://github.com/rarefin/CoBalT
Open Datasets | Yes | We train our model using the following publicly available datasets: CMNIST (Alain et al., 2015), CelebA (Liu et al., 2014), Waterbirds (Sagawa et al., 2020), UrbanCars (Li et al., 2023), Background Challenge ImageNet-9 (Xiao et al., 2021).
Dataset Splits | Yes | In our previous evaluations, we selected the model by early stopping based on the worst-group validation performance, with the groups being inferred on the validation data by our proposed method.
Hardware Specification | Yes | All experiments were performed with NVIDIA A100 and V100 GPUs.
Software Dependencies | No | The paper mentions software like "PyTorch library (Paszke et al., 2019)" and optimizers like "Adam" and "SGD" but does not specify their version numbers.
Experiment Setup | Yes | For training the concept discovery model, we use Adam (Kingma & Ba, 2015) as an optimizer with a learning rate of 2e-4 and a weight decay of 5e-4 for 50 epochs with a batch size of 128. The same configuration is used for all datasets, except CMNIST and CelebA, which are trained for 20 epochs. For CMNIST, the batch size is 32. We train classification models using SGD with 0.9 momentum for all datasets. The learning rate is 1e-4, except for CMNIST (1e-3). A weight decay of 0.1 is applied to Waterbirds, CelebA, and UrbanCars. Training epochs: Waterbirds (300), CelebA (60), IN-9L (100), UrbanCars (300), and CMNIST (20). The batch size is 128 for all datasets, except CMNIST (32).
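
The Pseudocode row points to Algorithm 1, a batch sampling strategy. The paper's exact algorithm is not reproduced here; the following is a minimal sketch, assuming the sampler balances each batch across the concept clusters discovered without group labels. The `BalancedBatchSampler` class name and the `cluster_ids` array are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from collections import defaultdict
from torch.utils.data import Sampler

class BalancedBatchSampler(Sampler):
    """Yield index batches that draw roughly equally from every inferred cluster.

    `cluster_ids` is a 1-D array assigning each training example to a
    discovered concept cluster. Hypothetical sketch, not the paper's
    exact Algorithm 1.
    """

    def __init__(self, cluster_ids, batch_size, num_batches):
        self.batch_size = batch_size
        self.num_batches = num_batches
        # Group example indices by their (inferred) cluster id.
        self.groups = defaultdict(list)
        for idx, c in enumerate(cluster_ids):
            self.groups[int(c)].append(idx)

    def __iter__(self):
        clusters = list(self.groups)
        per_cluster = max(1, self.batch_size // len(clusters))
        for _ in range(self.num_batches):
            batch = []
            for c in clusters:
                # Sample with replacement so small clusters are never exhausted.
                batch.extend(int(i) for i in
                             np.random.choice(self.groups[c], per_cluster, replace=True))
            yield batch[: self.batch_size]

    def __len__(self):
        return self.num_batches
```

An instance of this sampler would be passed as the `batch_sampler` argument of `torch.utils.data.DataLoader` so that each training batch is balanced over the inferred concepts.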
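
The Dataset Splits row describes model selection by early stopping on worst-group validation performance, with the groups inferred by the method rather than annotated. Worst-group accuracy is simply the minimum per-group accuracy; a small sketch, assuming predictions, labels, and inferred group ids are available as NumPy arrays (the function name and signature are illustrative):

```python
import numpy as np

def worst_group_accuracy(preds, labels, group_ids):
    """Minimum accuracy over groups: the early-stopping criterion described above."""
    accs = []
    for g in np.unique(group_ids):
        mask = group_ids == g
        accs.append((preds[mask] == labels[mask]).mean())
    return min(accs)

# Early stopping keeps the checkpoint with the highest worst-group accuracy
# on the validation split, where group_ids are the inferred groups.
```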
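
The Experiment Setup row lists per-dataset optimizer hyperparameters. A hedged sketch of how those settings could be wired up in PyTorch: only the learning rates, weight decays, momentum, batch sizes, and epoch counts come from the quoted setup; the placeholder modules and dictionary names are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the paper's actual architectures.
concept_model = nn.Linear(512, 64)
classifier = nn.Linear(512, 2)

# Concept discovery model: Adam, lr 2e-4, weight decay 5e-4 (50 epochs in general,
# 20 for CMNIST and CelebA, per the quoted setup).
concept_opt = torch.optim.Adam(concept_model.parameters(), lr=2e-4, weight_decay=5e-4)

# Classification model: SGD with momentum 0.9; lr 1e-4 (1e-3 for CMNIST);
# weight decay 0.1 on Waterbirds / CelebA / UrbanCars, otherwise 0.
classifier_opt = torch.optim.SGD(classifier.parameters(), lr=1e-4,
                                 momentum=0.9, weight_decay=0.1)

# Per-dataset epochs and batch sizes quoted above.
EPOCHS = {"Waterbirds": 300, "CelebA": 60, "IN-9L": 100, "UrbanCars": 300, "CMNIST": 20}
BATCH_SIZE = {"default": 128, "CMNIST": 32}
```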