Unsupervised Concept Discovery Mitigates Spurious Correlations

Authors: Md Rifat Arefin, Yan Zhang, Aristide Baratin, Francesco Locatello, Irina Rish, Dianbo Liu, Kenji Kawaguchi

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evaluation across the benchmark datasets for sub-population shifts demonstrates superior or competitive performance compared to state-of-the-art baselines, without the need for group annotation. Code is available at https://github.com/rarefin/CoBalT
Researcher Affiliation | Collaboration | 1 Mila, University of Montreal, Canada; 2 Samsung SAIT AI Lab, Montreal, Canada; 3 Institute of Science and Technology Austria; 4 National University of Singapore. Correspondence to: Md Rifat Arefin <rifat.arefin@mila.quebec>.
Pseudocode | Yes | Algorithm 1: Batch Sampling Strategy
Open Source Code | Yes | Code is available at https://github.com/rarefin/CoBalT
Open Datasets | Yes | We train our model using the following publicly available datasets: CMNIST (Alain et al., 2015), CelebA (Liu et al., 2014), Waterbirds (Sagawa et al., 2020), UrbanCars (Li et al., 2023), Background Challenge ImageNet-9 (Xiao et al., 2021).
Dataset Splits | Yes | In our previous evaluations, we selected the model by early stopping based on the worst-group validation performance, with the groups being inferred on the validation data by our proposed method.
Hardware Specification | Yes | All experiments were performed with NVIDIA A100 and V100 GPUs.
Software Dependencies | No | The paper mentions software like "PyTorch library (Paszke et al., 2019)" and optimizers like "Adam" and "SGD" but does not specify their version numbers.
Experiment Setup | Yes | For training the concept discovery model, we use Adam (Kingma & Ba, 2015) as an optimizer with a learning rate of 2e-4 and a weight decay of 5e-4 for 50 epochs with a batch size of 128. The same configuration is used for all datasets, except CMNIST and CelebA, which are trained for 20 epochs. For CMNIST, the batch size is 32. We train classification models using SGD with 0.9 momentum for all datasets. The learning rate is 1e-4, except for CMNIST (1e-3). A weight decay of 0.1 is applied to Waterbirds, CelebA, and UrbanCars. Training epochs: Waterbirds (300), CelebA (60), IN-9L (100), UrbanCars (300), and CMNIST (20). The batch size is 128 for all datasets, except CMNIST (32).
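
The Pseudocode row points to Algorithm 1, a batch sampling strategy. The paper's exact algorithm is not reproduced here; the following is a minimal sketch, assuming the sampler balances each batch across the concept clusters discovered without group labels. The `BalancedBatchSampler` class name and the `cluster_ids` array are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from collections import defaultdict
from torch.utils.data import Sampler

class BalancedBatchSampler(Sampler):
    """Yield index batches that draw roughly equally from every inferred cluster.

    `cluster_ids` is a 1-D array assigning each training example to a
    discovered concept cluster. Hypothetical sketch, not the paper's
    exact Algorithm 1.
    """

    def __init__(self, cluster_ids, batch_size, num_batches):
        self.batch_size = batch_size
        self.num_batches = num_batches
        # Group example indices by their (inferred) cluster id.
        self.groups = defaultdict(list)
        for idx, c in enumerate(cluster_ids):
            self.groups[int(c)].append(idx)

    def __iter__(self):
        clusters = list(self.groups)
        per_cluster = max(1, self.batch_size // len(clusters))
        for _ in range(self.num_batches):
            batch = []
            for c in clusters:
                # Sample with replacement so small clusters are never exhausted.
                batch.extend(int(i) for i in
                             np.random.choice(self.groups[c], per_cluster, replace=True))
            yield batch[: self.batch_size]

    def __len__(self):
        return self.num_batches
```

An instance of this sampler would be passed as the `batch_sampler` argument of `torch.utils.data.DataLoader` so that each training batch is balanced over the inferred concepts.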
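
The Dataset Splits row describes model selection by early stopping on worst-group validation performance, with the groups inferred by the method rather than annotated. Worst-group accuracy is simply the minimum per-group accuracy; a small sketch, assuming predictions, labels, and inferred group ids are available as NumPy arrays (the function name and signature are illustrative):

```python
import numpy as np

def worst_group_accuracy(preds, labels, group_ids):
    """Minimum accuracy over groups: the early-stopping criterion described above."""
    accs = []
    for g in np.unique(group_ids):
        mask = group_ids == g
        accs.append((preds[mask] == labels[mask]).mean())
    return min(accs)

# Early stopping keeps the checkpoint with the highest worst-group accuracy
# on the validation split, where group_ids are the inferred groups.
```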
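
The Experiment Setup row lists per-dataset optimizer hyperparameters. A hedged sketch of how those settings could be wired up in PyTorch: only the learning rates, weight decays, momentum, batch sizes, and epoch counts come from the quoted setup; the placeholder modules and dictionary names are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Placeholder modules standing in for the paper's actual architectures.
concept_model = nn.Linear(512, 64)
classifier = nn.Linear(512, 2)

# Concept discovery model: Adam, lr 2e-4, weight decay 5e-4 (50 epochs in general,
# 20 for CMNIST and CelebA, per the quoted setup).
concept_opt = torch.optim.Adam(concept_model.parameters(), lr=2e-4, weight_decay=5e-4)

# Classification model: SGD with momentum 0.9; lr 1e-4 (1e-3 for CMNIST);
# weight decay 0.1 on Waterbirds / CelebA / UrbanCars, otherwise 0.
classifier_opt = torch.optim.SGD(classifier.parameters(), lr=1e-4,
                                 momentum=0.9, weight_decay=0.1)

# Per-dataset epochs and batch sizes quoted above.
EPOCHS = {"Waterbirds": 300, "CelebA": 60, "IN-9L": 100, "UrbanCars": 300, "CMNIST": 20}
BATCH_SIZE = {"default": 128, "CMNIST": 32}
```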