reproducibilityindex.ai

Discover and Cure: Concept-aware Mitigation of Spurious Correlation

Authors: Shirley Wu, Mert Yuksekgonul, Linjun Zhang, James Zou

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Across systematic experiments, DISC provides superior generalization ability and interpretability than the existing approaches. Specifically, it outperforms the state-of-the-art methods on an object recognition task and a skin-lesion classification task by 7.5% and 9.6%, respectively.
Researcher Affiliation	Academia	Department of Computer Science, Stanford University; Department of Statistics, Rutgers University.
Pseudocode	Yes	Algorithm 1 Pseudocode of DISC
Open Source Code	Yes	Code and data are available at https://github.com/Wuyxin/DISC.
Open Datasets	Yes	We summarize the datasets in Appendix C. We consider image classification tasks with various types of spurious correlations. Specifically, Waterbirds (Sagawa et al., 2020) associates each class with water or land backgrounds, and MetaShift (Liang & Zou, 2022) constructs disjoint spurious attributes for each class. We also use FMoW from the WILDS benchmark (Koh et al., 2021), where satellite images are collected from different geographical regions that contribute to potential spurious correlations. Moreover, we consider a challenging task, ISIC (Codella et al., 2019), which classifies dermoscopic images of skin lesions into benign or melanoma.
Dataset Splits	Yes	Appendix C. Datasets... MetaShift: # Train data: 231 / 380 / 145 / 367; # Val data (OOD): 34 / 47; # Test data: 201 / 259... Waterbirds: # Train data: 3,498 (73%) / 184 (4%) / 56 (1%) / 1,057 (22%); # Val data: 467 / 466 / 133 / 133; # Test data: 2,255 / 2,255 / 642 / 642... FMoW: # Train data: 34,816 / 17,809 / 20,973 / 1,582 / 1,641; # Val data: 7,732 / 4,121 / 6,562 / 803 / 693; # Test data: 5,858 / 4,963 / 8,024 / 2,593 / 666... ISIC: # Train data: 1,826; # Val data: 154; # Test data: 618
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, or cloud compute instances) used to run the experiments.
Software Dependencies	Yes	All the concept images are synthetic and generated by the Stable Diffusion model with the pretrained weights stable-diffusion-v1-4
Experiment Setup	Yes	We summarize the hyperparameters in Appendix E and use Gaussian Mixture Model (GMM) as the clustering algorithm. ... Table 5. Hyper-parameters of DISC during training (Learning Rate / Batch Size / Weight Decay / #Clusters per Class). MetaShift: 5e-4 / 16 / 1e-4 / 2; Waterbirds: 1e-4 / 32 / 1e-4 / 3; FMoW: 1e-4 / 10 / 0.0 / 3; ISIC: 5e-4 / 16 / 1e-5 / 3. For the Beta distribution, we use α = µ = 2 in all the datasets.
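
For anyone attempting a rerun, the Table 5 values quoted in the Experiment Setup row could be collected into a small lookup, as in the sketch below. The class name `DiscHyperparams`, the dictionary `TABLE_5`, and the helper `get_hyperparams` are hypothetical names chosen for illustration; they are not taken from the DISC repository, which may organize its configuration differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscHyperparams:
    """One row of Table 5 (hypothetical container, not from the DISC code)."""
    learning_rate: float
    batch_size: int
    weight_decay: float
    clusters_per_class: int


# Values transcribed from Table 5 as quoted in the Experiment Setup row.
TABLE_5 = {
    "MetaShift": DiscHyperparams(5e-4, 16, 1e-4, 2),
    "Waterbirds": DiscHyperparams(1e-4, 32, 1e-4, 3),
    "FMoW": DiscHyperparams(1e-4, 10, 0.0, 3),
    "ISIC": DiscHyperparams(5e-4, 16, 1e-5, 3),
}


def get_hyperparams(dataset: str) -> DiscHyperparams:
    """Return the Table 5 row for a dataset, failing loudly on unknown names."""
    try:
        return TABLE_5[dataset]
    except KeyError:
        raise KeyError(f"no Table 5 entry for dataset {dataset!r}") from None


if __name__ == "__main__":
    for name, hp in TABLE_5.items():
        print(name, hp)
```

Keeping the rows in a frozen dataclass makes accidental edits to the reported values harder and keeps each field named, which helps when comparing a rerun against the paper's settings.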