Understanding Domain Generalization: A Noise Robustness Perspective

Authors: Rui Qiao, Bryan Kian Hsiang Low

ICLR 2024

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | However, additional comprehensive experiments on real-world benchmark datasets indicate that label-noise robustness does not necessarily translate to better performance compared to ERM.
Researcher Affiliation | Academia | Rui Qiao and Bryan Kian Hsiang Low, Department of Computer Science, National University of Singapore; rui.qiao@u.nus.edu, lowkh@comp.nus.edu.sg
Pseudocode | No | The paper describes algorithms and their derivations but does not include explicit pseudocode blocks or figures labeled 'Algorithm'.
Open Source Code | Yes | Our code is available at https://github.com/qiaoruiyt/NoiseRobustDG
Open Datasets | Yes | Subpopulation shifts (Synthetic). CMNIST (Colored MNIST) (Arjovsky et al., 2019) is a synthetic binary classification dataset based on MNIST (LeCun, 1998) with class labels {0, 1}. ... Waterbirds (Wah et al., 2011; Sagawa et al., 2019) is a binary bird-type classification dataset adapted from the CUB dataset (Wah et al., 2011) and the Places dataset (Zhou et al., 2017). ... CelebA (Liu et al., 2015; Sagawa et al., 2019) is a binary hair-color prediction dataset adapted from Liu et al. (2015). ... CivilComments (Borkan et al., 2019) is a binary text toxicity classification task. We used the version implemented in WILDS (Koh et al., 2021). ... PACS (Li et al., 2017) is a 7-class image classification dataset ... VLCS (Fang et al., 2013) is a 5-class image dataset ... Office-Home (Venkateswara et al., 2017) is a 65-class image classification dataset ... Terra Incognita (Beery et al., 2018) is a 10-class wild-animal classification dataset
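The CMNIST construction quoted above (binary labels derived from MNIST digits, with a color channel spuriously correlated with the label) can be sketched roughly as follows, in the spirit of Arjovsky et al. (2019). The function name, argument names, and default flip probability are illustrative assumptions, not the authors' code.

```python
import numpy as np

def make_cmnist_env(images, digits, color_flip_prob, label_flip_prob=0.25, seed=0):
    """Build one Colored-MNIST-style environment (illustrative sketch).

    images: (N, H, W) grayscale images; digits: (N,) digit labels 0-9.
    """
    rng = np.random.default_rng(seed)
    # Binary label: digit < 5 -> 0, digit >= 5 -> 1
    y = (digits >= 5).astype(int)
    # Inject label noise by flipping a fraction of the labels
    y = np.where(rng.random(len(y)) < label_flip_prob, 1 - y, y)
    # Color agrees with the (noisy) label except with probability color_flip_prob
    color = np.where(rng.random(len(y)) < color_flip_prob, 1 - y, y)
    # Duplicate the grayscale image into two color channels, then zero out
    # the channel that does not match the assigned color
    colored = np.stack([images, images], axis=1)
    colored[np.arange(len(y)), 1 - color] = 0
    return colored, y

# Deterministic usage: no label noise, color perfectly correlated with label
imgs = np.ones((8, 2, 2))
x, y = make_cmnist_env(imgs, np.arange(8), color_flip_prob=0.0, label_flip_prob=0.0)
```

With both flip probabilities at zero, digits 5-7 get label 1 and only the channel matching the label stays nonzero.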
Dataset Splits | Yes | We assume the availability of a small validation set from the test distribution for model selection. ... For the real-world subpopulation-shift datasets, we report the worst-group (WG) accuracy of the model on the test set selected according to the validation WG performance.
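As a concrete illustration of the worst-group (WG) metric used for model selection above, the sketch below computes per-group accuracy and reports the minimum. The array layout and function name are illustrative, not taken from the paper's code.

```python
import numpy as np

def worst_group_accuracy(preds, labels, groups):
    """Return (worst-group accuracy, dict of per-group accuracies)."""
    accs = {}
    for g in np.unique(groups):
        mask = groups == g
        accs[int(g)] = float((preds[mask] == labels[mask]).mean())
    return min(accs.values()), accs

# Example: group 1 is a minority group with lower accuracy
preds  = np.array([1, 1, 0, 1, 0, 0])
labels = np.array([1, 1, 0, 0, 0, 1])
groups = np.array([0, 0, 0, 1, 1, 1])
wg, per_group = worst_group_accuracy(preds, labels, groups)
```

Here group 0 is classified perfectly while group 1 gets 1 of 3 correct, so the WG accuracy is 1/3 even though average accuracy is 2/3.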
Hardware Specification | No | The paper states that ResNet-50 was used, pretrained on ImageNet, but does not provide specific hardware details such as GPU or CPU models, or memory specifications.
Software Dependencies | No | The paper mentions using the 'DomainBed framework (Gulrajani & Lopez-Paz, 2020)' and 'PyTorch', but does not specify version numbers for these or other software dependencies.
Experiment Setup | Yes | We train all models for 5000 steps. ... for CMNIST, and 16 for the remaining real-world datasets. We use a simple Convolutional Neural Network (CNN) for CMNIST, and a ResNet-50 pretrained on ImageNet for the remaining datasets. For all datasets, we add 10% or 25% label noise to test the performance of different algorithms. ... For Waterbirds, CelebA, and their derivative Waterbirds+, CelebA+ datasets, we perform standard data augmentation according to Sagawa et al. (2019) using PyTorch (this is slightly different from the default augmentation implemented by DomainBed). ... For the domain-shift datasets, we adopt DomainBed's data augmentation (Gulrajani & Lopez-Paz, 2020). ... Then, we report the average accuracy of the models across all environments, selected using 20% of the test data without early stopping.
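The 10%/25% label-noise setting described in the setup can be reproduced in spirit with a simple symmetric-noise routine like the one below. The function name and the uniform-flip choice are assumptions for illustration, not the authors' exact implementation.

```python
import numpy as np

def add_label_noise(labels, noise_rate, num_classes, seed=0):
    """Flip a `noise_rate` fraction of labels to a uniformly chosen
    *different* class (symmetric label noise). Returns a noisy copy."""
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    n = len(noisy)
    # Pick distinct indices to corrupt
    idx = rng.choice(n, size=int(noise_rate * n), replace=False)
    for i in idx:
        choices = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(choices)
    return noisy

# Usage: 25% noise on 100 binary labels (all originally class 0)
labels = np.zeros(100, dtype=int)
noisy = add_label_noise(labels, noise_rate=0.25, num_classes=2)
```

Because every corrupted label must change class, exactly 25 of the 100 labels end up flipped to class 1, while the original array is left untouched.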