Understanding Domain Generalization: A Noise Robustness Perspective
Authors: Rui Qiao, Bryan Kian Hsiang Low
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | However, additional comprehensive experiments on real-world benchmark datasets indicate that label-noise robustness does not necessarily translate to better performance compared to ERM. |
| Researcher Affiliation | Academia | Rui Qiao and Bryan Kian Hsiang Low Department of Computer Science, National University of Singapore rui.qiao@u.nus.edu,lowkh@comp.nus.edu.sg |
| Pseudocode | No | The paper describes algorithms and their derivations but does not include explicit pseudocode blocks or figures labeled 'Algorithm'. |
| Open Source Code | Yes | Our code is available at https://github.com/qiaoruiyt/NoiseRobustDG |
| Open Datasets | Yes | Subpopulation shifts (Synthetic). CMNIST (Colored MNIST) (Arjovsky et al., 2019) is a synthetic binary classification dataset based on MNIST (LeCun, 1998) with class label {0, 1}. ... Waterbirds (Wah et al., 2011; Sagawa et al., 2019) is a binary bird-type classification dataset adapted from the CUB dataset (Wah et al., 2011) and the Places dataset (Zhou et al., 2017). ... CelebA (Liu et al., 2015; Sagawa et al., 2019) is a binary hair color prediction dataset adapted from Liu et al. (2015). ... CivilComments (Borkan et al., 2019) is a binary text toxicity classification task. We used the version implemented in WILDS (Koh et al., 2021). ... PACS (Li et al., 2017) is a 7-class image classification dataset ... VLCS (Fang et al., 2013) is a 5-class image dataset ... Office-Home (Venkateswara et al., 2017) is a 65-class image classification dataset ... Terra Incognita (Beery et al., 2018) is a 10-class wild-animal classification dataset |
| Dataset Splits | Yes | We assume the availability of a small validation set from the test distribution for model selection. ... For the real-world subpopulation-shift datasets, we report the worst-group (WG) accuracy of the model on the test set selected according to the validation WG performance. |
| Hardware Specification | No | The paper states that ResNet-50 was used, pretrained on ImageNet, but does not provide specific hardware details such as GPU or CPU models, or memory specifications. |
| Software Dependencies | No | The paper mentions using the 'DomainBed framework (Gulrajani & Lopez-Paz, 2020)' and 'PyTorch', but does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We train all models for 5000 steps. ... for CMNIST, and 16 for the remaining real-world datasets. We use a simple Convolutional Neural Network (CNN) for CMNIST, and a ResNet-50 pretrained on ImageNet for the remaining datasets. For all datasets, we add 10% and 25% label noise to test the performance of different algorithms. ... For Waterbirds, CelebA, and their derivative Waterbirds+, CelebA+ datasets, we perform standard data augmentation according to Sagawa et al. (2019) using PyTorch (this is slightly different from the default augmentation implemented by DomainBed). ... For the domain-shift datasets, we adopt DomainBed's data augmentation (Gulrajani & Lopez-Paz, 2020). ... Then, we report the average accuracy of the models across all environments, selected using 20% of the test data without early stopping. |
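The setup row states that 10% or 25% label noise is added to every dataset, but this excerpt does not pin down the exact noise model. The sketch below assumes symmetric label noise, where a chosen fraction of examples have their label flipped uniformly to a different class; the function name and signature are illustrative, not from the paper's code.

```python
import random

def inject_label_noise(labels, noise_rate, num_classes, seed=0):
    """Flip a fraction `noise_rate` of labels to a different class,
    chosen uniformly at random (symmetric label noise).

    Assumption: the paper's exact noise model is not given in this
    excerpt; symmetric flipping is a common default."""
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = int(round(noise_rate * len(labels)))
    for i in rng.sample(range(len(labels)), n_flip):
        # Pick any class other than the current (possibly clean) label.
        choices = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(choices)
    return noisy
```

Fixing the seed makes the corrupted labels reproducible across runs, which matters when comparing algorithms on the same noisy split.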
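For the subpopulation-shift benchmarks, the protocol above selects models by worst-group (WG) accuracy on a validation set and reports WG accuracy on the test set. A minimal sketch of that metric (the helper name is ours, not the paper's):

```python
from collections import defaultdict

def worst_group_accuracy(preds, labels, groups):
    """Minimum per-group accuracy: the metric used both for model
    selection (on validation) and reporting (on test) under
    subpopulation shift."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for p, y, g in zip(preds, labels, groups):
        total[g] += 1
        correct[g] += int(p == y)
    # The worst group dominates the score, so a model that ignores
    # minority groups scores poorly even with high average accuracy.
    return min(correct[g] / total[g] for g in total)
```

Unlike average accuracy, this score is insensitive to group sizes, which is why it is the standard selection criterion on Waterbirds, CelebA, and CivilComments.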