Unraveling the Key Components of OOD Generalization via Diversification
Authors: Harold Luc Benoit, Liangze Jiang, Andrei Atanov, Oguzhan Fatih Kar, Mattia Rigotti, Amir Zamir
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | First, through theoretical and empirical analyses, we show that diversification methods are sensitive to the distribution of the unlabeled data (Fig. 1(a) vs. 1(b)). Specifically, each diversification method works best for different distributions of unlabeled data, and the performance drops significantly (up to 30% absolute accuracy) when diverging from the optimal distribution. |
| Researcher Affiliation | Collaboration | Harold Benoit (1,2), Liangze Jiang (1), Andrei Atanov (1), Oguzhan Fatih Kar (1), Mattia Rigotti (2), Amir Zamir (1); (1) Swiss Federal Institute of Technology (EPFL), (2) IBM Research |
| Pseudocode | No | The paper provides mathematical formulations and descriptions of methods but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Finally, we provide the anonymized source code for the experiments performed in the paper. |
| Open Datasets | Yes | In standard experiments (classification on Waterbirds and Office-Home datasets), using the second-best choice leads to an up to 20% absolute drop in accuracy. Specifically, we use M/C (Shah et al., 2020) and M/F (Pagliardini et al., 2023), which are datasets that concatenate one image from MNIST with one image from either CIFAR-10 (Krizhevsky & Hinton, 2009) or Fashion-MNIST (Xiao et al., 2017). We further show results on a large-scale real-world dataset, namely CelebA-CC (Liu et al., 2015; Lee et al., 2023). (A dataset-construction sketch follows the table.) |
| Dataset Splits | Yes | Unless otherwise specified, all train, validation, and test splits are taken as provided by Pagliardini et al. (2023), Lee et al. (2023), or WILDS. The best models are selected according to validation accuracy. (A split-loading and model-selection sketch follows the table.) |
| Hardware Specification | Yes | Each experiment can be run on a single A100 40GB GPU. |
| Software Dependencies | No | All results from DivDis (Lee et al., 2023) and D-BAT (Pagliardini et al., 2023) are obtained using their respective published source code, ensuring a faithful representation of their methods. The paper mentions using specific codebases and libraries but does not provide version numbers for software dependencies such as PyTorch, Python, or CUDA. |
| Experiment Setup | Yes | For Waterbirds variants, the optimizer is SGD, the number of epochs is 100, the learning rate is 0.001, and the weight decay is 0.0001. The α parameter (referred to as λ in DivDis) was tuned over {0.1, 1, 10}. For Office-Home, the optimizer is SGD, the number of epochs is 50, the learning rate is 0.001, the weight decay is 0.0001, and the α parameter was tuned over {0.1, 1, 10}. (A training-configuration sketch with these values follows the table.) |
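
The Open Datasets row references the M/C and M/F datasets, which concatenate an MNIST image with a CIFAR-10 or Fashion-MNIST image. Below is a minimal sketch of how such a paired dataset could be built with torchvision, assuming vertical concatenation of a resized MNIST image with a CIFAR-10 image and a parity-based binary label; the exact preprocessing and label construction in Shah et al. (2020) and in the paper's released code may differ.

```python
# Hypothetical sketch of an M/C-style dataset: each sample stacks an MNIST image
# on top of a CIFAR-10 image. Resizing, channel handling, and labels are assumptions.
import torch
from torch.utils.data import Dataset
from torchvision import datasets, transforms


class MNISTCIFARConcat(Dataset):
    """Pairs each MNIST digit with a CIFAR-10 image and stacks them vertically."""

    def __init__(self, root="./data", train=True):
        to_3ch_32 = transforms.Compose([
            transforms.Resize(32),
            transforms.Grayscale(num_output_channels=3),
            transforms.ToTensor(),
        ])
        self.mnist = datasets.MNIST(root, train=train, download=True, transform=to_3ch_32)
        self.cifar = datasets.CIFAR10(root, train=train, download=True,
                                      transform=transforms.ToTensor())
        self.length = min(len(self.mnist), len(self.cifar))

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        m_img, m_label = self.mnist[idx]
        c_img, _ = self.cifar[idx]
        # Concatenate along the height dimension -> a 3 x 64 x 32 image.
        x = torch.cat([m_img, c_img], dim=1)
        # Parity of the MNIST digit as a stand-in binary label; the actual
        # label construction follows Shah et al. (2020).
        y = m_label % 2
        return x, y
```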
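The Dataset Splits row states that splits are taken as provided (e.g., from WILDS) and that the best models are selected by validation accuracy. The sketch below shows one plausible way to load the Waterbirds splits with the `wilds` package and pick the best candidate on the validation set; the `accuracy` and `select_best` helpers and the candidate-model list are illustrative placeholders, not the authors' pipeline.

```python
# Hedged sketch: WILDS-provided Waterbirds splits plus model selection by
# validation accuracy. Transforms and batch size are assumptions.
import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from wilds import get_dataset

transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
dataset = get_dataset(dataset="waterbirds", download=True, root_dir="./data")
val_loader = DataLoader(dataset.get_subset("val", transform=transform), batch_size=128)


def accuracy(model, loader, device="cuda"):
    """Plain classification accuracy; WILDS batches are (x, y, metadata) tuples."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for x, y, _ in loader:
            preds = model(x.to(device)).argmax(dim=1)
            correct += (preds == y.to(device)).sum().item()
            total += y.numel()
    return correct / total


def select_best(candidate_models, loader):
    """Return the candidate with the highest validation accuracy."""
    return max(candidate_models, key=lambda m: accuracy(m, loader))
```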
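The Experiment Setup row lists the Waterbirds hyperparameters (SGD, 100 epochs, learning rate 0.001, weight decay 0.0001, α tuned over {0.1, 1, 10}). A minimal PyTorch training-loop sketch using those values follows; the model, the data loaders, and the diversification term itself are placeholders standing in for the DivDis/D-BAT objectives, not the authors' implementation.

```python
# Minimal sketch of the reported Waterbirds training configuration.
# `div_loss` is a placeholder for the method-specific disagreement objective
# computed on unlabeled data (e.g., DivDis or D-BAT); it is not implemented here.
import torch

ALPHA_GRID = [0.1, 1.0, 10.0]  # weight on the diversification term (lambda in DivDis)


def train_one_config(model, train_loader, unlabeled_loader, alpha,
                     epochs=100, device="cuda"):
    model = model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=1e-4)
    ce = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for (x, y), (x_u, _) in zip(train_loader, unlabeled_loader):
            x, y, x_u = x.to(device), y.to(device), x_u.to(device)
            optimizer.zero_grad()
            task_loss = ce(model(x), y)
            # Placeholder: the diversification/disagreement term on unlabeled data.
            div_loss = torch.tensor(0.0, device=device)
            (task_loss + alpha * div_loss).backward()
            optimizer.step()
    return model
```

One configuration would be trained per value in `ALPHA_GRID`, and the final model chosen by validation accuracy as described in the Dataset Splits row.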