Unraveling the Key Components of OOD Generalization via Diversification
Authors: Harold Luc Benoit, Liangze Jiang, Andrei Atanov, Oguzhan Fatih Kar, Mattia Rigotti, Amir Zamir
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | First, through theoretical and empirical analyses, we show that diversification methods are sensitive to the distribution of the unlabeled data (Fig. 1(a) vs. 1(b)). Specifically, each diversification method works best for different distributions of unlabeled data, and the performance drops significantly (up to 30% absolute accuracy) when diverging from the optimal distribution. |
| Researcher Affiliation | Collaboration | Harold Benoit (1,2), Liangze Jiang (1), Andrei Atanov (1), Oguzhan Fatih Kar (1), Mattia Rigotti (2), Amir Zamir (1); (1) Swiss Federal Institute of Technology (EPFL), (2) IBM Research |
| Pseudocode | No | The paper provides mathematical formulations and descriptions of methods but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Finally, we provide the anonymized source code for the experiments performed in the paper. |
| Open Datasets | Yes | In standard experiments (classification on Waterbirds and Office-Home datasets), using the second-best choice leads to an up to 20% absolute drop in accuracy. Specifically, we use M/C (Shah et al., 2020) and M/F (Pagliardini et al., 2023), which are datasets that concatenate one image from MNIST with one image from either CIFAR-10 (Krizhevsky & Hinton, 2009) or Fashion-MNIST (Xiao et al., 2017). We further show results on a large-scale real-world dataset, namely CelebA-CC (Liu et al., 2015; Lee et al., 2023). (A dataset-construction sketch follows the table.) |
| Dataset Splits | Yes | Unless otherwise specified, all train, validation, and test splits are taken as provided by Pagliardini et al. (2023), Lee et al. (2023), or WILDS. The best models are selected according to validation accuracy. (A split-loading and model-selection sketch follows the table.) |
| Hardware Specification | Yes | Each experiment can be run on a single A100 40GB GPU. |
| Software Dependencies | No | All results from DivDis (Lee et al., 2023) and D-BAT (Pagliardini et al., 2023) are obtained using their respective published source code, ensuring a faithful representation of their methods. The paper mentions using specific codebases and libraries but does not provide version numbers for software dependencies such as PyTorch, Python, or CUDA. |
| Experiment Setup | Yes | For Waterbirds variants, the optimizer is SGD, the number of epochs is 100, the learning rate is 0.001, and the weight decay is 0.0001. The α parameter (referred to as λ in DivDis) was tuned over {0.1, 1, 10}. For Office-Home, the optimizer is SGD, the number of epochs is 50, the learning rate is 0.001, the weight decay is 0.0001, and the α parameter was tuned over {0.1, 1, 10}. (A training-configuration sketch with these values follows the table.) |
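
The Open Datasets row references the M/C and M/F datasets, which concatenate an MNIST image with a CIFAR-10 or Fashion-MNIST image. Below is a minimal sketch of how such a paired dataset could be built with torchvision, assuming vertical concatenation of a resized MNIST image with a CIFAR-10 image and a parity-based binary label; the exact preprocessing and label construction in Shah et al. (2020) and in the paper's released code may differ.

```python
# Hypothetical sketch of an M/C-style dataset: each sample stacks an MNIST image
# on top of a CIFAR-10 image. Resizing, channel handling, and labels are assumptions.
import torch
from torch.utils.data import Dataset
from torchvision import datasets, transforms


class MNISTCIFARConcat(Dataset):
    """Pairs each MNIST digit with a CIFAR-10 image and stacks them vertically."""

    def __init__(self, root="./data", train=True):
        to_3ch_32 = transforms.Compose([
            transforms.Resize(32),
            transforms.Grayscale(num_output_channels=3),
            transforms.ToTensor(),
        ])
        self.mnist = datasets.MNIST(root, train=train, download=True, transform=to_3ch_32)
        self.cifar = datasets.CIFAR10(root, train=train, download=True,
                                      transform=transforms.ToTensor())
        self.length = min(len(self.mnist), len(self.cifar))

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        m_img, m_label = self.mnist[idx]
        c_img, _ = self.cifar[idx]
        # Concatenate along the height dimension -> a 3 x 64 x 32 image.
        x = torch.cat([m_img, c_img], dim=1)
        # Parity of the MNIST digit as a stand-in binary label; the actual
        # label construction follows Shah et al. (2020).
        y = m_label % 2
        return x, y
```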
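The Dataset Splits row states that splits are taken as provided (e.g., from WILDS) and that the best models are selected by validation accuracy. The sketch below shows one plausible way to load the Waterbirds splits with the `wilds` package and pick the best candidate on the validation set; the `accuracy` and `select_best` helpers and the candidate-model list are illustrative placeholders, not the authors' pipeline.

```python
# Hedged sketch: WILDS-provided Waterbirds splits plus model selection by
# validation accuracy. Transforms and batch size are assumptions.
import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from wilds import get_dataset

transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
dataset = get_dataset(dataset="waterbirds", download=True, root_dir="./data")
val_loader = DataLoader(dataset.get_subset("val", transform=transform), batch_size=128)


def accuracy(model, loader, device="cuda"):
    """Plain classification accuracy; WILDS batches are (x, y, metadata) tuples."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for x, y, _ in loader:
            preds = model(x.to(device)).argmax(dim=1)
            correct += (preds == y.to(device)).sum().item()
            total += y.numel()
    return correct / total


def select_best(candidate_models, loader):
    """Return the candidate with the highest validation accuracy."""
    return max(candidate_models, key=lambda m: accuracy(m, loader))
```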
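The Experiment Setup row lists the Waterbirds hyperparameters (SGD, 100 epochs, learning rate 0.001, weight decay 0.0001, α tuned over {0.1, 1, 10}). A minimal PyTorch training-loop sketch using those values follows; the model, the data loaders, and the diversification term itself are placeholders standing in for the DivDis/D-BAT objectives, not the authors' implementation.

```python
# Minimal sketch of the reported Waterbirds training configuration.
# `div_loss` is a placeholder for the method-specific disagreement objective
# computed on unlabeled data (e.g., DivDis or D-BAT); it is not implemented here.
import torch

ALPHA_GRID = [0.1, 1.0, 10.0]  # weight on the diversification term (lambda in DivDis)


def train_one_config(model, train_loader, unlabeled_loader, alpha,
                     epochs=100, device="cuda"):
    model = model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=1e-4)
    ce = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for (x, y), (x_u, _) in zip(train_loader, unlabeled_loader):
            x, y, x_u = x.to(device), y.to(device), x_u.to(device)
            optimizer.zero_grad()
            task_loss = ce(model(x), y)
            # Placeholder: the diversification/disagreement term on unlabeled data.
            div_loss = torch.tensor(0.0, device=device)
            (task_loss + alpha * div_loss).backward()
            optimizer.step()
    return model
```

One configuration would be trained per value in `ALPHA_GRID`, and the final model chosen by validation accuracy as described in the Dataset Splits row.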