How robust is unsupervised representation learning to distribution shift?

Authors: Yuge Shi, Imant Daunhawer, Julia E. Vogt, Philip Torr, Amartya Sanyal

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We verify this by extensively evaluating the performance of SSL and AE on both synthetic and realistic distribution shift datasets."
Researcher Affiliation | Academia | Yuge Shi, Department of Engineering Science, University of Oxford; Imant Daunhawer & Julia E. Vogt, Department of Computer Science, ETH Zurich; Philip H.S. Torr, Department of Engineering Science, University of Oxford; Amartya Sanyal, Department of Computer Science & ETH AI Center, ETH Zurich
Pseudocode | No | The paper provides architectural descriptions in tables but does not include any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm'.
Open Source Code | No | The paper states "Our code is developed on the amazing solo-learn code base (da Costa et al., 2022)...", which refers to building on an existing library, but it provides no statement or link announcing the release of the authors' own source code for the described methodology.
Open Datasets | Yes | "We evaluate our models on two synthetic datasets, namely MNIST-CIFAR (Shah et al. (2020); see section 3.1.1) and CdSprites (Shi et al. (2022); see section 3.1.2), as well as two realistic datasets from WILDS (Koh et al., 2021): Camelyon17 and FMoW (see section 3.2.1)."
Dataset Splits | Yes | "ID train, test: Contains 90% and 10% of the original ID train split, respectively; OOD train, test: Contains 10% and 90% of the original OOD test split, respectively; OOD validation: Same as the original OOD validation split. Following WILDS, we use the OOD validation set to perform early stopping and choose hyperparameters." (A data-loading and split-construction sketch follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using the 'solo-learn code base (da Costa et al., 2022)' but does not provide specific version numbers for this or other critical software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | "We perform a hyperparameter search on learning rate, scheduler, optimiser, representation size, etc. for each model. For hyperparameters including batch size, max epoch and model selection criteria, we follow the same protocol as in WILDS (Koh et al., 2021): for Camelyon17 we use a batch size of 32, train all models for 10 epochs and select the model that results in the highest accuracy on the validation set; for FMoW the batch size is 32, the max epoch is 60 and the model selection criterion is worst-group accuracy on the OOD validation set." (A configuration sketch follows the table.)
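
To make the "Open Datasets" and "Dataset Splits" rows concrete, here is a minimal sketch of how the two WILDS datasets and the described ID/OOD splits could be constructed with the standard `wilds` and PyTorch packages. This is an illustration based on the quoted descriptions, not the authors' released code; the use of `random_split` and the fixed seed are assumptions.

```python
# Minimal sketch (not the authors' released code): load a WILDS dataset and
# build the ID/OOD splits described in the "Dataset Splits" row.
import torch
from torch.utils.data import random_split
from wilds import get_dataset

dataset = get_dataset(dataset="camelyon17", download=True)  # or dataset="fmow"

id_train_full = dataset.get_subset("train")  # original ID train split
ood_test_full = dataset.get_subset("test")   # original OOD test split
ood_val = dataset.get_subset("val")          # original OOD validation split, used as-is

# Fixed seed is an assumption; the paper does not say how the splits are drawn.
generator = torch.Generator().manual_seed(0)

# ID train / ID test: 90% / 10% of the original ID train split.
n_id = len(id_train_full)
id_train, id_test = random_split(
    id_train_full, [int(0.9 * n_id), n_id - int(0.9 * n_id)], generator=generator
)

# OOD train / OOD test: 10% / 90% of the original OOD test split.
n_ood = len(ood_test_full)
ood_train, ood_test = random_split(
    ood_test_full, [int(0.1 * n_ood), n_ood - int(0.1 * n_ood)], generator=generator
)
```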
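
The training protocol quoted in the "Experiment Setup" row can likewise be summarised as a plain configuration dictionary. The key names are hypothetical; only the batch sizes, epoch counts, and selection criteria come from the quoted text.

```python
# Illustrative configuration for the WILDS protocol quoted above;
# key names are hypothetical, values are taken from the quoted text.
WILDS_PROTOCOL = {
    "camelyon17": {
        "batch_size": 32,
        "max_epochs": 10,
        # select the checkpoint with the highest accuracy on the (OOD) validation set
        "model_selection": "val_accuracy",
    },
    "fmow": {
        "batch_size": 32,
        "max_epochs": 60,
        # select the checkpoint with the best worst-group accuracy on the OOD validation set
        "model_selection": "ood_val_worst_group_accuracy",
    },
}
```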