How robust is unsupervised representation learning to distribution shift?
Authors: Yuge Shi, Imant Daunhawer, Julia E. Vogt, Philip Torr, Amartya Sanyal
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify this by extensively evaluating the performance of SSL and AE on both synthetic and realistic distribution shift datasets. |
| Researcher Affiliation | Academia | Yuge Shi, Department of Engineering Science, University of Oxford; Imant Daunhawer & Julia E. Vogt, Department of Computer Science, ETH Zurich; Philip H.S. Torr, Department of Engineering Science, University of Oxford; Amartya Sanyal, Department of Computer Science & ETH AI Center, ETH Zurich |
| Pseudocode | No | The paper provides architectural descriptions in tables but does not include any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | The paper states 'Our code is developed on the amazing solo-learn code base (da Costa et al., 2022)...' which refers to using an existing library, but does not provide any statement or link for the release of their own source code for the described methodology. |
| Open Datasets | Yes | We evaluate our models on two synthetic datasets, namely MNIST-CIFAR (Shah et al. (2020); see section 3.1.1) and CdSprites (Shi et al. (2022); see section 3.1.2), as well as two realistic datasets from WILDS (Koh et al., 2021): Camelyon17 and FMoW (see section 3.2.1). |
| Dataset Splits | Yes | ID train, test: contains 90% and 10% of the original ID train split, respectively; OOD train, test: contains 10% and 90% of the original OOD test split, respectively; OOD validation: same as the original OOD validation split. Following WILDS, we use the OOD validation set to perform early stopping and choose hyperparameters. (A minimal sketch of this split protocol appears after the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'solo-learn code base (da Costa et al., 2022)' but does not provide specific version numbers for this or other critical software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | We perform a hyperparameter search on learning rate, scheduler, optimiser, representation size, etc. for each model. For hyperparameters including batch size, max epochs, and model selection criteria, we follow the same protocol as in WILDS (Koh et al., 2021): for Camelyon17 we use a batch size of 32, train all models for 10 epochs, and select the model that achieves the highest accuracy on the validation set; for FMoW the batch size is 32, the max epoch count is 60, and the model selection criterion is worst-group accuracy on the OOD validation set. (A configuration sketch appears after the table.) |
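
The split protocol quoted in the Dataset Splits row is concrete enough to express directly. Below is a minimal sketch, assuming the original WILDS ID-train and OOD-test splits are available as map-style PyTorch datasets; the helper name `make_report_splits` is hypothetical, since the authors' own code was not released.

```python
# Hypothetical sketch of the paper's split protocol (the authors' code is
# not released). Assumes `id_train` and `ood_test` are map-style PyTorch
# datasets holding the original WILDS ID train and OOD test splits.
import torch
from torch.utils.data import Dataset, random_split

def make_report_splits(id_train: Dataset, ood_test: Dataset, seed: int = 0):
    gen = torch.Generator().manual_seed(seed)  # fixed seed so splits are reproducible
    n_id, n_ood = len(id_train), len(ood_test)
    # ID train / ID test: 90% and 10% of the original ID train split.
    id_tr, id_te = random_split(
        id_train, [round(0.9 * n_id), n_id - round(0.9 * n_id)], generator=gen
    )
    # OOD train / OOD test: 10% and 90% of the original OOD test split.
    ood_tr, ood_te = random_split(
        ood_test, [round(0.1 * n_ood), n_ood - round(0.1 * n_ood)], generator=gen
    )
    # The original OOD validation split is used unchanged, for early stopping
    # and hyperparameter selection.
    return id_tr, id_te, ood_tr, ood_te
```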
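Similarly, the per-dataset training protocol quoted in the Experiment Setup row can be summarised as a small configuration mapping. The dictionary below is illustrative only; the key names and metric identifiers are assumptions, not taken from the paper or the solo-learn code base.

```python
# Illustrative summary of the WILDS-style protocol described above; key
# names are assumptions, not identifiers from the (unreleased) source code.
WILDS_PROTOCOL = {
    "camelyon17": {
        "batch_size": 32,
        "max_epochs": 10,
        # Select the checkpoint with the highest accuracy on the validation set.
        "selection_metric": "val_accuracy",
    },
    "fmow": {
        "batch_size": 32,
        "max_epochs": 60,
        # Select the checkpoint with the best worst-group accuracy on the
        # OOD validation set.
        "selection_metric": "ood_val_worst_group_accuracy",
    },
}
```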