Optimal Representations for Covariate Shift

Authors: Yangjun Ruan, Yann Dubois, Chris J. Maddison

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "6 EXPERIMENTS: In our experiments, we aimed to: (i) verify our theoretical results in practice; (ii) investigate our proposed representation learning objectives in practical DG; (iii) take advantage of pretrained SSL models (in particular, CLIP) to achieve powerful models for DG. Unless stated otherwise, we consider a two-stage training setup. First, the representation learner (the representor) trains an encoder p(Z|X) using a specified objective and freezes it. Then, the person performing predictions (the learner) trains her predictor h from Z by minimizing the risk on source data. Finally, the representation Z and predictor h are evaluated on target data." (a minimal sketch of this two-stage protocol appears after the table)
Researcher Affiliation | Academia | "Yangjun Ruan 1,2, Yann Dubois 2, Chris J. Maddison 1,2; 1 University of Toronto, 2 Vector Institute; {yjruan,yanndubois,cmaddis}@cs.toronto.edu"
Pseudocode | Yes | "Algorithm 1: CAD objective" (an illustrative contrastive-domain sketch appears after the table)
Open Source Code | Yes | "Our implementation is released at https://github.com/ryoungj/optdom."
Open Datasets | Yes | "We used non-MNIST datasets on DomainBed that were non-synthetic, including VLCS (Fang et al., 2013), PACS (Li et al., 2017), OfficeHome (Venkateswara et al., 2017), Terra Incognita (Beery et al., 2018), and DomainNet (Peng et al., 2019)." and "To do so we used LAION-400M (Schuhmann et al., 2021) that is a public dataset that contains 400M web-crawled image-text pairs."
Dataset Splits | Yes | "Thus, we randomly split the PACS dataset to 80% training and 20% validation splits for each domain. The training splits were used to train both the encoder and the source predictor, and the validation splits were used for encoder and source predictor selection as well as evaluation on target domains." (a per-domain split sketch appears after the table)
Hardware Specification | No | The paper mentions a 'computational budget' and 'sufficient compute' but does not provide specific details on the hardware used for experiments (e.g., GPU models, CPU types, or cloud instance specifications).
Software Dependencies | No | The paper mentions software tools and optimizers such as the 'Adam optimizer (Kingma & Ba, 2014)' and an 'SVM classifier', but does not specify version numbers for any software dependencies required to reproduce the experiments.
Experiment Setup | Yes | "The ResNet-18 encoder was trained for 300 epochs without any regularization, using the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 5e-5, a batch size of 192 (48 for each domain), and a cosine learning rate decay schedule." and "Learning rate: discrete set {1e-4, 3e-4, 1e-3, 3e-3}; Batch size: discrete set {128, 256, 512} for DomainNet and OfficeHome, and {64, 128, 256} for other datasets; MLP dropout: discrete set {0., 0.1, 0.5}" (a configuration sketch appears after the table)
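
The two-stage protocol quoted under Research Type can be mirrored with a short training loop. The sketch below is illustrative only and is not the authors' released implementation: the encoder and predictor modules, the data loaders yielding (x, y, domain) triples, the epoch counts, and the `bottleneck_loss` callable are all assumptions.

```python
import torch
import torch.nn as nn

def two_stage_train(encoder, bottleneck_loss, predictor, src_loader, tgt_loader, epochs=1):
    # Stage 1: the "representor" trains the encoder with its own objective, then freezes it.
    enc_opt = torch.optim.Adam(encoder.parameters(), lr=5e-5)
    for _ in range(epochs):
        for x, y, d in src_loader:              # d = domain index of each source example
            loss = bottleneck_loss(encoder(x), y, d)
            enc_opt.zero_grad(); loss.backward(); enc_opt.step()
    for p in encoder.parameters():
        p.requires_grad_(False)
    encoder.eval()

    # Stage 2: the "learner" fits a predictor h on frozen source representations
    # by minimizing the source risk (cross-entropy here).
    pred_opt = torch.optim.Adam(predictor.parameters(), lr=1e-3)
    ce = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y, _ in src_loader:
            with torch.no_grad():
                z = encoder(x)
            loss = ce(predictor(z), y)
            pred_opt.zero_grad(); loss.backward(); pred_opt.step()

    # Stage 3: evaluate the frozen representation Z and predictor h on target-domain data.
    correct = total = 0
    with torch.no_grad():
        for x, y, _ in tgt_loader:
            correct += (predictor(encoder(x)).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / max(total, 1)
```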
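The CAD objective itself is specified as Algorithm 1 in the paper and in the released repository; the snippet below is only a generic contrastive domain-discrimination term in that spirit, not a transcription of Algorithm 1. The temperature, the cosine-similarity scoring, and the choice of same-domain examples as positives are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def contrastive_domain_term(z, d, tau=0.1):
    """Batch-wise contrastive domain term (illustrative, not Algorithm 1 verbatim).

    A non-parametric 'domain classifier' scores, for each anchor, how likely the
    other batch elements are to share its domain via a softmax over cosine
    similarities. Minimizing the log-probability mass placed on same-domain
    examples discourages the encoder from representing domain identity.
    Assumes each anchor has at least one other same-domain example in the batch
    (e.g. with per-domain batching).
    """
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                                   # pairwise similarities
    n = z.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=z.device)
    log_p = F.log_softmax(sim.masked_fill(eye, float('-inf')), dim=1)
    same_dom = (d.unsqueeze(0) == d.unsqueeze(1)) & ~eye    # positives: same domain
    pos_logp = torch.logsumexp(log_p.masked_fill(~same_dom, float('-inf')), dim=1)
    return pos_logp.mean()
```

In a training loop such a term would typically be added to the source classification loss with a trade-off weight, e.g. `loss = ce_loss + lam * contrastive_domain_term(z, d)`; the weighting scheme is likewise an assumption here.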
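The 80%/20% per-domain split described under Dataset Splits takes only a few lines to reproduce; the sketch below assumes a list of per-domain torch datasets and a fixed seed, neither of which is specified in the quoted text.

```python
import torch
from torch.utils.data import Subset

def split_per_domain(domain_datasets, train_frac=0.8, seed=0):
    """Randomly split each domain's dataset into train/validation subsets."""
    g = torch.Generator().manual_seed(seed)
    train_sets, val_sets = [], []
    for ds in domain_datasets:
        perm = torch.randperm(len(ds), generator=g).tolist()
        n_train = int(train_frac * len(ds))
        train_sets.append(Subset(ds, perm[:n_train]))
        val_sets.append(Subset(ds, perm[n_train:]))
    return train_sets, val_sets
```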
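Finally, the quoted encoder-training setup and hyperparameter search space translate directly into configuration code. The sketch below is an assumption-laden paraphrase: the quote fixes the optimizer, learning rate, epoch count, and the use of a cosine decay schedule, but the scheduler granularity (per-step vs. per-epoch) and `steps_per_epoch` are placeholders.

```python
import torch

def make_encoder_optimizer(encoder, epochs=300, steps_per_epoch=100):
    # Adam with lr 5e-5 and cosine learning-rate decay, as quoted under Experiment Setup.
    opt = torch.optim.Adam(encoder.parameters(), lr=5e-5)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs * steps_per_epoch)
    return opt, sched

# Hyperparameter search space quoted under Experiment Setup.
SEARCH_SPACE = {
    "lr": [1e-4, 3e-4, 1e-3, 3e-3],
    "batch_size": {
        "DomainNet": [128, 256, 512],
        "OfficeHome": [128, 256, 512],
        "default": [64, 128, 256],   # all other datasets
    },
    "mlp_dropout": [0.0, 0.1, 0.5],
}
```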