Improving Out-of-Distribution Robustness via Selective Augmentation

Authors: Huaxiu Yao, Yu Wang, Sai Li, Linjun Zhang, Weixin Liang, James Zou, Chelsea Finn

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct comprehensive experiments to evaluate the effectiveness of LISA. Specifically, we aim to answer the following questions: Q1: Compared to prior methods, can LISA improve robustness to subpopulation shifts and domain shifts (Section 4.1 and Section 4.2)? Q2: Which aspects of LISA are the most important for improving robustness (Section 4.3)? Q3: Does LISA successfully produce more invariant predictors (Section 4.4)? Q4: How does LISA perform with varying degrees of distribution shifts (Section 4.5)?
Researcher Affiliation | Academia | 1Stanford University, CA, USA 2University of California San Diego, CA, USA 3Renmin University of China, Beijing, China 4Rutgers University, NJ, USA.
Pseudocode | Yes | Algorithm 1 Training Procedure of LISA
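The paper's Algorithm 1 itself is not reproduced in this row, but the selective-augmentation idea it describes (pairing samples for mixup either by same label across domains, or by same domain across labels, chosen with probability psel) can be sketched as follows. This is a hedged illustration only; the function names `mixup`, `lisa_pair`, and the argument `p_sel` are our own, not the paper's identifiers.

```python
import numpy as np

def mixup(x1, x2, alpha=2.0):
    """Standard mixup: interpolate two inputs with a Beta-sampled weight."""
    lam = np.random.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2

def lisa_pair(x, y, d, i, p_sel=0.5, rng=None):
    """Pick a mixup partner for sample i using LISA's two strategies:
    with probability p_sel, a sample with the SAME label but a different
    domain (intra-label); otherwise a sample from the SAME domain with a
    different label (intra-domain). Returns the partner's index."""
    rng = rng or np.random.default_rng()
    if rng.random() < p_sel:
        candidates = np.where((y == y[i]) & (d != d[i]))[0]
    else:
        candidates = np.where((d == d[i]) & (y != y[i]))[0]
    if len(candidates) == 0:  # no valid partner: fall back to any other sample
        candidates = np.delete(np.arange(len(y)), i)
    return rng.choice(candidates)
```

In a training loop, the selected pair `(x[i], x[j])` would be fed to `mixup` and the interpolated input trained against the (shared or interpolated) label, per the paper's Algorithm 1.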
Open Source Code | Yes | Code is released in https://github.com/huaxiuyao/LISA
Open Datasets | Yes | We classify MNIST digits from 2 classes... The data sizes of train, validation, and test sets are 30000, 10000, and 20000, respectively. Following (Arjovsky et al., 2019), we flip labels with probability 0.25.
Dataset Splits | Yes | The data sizes of train, validation, and test sets are 30000, 10000, and 20000, respectively.
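The label-noise step quoted above (flip each label with probability 0.25, following Arjovsky et al., 2019) is a one-liner in practice. A minimal sketch for binary labels, assuming labels in {0, 1}; the function name `flip_labels` is ours:

```python
import numpy as np

def flip_labels(y, p=0.25, rng=None):
    """Flip each binary label independently with probability p,
    the CMNIST-style label noise used by Arjovsky et al. (2019)."""
    rng = rng or np.random.default_rng(0)
    flip = rng.random(len(y)) < p   # Bernoulli(p) mask per example
    return np.where(flip, 1 - y, y)
```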
Hardware Specification | No | The paper mentions using 'pre-trained ResNet-50' and 'DistilBERT-uncased' as models but does not specify any hardware details like GPU models, CPU, or memory used for training or inference.
Software Dependencies | No | The paper mentions using 'pre-trained ResNet-50' and 'DistilBERT' architectures and optimizers like 'SGD' and 'Adam', but it does not provide specific version numbers for any software libraries or dependencies (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | All hyperparameters are selected via cross-validation and are listed in Table 9. Table 9: Hyperparameter settings for the subpopulation shifts. Learning rate, Weight decay, Scheduler, Batch size, Type of mixup, Architecture, Optimizer, Maximum Epoch, Strategy sel. prob. psel. Table 12: Hyperparameter settings for the domain shifts. Learning rate, Weight decay, Scheduler, Batch size, Type of mixup, Architecture, Optimizer, Maximum Epoch, Strategy sel. prob. psel.
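The row states that all hyperparameters were selected via cross-validation over the fields listed in Tables 9 and 12. A minimal sketch of validation-based grid selection over such fields; the grid values below are hypothetical placeholders (the actual settings are in the paper's tables), and `evaluate` stands in for a train-and-score routine:

```python
from itertools import product

# Hypothetical grid mirroring the fields listed in Tables 9 and 12;
# these values are illustrative, not the paper's.
grid = {
    "lr": [1e-4, 1e-3],
    "weight_decay": [0.0, 1e-4],
    "batch_size": [16, 32],
    "p_sel": [0.5, 1.0],
}

def select_hyperparameters(evaluate, grid):
    """Return the config with the best validation score, where
    evaluate(config) -> float trains a model and scores it on
    the validation split."""
    best_cfg, best_score = None, float("-inf")
    for values in product(*grid.values()):
        cfg = dict(zip(grid.keys(), values))
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg
```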