Pareto Invariant Risk Minimization: Towards Mitigating the Optimization Dilemma in Out-of-Distribution Generalization

Authors: Yongqiang Chen, Kaiwen Zhou, Yatao Bian, Binghui Xie, Bingzhe Wu, Yonggang Zhang, Kaili Ma, Han Yang, Peilin Zhao, Bo Han, James Cheng

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on challenging benchmarks (WILDS) show that PAIR alleviates the compromises and yields top OOD performance.
Researcher Affiliation | Collaboration | Yongqiang Chen1, Kaiwen Zhou1, Binghui Xie1, Han Yang1, Kaili Ma1, James Cheng1 (1The Chinese University of Hong Kong, {yqchen,kwzhou,bhxie21,hyang,klma,jcheng}@cse.cuhk.edu.hk); Yatao Bian2, Bingzhe Wu2, Peilin Zhao2 (2Tencent AI Lab, {yatao.bian,wubingzheagent}@gmail.com, masonzhao@tencent.com); Yonggang Zhang3, Bo Han3 (3Hong Kong Baptist University, {csygzhang,bhanml}@comp.hkbu.edu.hk)
Pseudocode | Yes | Algorithm 1: Pseudo-code for PAIR-o.
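For orientation, here is a minimal sketch of what a PAIR-o style update could look like, assuming a simplified fixed-preference scalarization of the ERM, IRMv1, and VREx objectives. The names (`pair_o_step`, `irmv1_penalty`, `preference`) and the default weights are illustrative; the paper's Algorithm 1 instead solves a preference-guided multi-objective subproblem to set the weights adaptively, so this is not the authors' solver.

```python
# Minimal sketch of a PAIR-o style update, assuming a simplified
# fixed-preference scalarization of the ERM + IRMv1 + VREx objectives.
# Algorithm 1 in the paper adapts the objective weights via a
# multi-objective subproblem; treat this only as an illustration of
# which objectives are being combined.
import torch
import torch.nn.functional as F

def irmv1_penalty(logits, y):
    # IRMv1 penalty: squared gradient of the risk w.r.t. a dummy scale.
    scale = torch.tensor(1.0, requires_grad=True, device=logits.device)
    risk = F.cross_entropy(logits * scale, y)
    grad = torch.autograd.grad(risk, [scale], create_graph=True)[0]
    return grad.pow(2).sum()

def pair_o_step(model, optimizer, envs, preference=(1.0, 1e8, 1e4)):
    """One update over a list of per-environment (x, y) batches.

    preference: hypothetical relative priorities for the
    (ERM, IRMv1, VREx) objectives; not values from the paper.
    """
    risks, penalties = [], []
    for x, y in envs:
        logits = model(x)
        risks.append(F.cross_entropy(logits, y))
        penalties.append(irmv1_penalty(logits, y))
    erm = torch.stack(risks).mean()
    irm = torch.stack(penalties).mean()
    vrex = torch.stack(risks).var()  # variance of risks across environments
    w_erm, w_irm, w_vrex = preference
    loss = w_erm * erm + w_irm * irm + w_vrex * vrex
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```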
Open Source Code | Yes | Code is available at https://github.com/LFhase/PAIR.
Open Datasets | Yes | We select 6 challenging datasets from the WILDS (Koh et al., 2021) benchmark to evaluate PAIR-o's performance under realistic distribution shifts. The datasets cover domain distribution shifts, subpopulation shifts, and their mixture. A summary of the basic information and statistics of the WILDS datasets can be found in Table 8 and Table 9, respectively.
Dataset Splits | Yes | By default, we repeat the experiments for 3 runs with random seeds 0, 1, 2. For Camelyon17, we follow the official guide and repeat 10 times with random seeds 0 to 9, and for PovertyMap we repeat the experiments 5 times with random seeds 0 to 4. Specifically, to construct the validation set, the data from each domain is first split into 80% (for training and evaluation) and 20% (for validation and model selection).
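The per-domain 80/20 protocol quoted above is simple to reproduce; below is a short illustrative sketch. The helper `split_domain` and the index source are hypothetical, since the WILDS package distributes its own official splits.

```python
# Illustrative per-domain 80/20 split with a fixed seed, matching the
# protocol quoted above. `split_domain` is a hypothetical helper; the
# WILDS package ships official splits that the paper defers to.
import random

def split_domain(indices, seed, val_frac=0.2):
    rng = random.Random(seed)
    idx = list(indices)
    rng.shuffle(idx)
    cut = int(len(idx) * (1 - val_frac))
    return idx[:cut], idx[cut:]  # (train/eval indices, validation indices)

# Default protocol: 3 runs with seeds 0, 1, 2
# (10 runs for Camelyon17, 5 runs for PovertyMap).
for seed in (0, 1, 2):
    train_idx, val_idx = split_domain(range(1000), seed)
```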
Hardware Specification | Yes | Specifically, we run COLOREDMNIST experiments on Linux servers with NVIDIA RTX 3090Ti graphics cards with CUDA 11.3, a 40-core Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz, 256 GB of memory, and Ubuntu 18.04 LTS installed. For WILDS and DOMAINBED experiments, we run on Linux servers with NVIDIA V100 graphics cards with CUDA 10.2.
Software Dependencies | Yes | We implement our methods with PyTorch (Paszke et al., 2019). For the software and hardware configurations, we ensure consistent environments for each dataset. Specifically, we run COLOREDMNIST experiments on Linux servers with NVIDIA RTX 3090Ti graphics cards with CUDA 11.3... For WILDS and DOMAINBED experiments, we run on Linux servers with NVIDIA V100 graphics cards with CUDA 10.2.
Experiment Setup | Yes | The general hyperparameter settings are inherited from the referenced code and papers, and are shown in Table 11. Table 11: General hyperparameter settings for the experiments on WILDS (includes learning rate, weight decay, batch size, optimizer, pretraining steps, maximum epochs).
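To make the shape of those settings concrete, here is a hypothetical config stub mirroring the fields Table 11 reports. Every value below is a placeholder, not a number from the paper; the actual settings are dataset-specific and listed in Table 11 itself.

```python
# Hypothetical config stub mirroring the fields of Table 11.
# All values are placeholders, NOT the paper's numbers; the actual
# settings are dataset-specific and given in Table 11 itself.
wilds_config = {
    "learning_rate": 1e-3,   # placeholder
    "weight_decay": 0.0,     # placeholder
    "batch_size": 32,        # placeholder
    "optimizer": "Adam",     # placeholder
    "pretraining_steps": 0,  # placeholder; ERM steps before the OOD objectives (assumption)
    "max_epochs": 10,        # placeholder
}
```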