Invariant Information Bottleneck for Domain Generalization

Authors: Bo Li, Yifei Shen, Yezhen Wang, Wenzhen Zhu, Colorado Reed, Dongsheng Li, Kurt Keutzer, Han Zhao

AAAI 2022, pp. 7399-7407 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, we analyze IIB's performance with extensive experiments on both synthetic and large-scale benchmarks. We show that IIB eliminates spurious information better than other existing DG methods, and achieves consistent improvements of 0.7% across 7 datasets on DomainBed (Gulrajani and Lopez-Paz 2020).
Researcher Affiliation | Collaboration | (1) Microsoft Research Asia, China; (2) Hong Kong University of Science and Technology, China; (3) Washington University in St. Louis, USA; (4) University of California, Berkeley, USA; (5) University of Illinois at Urbana-Champaign, USA
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The code implementation of IIB is released on GitHub: https://github.com/Luodian/IIB/tree/IIB
Open Datasets | Yes | CS-CMNIST (Ahuja et al. 2021) is a ten-way classification task whose images are all drawn from MNIST. ... Geometric Skew CIFAR10 (Nagarajan, Andreassen, and Neyshabur 2021) ... conduct experiments on DomainBed (Gulrajani and Lopez-Paz 2020) with 7 different datasets of different sizes. (A colored-MNIST-style construction in this spirit is sketched after the table.)
Dataset Splits | Yes | We split 20% from the train set as the validation set. ... During training, the validation set is a subset of the training set; we choose the model that performs best on the overall validation set for each domain. (See the split sketch after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. It only mentions network architectures and training iterations.
Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers (e.g., PyTorch version, Python version).
Experiment Setup | Yes | For IIB-specific hyper-parameters, we set λ ∈ [1, 10^2] and β ∈ [10^-3, 10^-4]. For the backbone feature extractor, in Rotated/Colored-MNIST we use a 4-layer 3x3 ConvNet. For VLCS and PACS, we use ResNet-18 (He et al. 2016). For larger datasets, we opt for ResNet-50. For the classifier, we test both linear and non-linear invariant (environment) classifiers. ... The network is trained for 5000 iterations with batch size set to 128. (A training-setup sketch follows the table.)
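To illustrate the kind of spurious-correlation dataset the Open Datasets row refers to, the sketch below tints MNIST digits with a color that agrees with the label most of the time. This is a generic colored-MNIST-style construction under assumed parameters (correlation strength, random palette), not the exact CS-CMNIST protocol of Ahuja et al. (2021); `colorize` and its arguments are illustrative names.

```python
# Generic colored-MNIST-style construction (illustrative, not the CS-CMNIST recipe).
import torch
from torchvision import datasets

def colorize(images, labels, corr=0.9, num_colors=10, seed=0):
    """Tint grayscale digits with a color that matches the label with probability `corr`,
    producing a spurious color-label correlation of the kind discussed in the paper."""
    g = torch.Generator().manual_seed(seed)
    palette = torch.rand(num_colors, 3, generator=g)             # one RGB tint per class (assumed palette)
    agree = torch.rand(len(labels), generator=g) < corr          # where the color follows the label
    rand_colors = torch.randint(num_colors, (len(labels),), generator=g)
    color_ids = torch.where(agree, labels, rand_colors)
    return images.unsqueeze(1) * palette[color_ids].view(-1, 3, 1, 1)  # (N, 3, 28, 28)

mnist = datasets.MNIST("./data", train=True, download=True)
images = mnist.data.float() / 255.0                              # (N, 28, 28) grayscale digits
colored = colorize(images, mnist.targets)                        # spuriously colored digits
```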
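The Dataset Splits row describes holding out 20% of each training domain as a validation set and selecting the checkpoint with the best pooled validation performance. A minimal sketch of such a split, assuming PyTorch datasets per domain (the function name and seed are illustrative):

```python
# Hold out 20% of a domain's training data as its validation set.
import torch
from torch.utils.data import DataLoader, random_split

def split_train_val(dataset, val_fraction=0.2, seed=0):
    """Return (train_subset, val_subset) with `val_fraction` of the samples held out."""
    n_val = int(len(dataset) * val_fraction)
    n_train = len(dataset) - n_val
    g = torch.Generator().manual_seed(seed)
    return random_split(dataset, [n_train, n_val], generator=g)

# Example usage per training domain (domain_dataset is a placeholder):
# train_part, val_part = split_train_val(domain_dataset)
# train_loader = DataLoader(train_part, batch_size=128, shuffle=True)
# val_loader = DataLoader(val_part, batch_size=128)
# Model selection then uses the overall validation performance across domains.
```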
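For the Experiment Setup row, the sketch below shows how the reported pieces might fit together: a 4-layer 3x3 ConvNet backbone for the MNIST variants, a linear classifier head, batch size 128, 5000 iterations, and an objective weighting an invariance penalty by λ and an information-bottleneck penalty by β. The channel sizes, the penalty placeholders, and the function names are assumptions; the exact variational terms of the IIB objective are defined in the paper, not here.

```python
# Hedged sketch of the reported training setup (placeholders marked in comments).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvEncoder(nn.Module):
    """4-layer 3x3 ConvNet backbone for Rotated/Colored-MNIST (channel sizes are illustrative)."""
    def __init__(self, in_channels=3, feat_dim=128):
        super().__init__()
        chans = [in_channels, 64, 128, 128, feat_dim]
        blocks = []
        for c_in, c_out in zip(chans[:-1], chans[1:]):
            blocks += [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2)]
        self.net = nn.Sequential(*blocks, nn.AdaptiveAvgPool2d(1), nn.Flatten())

    def forward(self, x):
        return self.net(x)

encoder = ConvEncoder()
classifier = nn.Linear(128, 10)   # linear invariant classifier; a small MLP would be the non-linear variant

# Reported search ranges: lambda in [1, 10^2], beta in [10^-3, 10^-4]; single values here are placeholders.
hparams = {"lambda": 1.0, "beta": 1e-3, "batch_size": 128, "iterations": 5000}

def iib_style_objective(logits, y, invariance_penalty, ib_penalty, lam, beta):
    """Illustrative three-term objective: task loss + lam * invariance term + beta * IB term.
    The concrete penalty definitions come from the paper, not this sketch."""
    return F.cross_entropy(logits, y) + lam * invariance_penalty + beta * ib_penalty
```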