Invariance Learning based on Label Hierarchy

Authors: Shoji Toyota, Kenji Fukumizu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The effectiveness of the proposed framework, including the cross-validation, is demonstrated empirically. Theoretical analysis reveals that our framework can estimate the desirable invariant predictor with a hyperparameter fixed correctly, and that such a preferable hyperparameter is chosen by the proposed CV methods under some conditions. We study the effectiveness of the proposed framework and CVs through experiments, comparing them with several existing methods: empirical risk minimization (ERM), fine-tuning methods, and deep domain adaptation strategies.
Researcher Affiliation | Academia | Shoji Toyota, The Graduate University for Advanced Studies, Tokyo 190-8562, Japan, shoji@ism.ac.jp; Kenji Fukumizu, The Institute of Statistical Mathematics, The Graduate University for Advanced Studies, Tokyo 190-8562, Japan, fukumizu@ism.ac.jp
Pseudocode | Yes | Algorithm 1 (CV methods). If CORRECTION = True, λ is selected by method II; if False, by method I.
Open Source Code | Yes | The code is available in Supplementary Material.
Open Datasets | Yes | Colored MNIST: We apply our framework to Colored MNIST [10] with Y = [10] and Z := [2]. ImageNet: To see the performance of the proposed methods on more practical data, they are applied to ImageNet [53] with its labels reannotated, imitating BREEDS [52]. [53] J. Deng, A. Berg, S. Satheesh, H. Su, A. Khosla, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR, 2009.
Dataset Splits | Yes | We propose two methods of cross-validation (CV) for hyperparameter selection in our new IL framework. Algorithm 1 (CV methods). If CORRECTION = True, λ is selected by method II; if False, by method I. Require: Split D_e, D_ad^{e_1}, ..., D_ad^{e_n} into K parts.
Hardware Specification | Yes | All experiments were performed on a machine with an NVIDIA Tesla V100 GPU.
Software Dependencies | Yes | All our code is written using PyTorch 1.10.1.
Experiment Setup | Yes | Setting the maximum epoch to 500 and λ_before := 1.0, we select (t, λ_after) from 4 × 7 candidates with t ∈ {0, 100, 200, 300} and λ_after ∈ {10^0, 10^1, ..., 10^6} by each of the CVs. We use Adam [37] with β1 = 0.5, β2 = 0.9, and learning rate 0.0001, and decay the learning rate by a factor of 0.1 at 80% of total epochs.