Improving Out-of-distribution Generalization with Indirection Representations

Authors: Kha Pham, Hung Le, Man Ngo, Truyen Tran

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that InLay is consistently effective in improving out-of-distribution generalization throughout a comprehensive suite of experiments, including IQ problems, distorted image classification, and few-shot domain adaptation NLP classification.
Researcher Affiliation | Academia | (1) Applied Artificial Intelligence Institute, Deakin University; (2) Faculty of Mathematics and Computer Science, VNUHCM-University of Science
Pseudocode | No | The paper describes the operations of the Indirection Layer verbally and mathematically but does not include structured pseudocode or algorithm blocks (an illustrative, assumption-based sketch is given after the table).
Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the described method.
Open Datasets | Yes | We use a vanilla 6-layer Vision Transformer (ViT) (Dosovitskiy et al., 2020) as the base model and test it... on different datasets, namely the SVHN (Netzer et al., 2011) and CIFAR10&100 (Krizhevsky, 2009). Gao et al. (2019) proposed the FewRel 2.0 dataset... (A dataset-loading sketch follows the table.)
Dataset Splits | No | The paper mentions training and testing sets and reports validation results for one task ("Since the test set is not provided for the public, we only report the test results on validation set"). However, it does not consistently provide specific validation-split details (percentages, counts) across all experiments, which are needed to reproduce them.
Hardware Specification | Yes | Models are trained on a single Tesla V100-SXM2 GPU.
Software Dependencies | No | The paper mentions software components such as the "Adam optimizer", "Vision Transformer (ViT)", and "BERT encoder", along with their respective foundational papers, but does not specify version numbers for the underlying libraries (e.g., PyTorch 1.9, TensorFlow 2.x).
Experiment Setup | Yes | We use Adam optimizer (Kingma and Ba, 2014) with learning rates ranging from 10^-5 to 3 x 10^-4, depending on specific model and transformation. We train all models with batch size 32 in 200 epochs. (A hedged training-loop sketch follows the table.)
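
Because the paper describes the Indirection Layer only verbally and mathematically, the Python sketch below shows one assumed reading of a similarity-driven indirection layer: learned, data-independent slot embeddings are mixed according to the pairwise-similarity structure of the input tokens. The class name, slot count, projection, and mixing scheme are all illustrative assumptions and do not reproduce the paper's InLay equations.

import torch
import torch.nn as nn
import torch.nn.functional as F

class IndirectionSketch(nn.Module):
    # Hypothetical layer: NOT the paper's InLay, only an assumed illustration.
    def __init__(self, num_slots: int, dim: int):
        super().__init__()
        # Data-independent "abstract" slot embeddings (assumed component).
        self.slots = nn.Parameter(torch.randn(num_slots, dim) * 0.02)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim)
        scale = x.shape[-1] ** 0.5
        # Pairwise similarity structure of the input tokens.
        sim = F.softmax(x @ x.transpose(1, 2) / scale, dim=-1)          # (B, T, T)
        # Attend from projected tokens to the learned slots.
        attn = F.softmax(self.proj(x) @ self.slots.T / scale, dim=-1)   # (B, T, S)
        indirect = attn @ self.slots                                    # (B, T, D)
        # Re-mix the slot-based representations with the input's own
        # relational pattern, so downstream layers see "indirect" tokens.
        return sim @ indirect                                           # (B, T, D)

# Shape check on random data.
layer = IndirectionSketch(num_slots=16, dim=64)
print(layer(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])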
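
The image datasets named in the Open Datasets row are all available through torchvision; a minimal loading sketch follows. The transform pipeline is an assumption, since the paper's exact preprocessing is not quoted here, and FewRel 2.0 is distributed separately by its authors.

from torchvision import datasets, transforms

# Minimal transform; the paper's actual augmentation/normalization is not specified here.
transform = transforms.Compose([transforms.ToTensor()])

svhn_train = datasets.SVHN(root="./data", split="train", download=True, transform=transform)
cifar10_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
cifar100_train = datasets.CIFAR100(root="./data", train=True, download=True, transform=transform)
# FewRel 2.0 (Gao et al., 2019) is released by its authors and is not part of torchvision.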
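
The Experiment Setup row reports Adam with learning rates between 10^-5 and 3 x 10^-4, batch size 32, and 200 epochs. The sketch below wires those reported values into a generic PyTorch training loop; the model, loss, and dataset are placeholders, not the paper's.

import torch
from torch.utils.data import DataLoader

def train(model, train_set, lr=3e-4, epochs=200, batch_size=32, device="cuda"):
    # lr is reported to range from 1e-5 to 3e-4 depending on model and transformation.
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = torch.nn.CrossEntropyLoss()  # placeholder loss; task-dependent
    model.to(device).train()
    for _ in range(epochs):
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), labels)
            loss.backward()
            optimizer.step()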