Feature Reconstruction From Outputs Can Mitigate Simplicity Bias in Neural Networks

Authors: Sravanti Addepalli, Anshul Nasery, Venkatesh Babu Radhakrishnan, Praneeth Netrapalli, Prateek Jain

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using this simple solution, we demonstrate up to 15% gains in OOD accuracy on the recently introduced semi-synthetic datasets with extreme distribution shifts. Moreover, we demonstrate noteworthy gains over existing SOTA methods on the standard OOD benchmark DomainBed as well.
Researcher Affiliation | Collaboration | Google Research, India; Indian Institute of Science, Bangalore
Pseudocode | Yes | Our training algorithm is summarized in Algorithm-1 and the training pipeline is illustrated in Figure 2. Below we provide the python code for FRR-L in the DomainBed framework.
Open Source Code | No | The paper includes pseudocode for FRR-L in Section I, but it is embedded directly as text within the paper rather than provided via a link to an external repository, and the paper does not state that code for the full methodology is released as open source.
Open Datasets | Yes | To empirically demonstrate feature replication, we use a binarized version of the coloured MNIST dataset (Gulrajani & Lopez-Paz, 2020). We extend the simple binary MNIST-CIFAR dataset proposed by Shah et al. (2020) to a 10-class dataset... We test our approach on the DomainBed benchmark (Gulrajani & Lopez-Paz, 2020) comprising of five different datasets, each of which have k domains.
Dataset Splits | Yes | To select the best hyperparameter for both SVM and FRR, we consider the presence of a validation set whose distribution is similar to the test distribution. We use the performance of the model on in-domain validation data (i.e. the in-domain strategy by Gulrajani & Lopez-Paz (2020)) to select the best hyper-parameters, and report the average performance and standard deviation across 5 random seeds.
Hardware Specification | Yes | All our experiments were done on single V100 GPUs.
Software Dependencies | No | No specific version numbers for software dependencies (e.g., Python, PyTorch, TensorFlow, or other libraries) are provided in the paper.
Experiment Setup | Yes | The base training (E1, E2, E3) is done for 500 epochs, and the linear layer training / finetuning (E4-E18) is done for 20 epochs, without any augmentations. The batch size is fixed to 32, and SWAD hyper-parameters are the same as those used by Cha et al. (2021). We train for 3000 (5000 for DomainNet) steps in the FRR-L phase, and 5000 (10000 for DomainNet) in the FRR-FLFT phase. We use random search to select hyperparameters for our algorithm, and use the suggested hyperparameters for the other baselines. ... The range of the hyperparameters is shown in Table 5.
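The FRR-L phase quoted above trains a linear classifier on frozen backbone features while regularizing it so that the features remain (approximately) linearly reconstructible from the logits. The following is a minimal NumPy sketch of that idea only, not the authors' released DomainBed code: the dimensions, learning rate, the reconstruction weight `lam`, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem sizes (hypothetical; the paper uses deep backbones on DomainBed).
d_feat, n_classes, n = 16, 10, 64
feats = rng.normal(size=(n, d_feat))              # frozen backbone features
labels = rng.integers(0, n_classes, size=n)

W = rng.normal(scale=0.1, size=(d_feat, n_classes))  # linear classifier
U = rng.normal(scale=0.1, size=(n_classes, d_feat))  # logits -> features reconstruction head

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_ce(logits):
    return -np.log(softmax(logits)[np.arange(n), labels]).mean()

init_ce = mean_ce(feats @ W)

lam, lr = 1.0, 0.1
for _ in range(200):
    logits = feats @ W
    probs = softmax(logits)
    recon = logits @ U                            # reconstruct features from logits
    err = recon - feats

    # Gradient of the mean cross-entropy w.r.t. the logits.
    g_logits = probs.copy()
    g_logits[np.arange(n), labels] -= 1.0
    g_logits /= n
    # Add the reconstruction penalty, chained through recon = logits @ U.
    g_logits += lam * (2.0 / n) * (err @ U.T)

    W -= lr * (feats.T @ g_logits)
    U -= lr * (2.0 / n) * (logits.T @ err)

ce = mean_ce(feats @ W)
recon_mse = float(((feats @ W @ U - feats) ** 2).mean())
print(round(float(init_ce), 3), round(float(ce), 3), round(recon_mse, 3))
```

Joint gradient descent drives down both the classification loss and the feature-reconstruction error, so after training the 10-dimensional logits retain a linear readout of the 16-dimensional features; the intuition, per the paper, is that this discourages the classifier from collapsing onto a few "simple" feature directions.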