Overcoming Simplicity Bias in Deep Networks using a Feature Sieve

Authors: Rishabh Tiwari, Pradeep Shenoy

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide concrete evidence of this differential suppression & enhancement of relevant features on both controlled datasets and real-world images, and report substantial gains on many real-world debiasing benchmarks (11.4% relative gain on ImageNet-A; 3.2% on BAR, etc.)."
Researcher Affiliation | Industry | "Rishabh Tiwari¹, Pradeep Shenoy¹. ¹Google Research India. Correspondence to: Pradeep Shenoy <shenoypradeep@google.com>."
Pseudocode | Yes |
    Algorithm 1 SIFER: Mitigating simplicity bias
    Input: pretrained model weights W; training data D; training iterations N
    Hparams: aux depth AD; aux position AP; main lr weight α1; aux lr weight α2;
             aux forget weight α3; forget-after iters F
    Output: robust model weights W
    for k = 1 ... N do
        (x, y) ← sample(D)
        ŷ, ŷ_aux ← ForwardWithAux(x, AD, AP, W)
        L1 ← CE(ŷ, y);  L2 ← CE(ŷ_aux, y);  Lf ← CE(ŷ_aux, U)
        L ← α1·L1 + α2·L2
        if k mod F == 0 then
            L ← L + α3·Lf
        end
        ∇W ← Backward(L)
        W ← OptimizerStep(∇W)
    end
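As a companion to the pseudocode, the following is a minimal PyTorch sketch of one SIFER-style training step. The names (sifer_step, aux_head, forget_every) and the assumption that the model returns both intermediate features and main logits are illustrative, not the paper's actual implementation; soft-target cross-entropy against a uniform target stands in for CE(ŷ_aux, U).

    import torch
    import torch.nn.functional as F

    def sifer_step(model, aux_head, optimizer, x, y, k, num_classes,
                   alpha1=1.0, alpha2=1.0, alpha3=1.0, forget_every=50):
        """One SIFER-style update: main and aux cross-entropy losses, plus a
        periodic 'forgetting' loss pushing the aux head toward uniform outputs."""
        feats, logits = model(x)      # assumed: model returns (intermediate features, main logits)
        aux_logits = aux_head(feats)  # auxiliary classifier attached at an intermediate block

        loss = alpha1 * F.cross_entropy(logits, y) + alpha2 * F.cross_entropy(aux_logits, y)

        if k % forget_every == 0:
            # Lf = CE(ŷ_aux, U): soft-target cross-entropy against the uniform
            # distribution (float targets of shape (N, C) need PyTorch >= 1.10).
            uniform = torch.full_like(aux_logits, 1.0 / num_classes)
            loss = loss + alpha3 * F.cross_entropy(aux_logits, uniform)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()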
Open Source Code | Yes | Code available at https://github.com/google-research/google-research/sifer
Open Datasets | Yes | CMNIST: Colored-MNIST... CIFAR-MNIST... BAR: Biased Activity Recognition (Nam et al., 2020)... CelebA: CelebA (Liu et al., 2018)... NICO: NICO (He et al., 2021)... ImageNet-9: ImageNet-9 (Xiao et al., 2020)... ImageNet-A: ImageNet-A (Hendrycks et al., 2021).
Dataset Splits | Yes | "For BAR, since there is no validation data provided, we study it under two settings. In the first, we use 20% of images from the test set and call it OOD validation. In the second setting, which is harder and more realistic, we use 20% of images from the train set, calling it In-Domain (ID) validation. For NICO-Animal, CelebA-Hair, and ImageNet-9, we use the already-supplied validation data. Table 1 shows the percentage of bias-conflicting examples, i.e., examples that violate the spurious feature correlation or training domain, for each portion of each dataset."
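Since BAR ships without a validation set, here is a minimal sketch of the two validation settings described above, using torch.utils.data.random_split; the function name make_bar_validation and the dictionary keys are illustrative, not part of the paper's released code.

    import torch
    from torch.utils.data import random_split

    def make_bar_validation(train_set, test_set, frac=0.20, seed=0):
        """OOD validation: hold out 20% of the test set.
        ID validation: hold out 20% of the train set (harder, more realistic)."""
        g = torch.Generator().manual_seed(seed)

        n = int(frac * len(test_set))
        ood_val, test_remainder = random_split(
            test_set, [n, len(test_set) - n], generator=g)

        m = int(frac * len(train_set))
        id_val, train_remainder = random_split(
            train_set, [m, len(train_set) - m], generator=g)

        return {"ood_val": ood_val, "test_for_ood_setting": test_remainder,
                "id_val": id_val, "train_for_id_setting": train_remainder}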
Hardware Specification | No | The paper does not explicitly describe the hardware used for the experiments, such as specific GPU or CPU models, memory amounts, or cloud-computing instance types.
Software Dependencies | No | The paper does not provide specific software dependency details with version numbers, such as programming-language versions, library versions, or deep learning framework versions (e.g., "optimized using SGD optimizer" does not include version information).
Experiment Setup | Yes | "For all our real-world experiments we consistently used ResNet-18 with an auxiliary layer that uses the same layer structure as the BasicBlock of ResNet, with varying depth, optimized using the SGD optimizer with a fixed learning rate of 0.001. Table 7 shows the hyperparameter search space for all the hyperparameters that we tune on the basis of the validation set. Table 8 shows the hyperparameter values obtained from the hyperparameter tuning."
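A sketch of that setup in PyTorch/torchvision, assuming the auxiliary head is built from ResNet BasicBlocks and attached after a configurable backbone stage; the class name ResNet18WithAux and the aux_position/aux_depth wiring are illustrative stand-ins for the paper's AP and AD hyperparameters, not its released implementation.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18
    from torchvision.models.resnet import BasicBlock

    class ResNet18WithAux(nn.Module):
        """ResNet-18 with an auxiliary classifier made of BasicBlocks,
        attached after stage `aux_position` with `aux_depth` blocks."""
        def __init__(self, num_classes, aux_position=2, aux_depth=1):
            super().__init__()
            self.backbone = resnet18(num_classes=num_classes)
            widths = [64, 128, 256, 512]  # channel widths of layer1..layer4
            c = widths[aux_position - 1]
            self.aux_position = aux_position
            self.aux_head = nn.Sequential(
                *[BasicBlock(c, c) for _ in range(aux_depth)],
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(c, num_classes))

        def forward(self, x):
            b = self.backbone
            x = b.maxpool(b.relu(b.bn1(b.conv1(x))))
            aux_logits = None
            for i, stage in enumerate([b.layer1, b.layer2, b.layer3, b.layer4], 1):
                x = stage(x)
                if i == self.aux_position:
                    aux_logits = self.aux_head(x)
            logits = b.fc(torch.flatten(b.avgpool(x), 1))
            return logits, aux_logits

    # e.g., the 9 superclasses of ImageNet-9; fixed learning rate per the paper
    model = ResNet18WithAux(num_classes=9)
    opt = torch.optim.SGD(model.parameters(), lr=0.001)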