Shape-Texture Debiased Neural Network Training

Authors: Yingwei Li, Qihang Yu, Mingxing Tan, Jieru Mei, Peng Tang, Wei Shen, Alan Yuille, Cihang Xie

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that our method successfully improves model performance on several image recognition benchmarks and adversarial robustness. Our ablation shows that such bias degenerates model performance. To reduce the computational overhead in this ablation, all models are trained and evaluated on ImageNet-200.
Researcher Affiliation | Collaboration | Yingwei Li1, Qihang Yu1, Mingxing Tan2, Jieru Mei1, Peng Tang1, Wei Shen3, Alan Yuille1 & Cihang Xie4; 1Johns Hopkins University, 2Google Brain, 3Shanghai Jiao Tong University, 4University of California, Santa Cruz
Pseudocode | No | The paper includes figures illustrating processes (Figure 2, Figure 5) but does not provide structured pseudocode or algorithm blocks. (A hypothetical reconstruction of the training step is sketched after this table.)
Open Source Code | Yes | The code is available here: https://github.com/LiYingwei/Shape-Texture-Debiased-Training
Open Datasets | Yes | We evaluate models on ImageNet classification and PASCAL VOC semantic segmentation. ImageNet dataset (Russakovsky et al., 2015) consists of 1.2 million images for training, and 50,000 for validation, from 1,000 classes. PASCAL VOC 2012 segmentation dataset (Everingham et al., 2012) with extra annotated images from (Hariharan et al., 2011) involves 20 foreground object classes and one background class, including 10,582 training images and 1,449 validation images.
Dataset Splits | Yes | ImageNet dataset (Russakovsky et al., 2015) consists of 1.2 million images for training, and 50,000 for validation, from 1,000 classes. PASCAL VOC 2012 segmentation dataset [...] including 10,582 training images and 1,449 validation images. To reduce the computational overhead in this ablation, all models are trained and evaluated on ImageNet-200, which is a 200-class subset of the original ImageNet, including 100,000 images (500 images per class) for training and 10,000 images (50 images per class) for validation. (The subset construction is sketched after this table.)
Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, or cloud computing specifications) used for running the experiments.
Software Dependencies | No | The paper states "our implementation is based on the publicly available framework in PyTorch," but it does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | To generate cue conflict images, we follow Geirhos et al. (2019) to use Adaptive Instance Normalization (Huang & Belongie, 2017) in style transfer, and set stylization coefficient α = 0.5. We choose the shape-texture coefficient γ = 0.8 when assigning labels. We set the maximum perturbation change per pixel ϵ = 16/255 for FGSM. When training shape-biased, texture-biased, and our shape-texture debiased models, we always apply the auxiliary batch normalization (BN) design. (A generic FGSM sketch follows the table.)
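
Since the paper provides no pseudocode, the following is a minimal, hypothetical sketch of one shape-texture debiased training step as we read it from the paper's description: a cue-conflict image is synthesized via AdaIN-style transfer with stylization coefficient α = 0.5, and its label mixes the shape (content) label with weight γ = 0.8 and the texture (style) label with weight 1 − γ. All names here are our assumptions rather than the authors' code, the `adain_style_transfer` helper is a crude pixel-space stand-in for feature-space AdaIN, and the auxiliary-BN routing quoted above is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def adain_style_transfer(content, style, alpha=0.5):
    # Pixel-space stand-in for AdaIN: re-normalize the content image's
    # per-channel statistics to the style image's, then blend with the
    # stylization coefficient alpha. The paper uses feature-space AdaIN
    # (Huang & Belongie, 2017); this stand-in only keeps the sketch
    # self-contained and runnable.
    c_mean, c_std = content.mean((2, 3), keepdim=True), content.std((2, 3), keepdim=True)
    s_mean, s_std = style.mean((2, 3), keepdim=True), style.std((2, 3), keepdim=True)
    stylized = (content - c_mean) / (c_std + 1e-5) * s_std + s_mean
    return alpha * stylized + (1.0 - alpha) * content

def debiased_training_step(model, optimizer, content_imgs, content_labels,
                           style_imgs, style_labels,
                           alpha=0.5, gamma=0.8, num_classes=1000):
    """Hypothetical reconstruction of one debiased step (not the authors' code)."""
    # Cue-conflict images: shape from content_imgs, texture from style_imgs.
    cue_conflict = adain_style_transfer(content_imgs, style_imgs, alpha=alpha)

    # Soft labels: weight gamma on the shape label, 1 - gamma on the texture label.
    shape_onehot = F.one_hot(content_labels, num_classes).float()
    texture_onehot = F.one_hot(style_labels, num_classes).float()
    soft_labels = gamma * shape_onehot + (1.0 - gamma) * texture_onehot

    # Cross-entropy against the soft labels.
    log_probs = F.log_softmax(model(cue_conflict), dim=1)
    loss = -(soft_labels * log_probs).sum(dim=1).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```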
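
The ImageNet-200 ablation subset is described only by its split sizes; the paper does not say which 200 classes are used. Below is a hypothetical sketch that reproduces the quoted sizes (500 training / 50 validation images per class), assuming a torchvision-style directory layout with `train/<wnid>/` and `val/<wnid>/` folders; the random class choice and every name are ours, not the authors'.

```python
import random
from pathlib import Path

def build_imagenet200_file_lists(root, n_classes=200, n_train=500,
                                 n_val=50, seed=0):
    # Pick 200 of the 1,000 ImageNet classes (the paper does not specify
    # which), then 500 training and 50 validation images per class,
    # matching the 100,000 / 10,000 split sizes quoted above.
    rng = random.Random(seed)
    classes = sorted(p.name for p in (Path(root) / "train").iterdir())
    chosen = rng.sample(classes, n_classes)
    train_files, val_files = [], []
    for wnid in chosen:
        train_imgs = sorted((Path(root) / "train" / wnid).glob("*.JPEG"))
        val_imgs = sorted((Path(root) / "val" / wnid).glob("*.JPEG"))
        train_files += [(f, wnid) for f in rng.sample(train_imgs, n_train)]
        val_files += [(f, wnid) for f in val_imgs[:n_val]]
    return train_files, val_files
```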
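
For the robustness evaluation, the quoted ϵ = 16/255 is the single-step FGSM budget. A generic, textbook FGSM sketch follows, assuming inputs normalized to [0, 1]; this is not the authors' evaluation code.

```python
import torch
import torch.nn.functional as F

def fgsm(model, images, labels, epsilon=16 / 255):
    # Fast Gradient Sign Method: one gradient step of size epsilon in the
    # sign direction of the loss gradient, clamped back to [0, 1].
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv = images + epsilon * images.grad.sign()
    return adv.clamp(0.0, 1.0).detach()
```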