Shape-Texture Debiased Neural Network Training
Authors: Yingwei Li, Qihang Yu, Mingxing Tan, Jieru Mei, Peng Tang, Wei Shen, Alan Yuille, Cihang Xie
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our method successfully improves model performance on several image recognition benchmarks and adversarial robustness. Our ablation shows that such bias degenerates model performance. To reduce the computational overhead in this ablation, all models are trained and evaluated on ImageNet-200 |
| Researcher Affiliation | Collaboration | Yingwei Li1, Qihang Yu1, Mingxing Tan2, Jieru Mei1, Peng Tang1, Wei Shen3 Alan Yuille1 & Cihang Xie4 1Johns Hopkins University 2Google Brain 3Shanghai Jiaotong University 4University of California, Santa Cruz |
| Pseudocode | No | The paper includes figures illustrating processes (Figure 2, Figure 5) but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available here: https://github.com/LiYingwei/ShapeTextureDebiasedTraining |
| Open Datasets | Yes | We evaluate models on ImageNet classification and PASCAL VOC semantic segmentation. ImageNet dataset (Russakovsky et al., 2015) consists of 1.2 million images for training, and 50,000 for validation, from 1,000 classes. PASCAL VOC 2012 segmentation dataset (Everingham et al., 2012) with extra annotated images from (Hariharan et al., 2011) involves 20 foreground object classes and one background class, including 10,582 training images and 1,449 validation images. |
| Dataset Splits | Yes | ImageNet dataset (Russakovsky et al., 2015) consists of 1.2 million images for training, and 50,000 for validation, from 1,000 classes. PASCAL VOC 2012 segmentation dataset [...] including 10,582 training images and 1,449 validation images. To reduce the computational overhead in this ablation, all models are trained and evaluated on ImageNet-200, which is a 200 classes subset of the original ImageNet, including 100,000 images (500 images per class) for training and 10,000 images (50 images per class) for validation. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU, GPU models, or cloud computing specifications) used for running the experiments. |
| Software Dependencies | No | The paper states "our implementation is based on the publicly available framework in PyTorch," but it does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | To generate cue conflict images, we follow Geirhos et al. (2019) to use Adaptive Instance Normalization (Huang & Belongie, 2017) in style transfer, and set stylization coefficient α = 0.5. We choose the shape-texture coefficient γ = 0.8 when assigning labels. We set the maximum perturbation change per pixel ϵ = 16/255 for FGSM. When training shape-biased, texture-biased and our shape-texture debiased models, we always apply the auxiliary batch normalization (BN) design. |
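The two hyperparameters quoted in the setup row translate directly into code: the shape-texture coefficient γ = 0.8 weights the soft label of a cue-conflict image toward its shape class, and ε = 16/255 bounds the single-step FGSM perturbation. The sketch below (NumPy, with hypothetical function names; the paper's actual implementation is in PyTorch) illustrates both pieces under those assumptions:

```python
import numpy as np

def debiased_label(shape_class, texture_class, num_classes=1000, gamma=0.8):
    """Soft label for a cue-conflict image: weight gamma on the shape class
    and (1 - gamma) on the texture class (gamma = 0.8 in the paper)."""
    label = np.zeros(num_classes, dtype=np.float32)
    label[shape_class] += gamma
    label[texture_class] += 1.0 - gamma
    return label

def fgsm_perturb(x, grad, eps=16 / 255):
    """Single-step FGSM: shift each pixel by eps in the sign of the loss
    gradient, then clip back to the valid [0, 1] image range."""
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

# A cue-conflict image with shape class 3 and texture class 7 (10 classes)
label = debiased_label(shape_class=3, texture_class=7, num_classes=10)
# label[3] == 0.8, label[7] == 0.2, all other entries 0.0
```

The auxiliary-BN detail from the same row is a model-architecture choice (separate batch-norm statistics for clean and stylized inputs) and is not shown here.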