One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks

Authors: Shutong Wu, Sizhe Chen, Cihang Xie, Xiaolin Huang

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate OPS and its counterparts in 6 architectures, 6 model sizes, and 8 training strategies on CIFAR-10 (Krizhevsky et al., 2009) and an ImageNet (Russakovsky et al., 2015) subset, and find that OPS is always superior to EM ULEs in degrading models' testing accuracy.
Researcher Affiliation | Academia | Shutong Wu (1), Sizhe Chen (1), Cihang Xie (2) & Xiaolin Huang (1); (1) Department of Automation, Shanghai Jiao Tong University; (2) Computer Science and Engineering, University of California, Santa Cruz
Pseudocode | Yes | Algorithm 1: Model-Free Searching for One-Pixel Shortcut (a search sketch follows this table).
Open Source Code | Yes | Code available at https://github.com/cychomatica/One-Pixel-Shotcut.
Open Datasets | Yes | We evaluate OPS and its counterparts in 6 architectures, 6 model sizes, and 8 training strategies on CIFAR-10 (Krizhevsky et al., 2009) and an ImageNet (Russakovsky et al., 2015) subset.
Dataset Splits | Yes | Table 1: The testing accuracy of ResNet-18 models trained on unshuffled and shuffled data. ... We train different convolutional networks and vision transformers on the One-Pixel Shortcut CIFAR-10 training set and evaluate their performance on the unmodified CIFAR-10 test set (a data-loading sketch follows this table).
Hardware Specification | Yes | Our experiments are implemented on CIFAR-10 and an ImageNet subset, using 4 NVIDIA RTX 2080 Ti GPUs.
Software Dependencies | No | The paper mentions optimizers such as SGD and AdamW but does not specify versions for libraries like PyTorch or TensorFlow, nor a Python version.
Experiment Setup | Yes | For all convolutional networks, we use an SGD optimizer with learning rate 0.1, momentum 0.9, and weight decay 5e-4. For all compact vision transformers, we use an AdamW optimizer with β1 = 0.9, β2 = 0.999, learning rate 5e-4, and weight decay 3e-2. Batch size is 128 for all models except Wide ResNet-28-10, where it is 64. ... All models are trained for 200 epochs with a multi-step learning rate schedule, and the training accuracy of each model is guaranteed to reach near 100%. (A training-configuration sketch follows this table.)
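The Pseudocode row above refers to Algorithm 1, a model-free search for the shortcut pixel. Below is a minimal NumPy sketch of a search of this kind for a single class. The scoring function here (mean pixel change minus its standard deviation across the class, rewarding a pixel that is far from the natural images yet consistent within the class), the grayscale simplification, and all function names are illustrative assumptions, not the paper's verbatim objective.

```python
import numpy as np

def search_one_pixel_shortcut(images, candidate_values=(0.0, 1.0)):
    """Search the images of ONE class for a single (position, value) pair.

    images: float array of shape (N, H, W), pixel values in [0, 1]
    (grayscale for brevity; the paper operates on RGB images).
    """
    best_pos, best_val, best_score = None, None, -np.inf
    for v in candidate_values:
        change = np.abs(images - v)                        # |x_i(p) - v|, shape (N, H, W)
        score = change.mean(axis=0) - change.std(axis=0)   # per-position score, shape (H, W)
        pos = np.unravel_index(np.argmax(score), score.shape)
        if score[pos] > best_score:
            best_pos, best_val, best_score = pos, v, score[pos]
    return best_pos, best_val

def apply_shortcut(images, pos, value):
    """Overwrite the selected pixel in every image of the class."""
    out = images.copy()
    out[:, pos[0], pos[1]] = value
    return out
```

Because the search uses only pixel statistics of the training images, it needs no model or gradients, which is what makes it "model-free".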
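The Dataset Splits row describes the evaluation protocol: train on the OPS-modified CIFAR-10 training set and test on the untouched test set. A torchvision sketch of that split follows; the `OPSDataset` wrapper and the `shortcuts` mapping are hypothetical names for illustration.

```python
import torch
import torchvision
import torchvision.transforms as T

to_tensor = T.ToTensor()

# Clean CIFAR-10; under the paper's protocol the TRAINING images are replaced
# by their One-Pixel Shortcut versions, while the test set stays unmodified.
train_set = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=to_tensor)
test_set = torchvision.datasets.CIFAR10("./data", train=False, download=True, transform=to_tensor)

class OPSDataset(torch.utils.data.Dataset):
    """Hypothetical wrapper that writes each class's shortcut pixel into
    every training image; `shortcuts` maps label -> ((row, col), value)."""
    def __init__(self, base, shortcuts):
        self.base, self.shortcuts = base, shortcuts
    def __len__(self):
        return len(self.base)
    def __getitem__(self, i):
        x, y = self.base[i]
        (r, c), v = self.shortcuts[y]
        x = x.clone()
        x[:, r, c] = v          # same pixel and value for the whole class
        return x, y
```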
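The Experiment Setup row translates directly into optimizer configurations. A PyTorch sketch under the quoted hyperparameters follows; the learning-rate milestones are an assumption, since the quoted text only says "multi-step" without naming them.

```python
from torch import nn
from torch.optim import SGD, AdamW
from torch.optim.lr_scheduler import MultiStepLR
from torchvision.models import resnet18

def conv_optimizer(model: nn.Module) -> SGD:
    """SGD setup quoted for all convolutional networks."""
    return SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)

def vit_optimizer(model: nn.Module) -> AdamW:
    """AdamW setup quoted for the compact vision transformers."""
    return AdamW(model.parameters(), lr=5e-4, betas=(0.9, 0.999), weight_decay=3e-2)

model = resnet18(num_classes=10)   # CIFAR-10 has 10 classes
optimizer = conv_optimizer(model)

# 200 epochs with a multi-step schedule; these milestones are assumed,
# not taken from the quoted setup.
scheduler = MultiStepLR(optimizer, milestones=[100, 150], gamma=0.1)
```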