One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks
Authors: Shutong Wu, Sizhe Chen, Cihang Xie, Xiaolin Huang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate OPS and its counterparts in 6 architectures, 6 model sizes, 8 training strategies on CIFAR-10 (Krizhevsky et al., 2009) and ImageNet (Russakovsky et al., 2015) subset, and find that OPS is always superior in degrading models' testing accuracy than EM ULEs. |
| Researcher Affiliation | Academia | Shutong Wu¹, Sizhe Chen¹, Cihang Xie², Xiaolin Huang¹ (¹Department of Automation, Shanghai Jiao Tong University; ²Computer Science and Engineering, University of California, Santa Cruz) |
| Pseudocode | Yes | Algorithm 1 Model-Free Searching for One-Pixel Shortcut |
| Open Source Code | Yes | code available at https://github.com/cychomatica/One-Pixel-Shotcut. |
| Open Datasets | Yes | We evaluate OPS and its counterparts in 6 architectures, 6 model sizes, 8 training strategies on CIFAR-10 (Krizhevsky et al., 2009) and ImageNet (Russakovsky et al., 2015) subset |
| Dataset Splits | Yes | Table 1: The testing accuracy of ResNet-18 models trained on unshuffled and shuffled data. ... We train different convolutional networks and vision transformers on the One-Pixel Shortcut CIFAR-10 training set, and evaluate their performance on the unmodified CIFAR-10 test set. |
| Hardware Specification | Yes | Our experiments are implemented on CIFAR-10 and ImageNet subset, using 4 NVIDIA RTX 2080Ti GPUs. |
| Software Dependencies | No | The paper mentions optimizers such as SGD and AdamW but does not specify software versions for libraries like PyTorch or TensorFlow, or a specific Python version. |
| Experiment Setup | Yes | For all the convolutional networks, we use an SGD optimizer with a learning rate set to 0.1, momentum set to 0.9, and weight decay set to 5e-4. For all the compact vision transformers, we use AdamW optimizer with β1 = 0.9, β2 = 0.999, learning rate set to 5e-4, and weight decay set to 3e-2. Batch size is set to 128 for all the models except WideResNet-28-10, where it is set to 64. ... All the models are trained for 200 epochs with a multi-step learning rate schedule, and the training accuracy of each model is guaranteed to reach near 100%. |
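
The Experiment Setup and Dataset Splits rows above describe a concrete training recipe: train on the OPS-perturbed CIFAR-10 training set, evaluate on the unmodified test set, using SGD (lr 0.1, momentum 0.9, weight decay 5e-4) for CNNs or AdamW (lr 5e-4, betas 0.9/0.999, weight decay 3e-2) for compact vision transformers, for 200 epochs with a multi-step schedule. The following is a minimal PyTorch sketch of that recipe, not the authors' released code: the `ops_train_set` argument, the model constructor, and the learning-rate milestones are assumptions, since the excerpt only states "multi-step learning rate schedule".

```python
import torch
from torch.utils.data import DataLoader
import torchvision
import torchvision.transforms as T


def make_optimizer(model, is_transformer: bool):
    """Optimizer settings quoted in the Experiment Setup row."""
    if is_transformer:
        # Compact vision transformers: AdamW, lr 5e-4, betas (0.9, 0.999), wd 3e-2.
        return torch.optim.AdamW(model.parameters(), lr=5e-4,
                                 betas=(0.9, 0.999), weight_decay=3e-2)
    # Convolutional networks: SGD, lr 0.1, momentum 0.9, wd 5e-4.
    return torch.optim.SGD(model.parameters(), lr=0.1,
                           momentum=0.9, weight_decay=5e-4)


def train_and_eval(model, ops_train_set, device="cuda",
                   is_transformer=False, batch_size=128):
    # batch_size=64 for WideResNet-28-10, per the excerpt.
    # Train on the One-Pixel Shortcut CIFAR-10 training set (placeholder dataset)...
    train_loader = DataLoader(ops_train_set, batch_size=batch_size, shuffle=True)
    # ...and evaluate on the unmodified CIFAR-10 test set.
    test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                            download=True, transform=T.ToTensor())
    test_loader = DataLoader(test_set, batch_size=256, shuffle=False)

    optimizer = make_optimizer(model, is_transformer)
    # Assumed milestones: the paper excerpt only says "multi-step" over 200 epochs.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                     milestones=[100, 150], gamma=0.1)
    criterion = torch.nn.CrossEntropyLoss()

    model = model.to(device)
    for epoch in range(200):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            criterion(model(x), y).backward()
            optimizer.step()
        scheduler.step()

    # Clean-test-set accuracy: the metric the paper reports for poisoned training.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in test_loader:
            x, y = x.to(device), y.to(device)
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / total
```

For the authors' actual data-poisoning procedure (Algorithm 1, the model-free one-pixel search) and exact schedules, refer to the released repository linked in the Open Source Code row.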