SimAM: A Simple, Parameter-Free Attention Module for Convolutional Neural Networks
Authors: Lingxiao Yang, Ru-Yuan Zhang, Lida Li, Xiaohua Xie
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct a series of experiments across a wide range of tasks to verify the effectiveness of our SimAM. Quantitative evaluations on various visual tasks demonstrate that the proposed module is flexible and effective to improve the representation ability of many ConvNets. |
| Researcher Affiliation | Academia | 1School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China 2Guangdong Province Key Laboratory of Information Security Technology, Sun Yat-sen University, Guangzhou, China 3Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, Sun Yat-sen University, Guangzhou, China 4Institute of Psychology and Behavioral Science, Shanghai Jiao Tong University, Shanghai, China 5Shanghai Key Laboratory of Psychotic Disorders, Shanghai Mental Health Center, Shanghai Jiao Tong University, Shanghai, China 6The Hong Kong Polytechnic University, Hong Kong, China. |
| Pseudocode | Yes | Figure 3. A PyTorch-like implementation of our SimAM. |
| Open Source Code | Yes | Our code is available at Pytorch-SimAM. |
| Open Datasets | Yes | To begin, we test our methods on image classification tasks based on CIFAR (Krizhevsky et al., 2009). There are two variants: one has 10 categories and the other one contains 100 classes. Both variants have 50k training and 10k validation images. In this section, we evaluate our SimAM on ImageNet (Russakovsky et al., 2015) that consists of 1000 classes. All models are trained on 1.2M training images and tested on 50K validation images with the standard setup. |
| Dataset Splits | Yes | There are two variants: one has 10 categories and the other one contains 100 classes. Both variants have 50k training and 10k validation images. All models are trained on 1.2M training images and tested on 50K validation images with the standard setup. |
| Hardware Specification | Yes | All ResNets are optimized by SGD with a batch size of 256 on 4 GPUs (Quadro RTX 8000). |
| Software Dependencies | No | The paper mentions using PyTorch for implementation (Figure 3) and "mmdetection (Chen et al., 2019)" for object detection, but it does not specify version numbers for these or other software libraries (e.g., PyTorch 1.x, CUDA 11.x). |
| Experiment Setup | Yes | We follow the standard training pipeline (Lee et al., 2015; He et al., 2016b) for all models. Specifically, each image is zero-padded with 4 pixels on each side, and a 32 × 32 image fed for training is randomly cropped from that padded image or its horizontal flip. Optimization is done by an SGD solver with a momentum of 0.9, a batch size of 128, and a weight decay of 0.0005. The learning rate starts at 0.1 and is divided by 10 at 32,000 and 48,000 iterations; training is terminated at 64,000 iterations. For our SimAM, the hyper-parameter λ in Eqn (5) is set to 0.0001, searched using ResNet-20 on a 45k/5k train/val split. |
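The pseudocode row above refers to the paper's Figure 3, a PyTorch-like implementation of SimAM. As a rough illustration of that computation, here is a minimal NumPy sketch (the function name `simam` and the NumPy port are my own; λ defaults to the paper's searched value of 0.0001):

```python
import numpy as np

def simam(x, lam=1e-4):
    """Parameter-free SimAM attention, sketched after the paper's
    Figure 3 pseudocode. x: feature map of shape (N, C, H, W)."""
    n = x.shape[2] * x.shape[3] - 1
    # squared deviation of each activation from its per-channel spatial mean
    mu = x.mean(axis=(2, 3), keepdims=True)
    d = (x - mu) ** 2
    # per-channel variance estimate (dividing by n = H*W - 1)
    v = d.sum(axis=(2, 3), keepdims=True) / n
    # inverse of the minimal energy (Eqn 5); larger means a more
    # distinctive neuron, hence a larger attention weight
    e_inv = d / (4.0 * (v + lam)) + 0.5
    # sigmoid gating refines the input features in place, no learnable weights
    return x * (1.0 / (1.0 + np.exp(-e_inv)))
```

The module keeps the input shape and introduces no learnable parameters, consistent with the "parameter-free" claim in the title; only λ is a hyper-parameter.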