Position-Aware Recalibration Module: Learning From Feature Semantics and Feature Position

Authors: Xu Ma, Song Fu

IJCAI 2020

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results on ImageNet and MS COCO benchmarks show that our approach surpasses related methods by a clear margin with less computational overhead. For example, we improve the ResNet50 by an absolute 1.75% (77.65% vs. 75.90%) on the ImageNet 2012 validation dataset, and by 1.5%-1.9% mAP on the MS COCO validation dataset with almost no computational overhead. We evaluate our module on multiple vision tasks including image recognition and object detection. Compared with plug-in modules in [Hu et al., 2018b; Hu et al., 2018a; Woo et al., 2018; Cao et al., 2019; Li et al., 2019a; Li et al., 2019b], we achieve results either on par or better with fewer parameters and FLOPs. Meanwhile, when applying our method to object detection, our PRM consistently achieves at least a 1.5%-1.9% absolute mAP improvement on the MS COCO benchmark. Comprehensive ablation studies are conducted to provide an inherent insight into our method.
Researcher Affiliation Academia Xu Ma, Song Fu, Department of Computer Science and Engineering, University of North Texas, Denton, Texas 76203, USA. xuma@my.unt.edu, Song.Fu@unt.edu
Pseudocode No The paper describes its method using mathematical formulations and a diagram (Figure 1) but does not include a dedicated pseudocode block or algorithm steps.
Open Source Code Yes Codes are made publicly available at https://github.com/13952522076/PRM
Open Datasets Yes We first evaluate our module for the image recognition task on the ImageNet 2012 classification dataset [Russakovsky et al., 2015]. We further investigate PRM for object detection on the MS COCO benchmark.
Dataset Splits Yes We train all models on the training set using the PyTorch [Paszke et al., 2019] framework, and report the top-1 and top-5 classification accuracy on the validation set. ... For example, we improve the ResNet50 by an absolute 1.75% (77.65% vs. 75.90%) on the ImageNet 2012 validation dataset, and by 1.5%-1.9% mAP on the MS COCO validation dataset with almost no computational overhead.
Hardware Specification Yes All models are conducted on a server with 8 Tesla V100 GPUs, and each GPU has 32 images in a mini-batch (256 in total).
Software Dependencies No The paper mentions using the "PyTorch [Paszke et al., 2019] framework" but does not specify its version or any other software dependencies with version numbers.
Experiment Setup Yes For training, we follow the standard practice that randomly crops the images to a spatial resolution of 224x224 and horizontally flips images with a probability of 50%. The input images are normalized by mean channel subtraction for both training and testing. We train all models from scratch using synchronous SGD with momentum 0.9 and weight decay 0.0001 for 100 epochs. All models are trained on a server with 8 Tesla V100 GPUs, and each GPU has 32 images in a mini-batch (256 in total). The learning rate is initialized to 0.1 and divided by a factor of 10 every 30 epochs.
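The learning-rate schedule quoted above (initial rate 0.1, divided by 10 every 30 epochs over 100 epochs) can be sketched as a small helper for reference. This is our illustration, not code from the paper; the function name `lr_at_epoch` is hypothetical.

```python
# Sketch of the step learning-rate schedule described in the experiment setup:
# base LR 0.1, multiplied by 0.1 at epochs 30, 60, and 90 (100 epochs total).
def lr_at_epoch(epoch: int, base_lr: float = 0.1,
                step: int = 30, gamma: float = 0.1) -> float:
    """Return the learning rate used during the given (0-indexed) epoch."""
    return base_lr * gamma ** (epoch // step)

if __name__ == "__main__":
    # The rate stays constant within each 30-epoch block.
    for e in (0, 29, 30, 60, 90, 99):
        print(f"epoch {e}: lr = {lr_at_epoch(e):.4g}")
```

In PyTorch terms, this corresponds to pairing `torch.optim.SGD(momentum=0.9, weight_decay=1e-4)` with `torch.optim.lr_scheduler.StepLR(step_size=30, gamma=0.1)`, though the paper does not state which scheduler implementation was used.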