Recurrent Attentional Reinforcement Learning for Multi-Label Image Recognition

Authors: Tianshui Chen, Zhouxia Wang, Guanbin Li, Liang Lin

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments and comparisons on two large-scale benchmarks (i.e., PASCAL VOC and MSCOCO) show that our model achieves superior performance over existing state-of-the-art methods in both performance and efficiency as well as explicitly identifying image-level semantic labels to specific object regions."
Researcher Affiliation | Collaboration | Tianshui Chen (1), Zhouxia Wang (1,2), Guanbin Li (1), Liang Lin (1,2); 1: School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China; 2: SenseTime Group Limited
Pseudocode | No | The paper describes the algorithms and processes using mathematical formulas and text, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to a code repository.
Open Datasets | Yes | "Extensive experiments and comparisons on two large-scale benchmarks (i.e., PASCAL VOC and MSCOCO) show that our model achieves superior performance..." The benchmarks are Pascal VOC 2007 (VOC07) (Everingham et al. 2010) and Microsoft COCO (MS-COCO) (Lin et al. 2014).
Dataset Splits | Yes | "The VOC07 dataset contains 9,963 images of 20 object categories, and it is divided into trainval and test sets... The MS-COCO dataset is originally built for object detection and has also been used for multi-label recognition recently. It is a larger and more challenging dataset, which comprises a training set of 82,081 images and a validation set of 40,137 images from 80 object categories."
Hardware Specification | Yes | "We test our model on a desktop with a single NVIDIA GeForce GTX TITAN-X GPU."
Software Dependencies | No | The paper mentions using the VGG16 ConvNet and the Adam solver, but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "During training, all the images are resized to N × N, and randomly cropped with a size of (N − 64) × (N − 64), followed by a random horizontal flipping, for data augmentation. In our experiments, we train two models with N = 512 and N = 640, respectively. For the anchor strategy, we set 3 region scales with area 80 × 80, 160 × 160, 320 × 320 for N = 512 and 100 × 100, 200 × 200, 400 × 400 for N = 640, and 3 aspect ratios of 2:1, 1:1, 1:2 for both scales. Thus, k is set as 9. Both of the models are optimized using the Adam solver with a batch size of 16, an initial learning rate of 0.00001, momentums of 0.9 and 0.999."
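The anchor strategy quoted above (3 area scales crossed with 3 aspect ratios, giving k = 9) can be sketched as follows. This is a minimal illustration of the arithmetic only, not the authors' code; the function name and interface are assumptions.

```python
import math

def make_anchors(scales, ratios):
    """Return (width, height) pairs for every combination of scale and
    aspect ratio. Each scale s denotes the side of a square anchor, so
    the target area is s * s; each ratio r is width / height, and the
    returned box preserves the target area at that ratio."""
    anchors = []
    for s in scales:
        area = s * s
        for r in ratios:
            h = math.sqrt(area / r)  # solve w * h = area with w = r * h
            w = r * h
            anchors.append((w, h))
    return anchors

# Scales for the N = 512 model per the paper: areas 80x80, 160x160, 320x320,
# with aspect ratios 2:1, 1:1, 1:2.
anchors_512 = make_anchors([80, 160, 320], [2.0, 1.0, 0.5])
print(len(anchors_512))  # 9, matching k = 9 in the paper
```

For the N = 640 model the same call would use scales [100, 200, 400]. Note that each anchor keeps the area of its scale exactly, so the 2:1 anchor at scale 80 has width ≈ 113 and height ≈ 57.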