Information Bottleneck Approach to Spatial Attention Learning

Authors: Qiuxia Lai, Yu Li, Ailing Zeng, Minhao Liu, Hanqiu Sun, Qiang Xu

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that the proposed IB-inspired spatial attention mechanism can yield attention maps that neatly highlight the regions of interest while suppressing backgrounds, and bootstrap standard DNN structures for visual recognition tasks (e.g., image classification, fine-grained recognition, cross-domain classification). The attention maps are interpretable for the decision making of the DNNs as verified in the experiments.
Researcher Affiliation | Academia | Qiuxia Lai (1), Yu Li (1), Ailing Zeng (1), Minhao Liu (1), Hanqiu Sun (2), and Qiang Xu (1); (1) The Chinese University of Hong Kong, (2) University of Electronic Science and Technology of China
Pseudocode | No | The paper includes figures illustrating the framework (Fig. 1, Fig. 2) and mathematical equations, but no explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at this https URL.
Open Datasets | Yes | CIFAR-10 [Krizhevsky et al., 2009] contains 60,000 32×32 natural images of 10 classes, which are split into 50,000 training and 10,000 test images. CIFAR-100 [Krizhevsky et al., 2009] is similar to CIFAR-10, except that it has 100 classes. CUB-200-2011 (CUB) [Wah et al., 2011] contains 5,994 training and 5,794 testing bird images from 200 classes. SVHN collects 73,257 training, 26,032 testing, and 531,131 extra digit images from house numbers in street-view images. STL-10 contains 5,000 training and 8,000 test images of resolution 96×96 organized into 10 classes.
Dataset Splits | Yes | CIFAR-10 [Krizhevsky et al., 2009] contains 60,000 32×32 natural images of 10 classes, which are split into 50,000 training and 10,000 test images. CIFAR-100 [Krizhevsky et al., 2009] is similar to CIFAR-10, except that it has 100 classes. (A minimal loading sketch for these splits follows the table.)
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., Python, PyTorch versions).
Experiment Setup | Yes | We set β = 0.01, λ_g = 0.4, and λ_c = 0.1 empirically. We experiment on K = 64, 128, 256, 512, 1024. As shown in Fig. 4 (b), K = 256 achieves the best performance. Fig. 4 (c) shows the classification accuracy when varying the number of anchor values Q, where Q between 20 and 50 gives better performance. We use original input images after data augmentation (random flipping and cropping with a padding of 4 pixels). (A configuration sketch collecting these values follows the table.)
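
The Open Datasets and Dataset Splits rows quote the standard CIFAR-10 split of 50,000 training and 10,000 test images, and the Experiment Setup row quotes the augmentation (random flipping, cropping with 4-pixel padding). The sketch below shows one way to load and verify that split; torchvision is an assumption here, since the paper does not state its data-loading stack.

```python
# Sketch only: loads CIFAR-10 with the standard 50,000/10,000 split quoted
# in the Dataset Splits row. torchvision is an assumed dependency, not one
# named by the paper.
import torchvision
import torchvision.transforms as T

# Augmentation as quoted in the Experiment Setup row:
# random flipping and cropping with a padding of 4 pixels.
train_tf = T.Compose([
    T.RandomCrop(32, padding=4),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])
test_tf = T.ToTensor()

train_set = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=train_tf)
test_set = torchvision.datasets.CIFAR10("./data", train=False, download=True, transform=test_tf)

# The standard torchvision split matches the counts quoted from the paper.
assert len(train_set) == 50_000
assert len(test_set) == 10_000
```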
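
The Experiment Setup row lists the reported hyperparameters. Below is a minimal sketch that gathers them into one configuration object; the class and field names are hypothetical, and each value carries only the meaning stated in the quoted text.

```python
from dataclasses import dataclass

# Hypothetical configuration container; field names are illustrative and
# do not come from the authors' released code.
@dataclass
class IBAttentionSetup:
    beta: float = 0.01        # β, set empirically per the paper
    lambda_g: float = 0.4     # λ_g, set empirically
    lambda_c: float = 0.1     # λ_c, set empirically
    K: int = 256              # swept over {64, 128, 256, 512, 1024}; 256 reported best (Fig. 4 (b))
    Q: int = 32               # number of anchor values; 20-50 reported to work best (Fig. 4 (c))
    crop_padding: int = 4     # random cropping with 4-pixel padding
    random_flip: bool = True  # random horizontal flipping

cfg = IBAttentionSetup()
print(cfg)
```

Note that Q = 32 is simply one value inside the reported 20-50 range, not a number the paper singles out.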