Channel Interaction Networks for Fine-Grained Image Categorization
Authors: Yu Gao, Xintong Han, Xun Wang, Weilin Huang, Matthew R. Scott
AAAI 2020, pp. 10818-10825 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, comprehensive experiments are conducted on three publicly available benchmarks, where the proposed method consistently outperforms the state-of-the-art approaches, such as DFL-CNN (Wang, Morariu, and Davis 2018) and NTS (Yang et al. 2018). |
| Researcher Affiliation | Industry | Yu Gao, Xintong Han, Xun Wang, Weilin Huang, Matthew R. Scott. Malong Technologies, Shenzhen, China; Shenzhen Malong Artificial Intelligence Research Center, Shenzhen, China. {chrgao, xinhan, xunwang, whuang, mscott}@malong.com |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the methodology is openly available. |
| Open Datasets | Yes | We employ three publicly available datasets in our experiments: (1) CUB-200-2011 (Wah et al. 2011) with 11,788 images from 200 wild bird species, (2) Stanford Cars (Krause et al. 2013) including 16,185 images over 196 classes, and (3) FGVC Aircraft (Maji et al. 2013) containing about 10,000 images over 100 classes. |
| Dataset Splits | No | The paper uses publicly available datasets (CUB-200-2011, Stanford Cars, FGVC Aircraft) but does not explicitly provide the training/validation/test splits (e.g., percentages or exact counts for each split). |
| Hardware Specification | Yes | We report our inference time on an Nvidia TITAN XP GPU with a PyTorch implementation. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version number or any other software dependencies with version details. |
| Experiment Setup | Yes | The input image size is 448×448, as in most state-of-the-art fine-grained categorization approaches. Following NTS (Yang et al. 2018), we implement data augmentation including random cropping and horizontal flipping during training. Only center cropping is involved in inference. The model is trained for 100 epochs with SGD for all datasets, and the base learning rate is set to 0.001, which is annealed by 0.5 every 20 epochs. We use a batch size of 20 and ensure that each batch contains 4 categories with 5 images in each category. We then randomly split these 20 images into 10 image pairs. We tried using all the O(n^2) pairs or applying hard negative mining, which hurt the performance and consumed more memory. The weight decay is set to 2×10^-4. β in Equation 8 is set to 0.5 empirically, and α in Equation 9 is set to 2.0. Top-1 accuracy is used as the evaluation metric. We use PyTorch to implement our method. (A hedged sketch of this training configuration follows the table.) |
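The quoted setup can be approximated in PyTorch as below. This is a minimal sketch, assuming a standard torchvision backbone stands in for the paper's CIN model (no official code is released); the resize size, momentum value, and the names `FourByFiveSampler` and `labels` are illustrative assumptions, not the authors' choices.

```python
# Sketch of the reported training configuration: 448x448 inputs, random crop +
# horizontal flip, batches of 4 categories x 5 images, SGD with lr 0.001 halved
# every 20 epochs, weight decay 2e-4. Backbone and some values are assumptions.
import random
from collections import defaultdict

import torch
import torchvision
import torchvision.transforms as T
from torch.utils.data import Sampler

# Training-time augmentation (random crop + flip); inference uses center crop only.
train_transform = T.Compose([
    T.Resize((512, 512)),        # pre-crop resize size is an assumption
    T.RandomCrop(448),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])
test_transform = T.Compose([
    T.Resize((512, 512)),        # assumption
    T.CenterCrop(448),
    T.ToTensor(),
])


class FourByFiveSampler(Sampler):
    """Yields batches of 20 indices: 4 randomly chosen classes, 5 images each."""

    def __init__(self, labels, classes_per_batch=4, images_per_class=5):
        self.by_class = defaultdict(list)
        for idx, y in enumerate(labels):
            self.by_class[y].append(idx)
        self.classes = list(self.by_class)
        self.k = classes_per_batch
        self.m = images_per_class
        self.num_batches = len(labels) // (self.k * self.m)

    def __iter__(self):
        for _ in range(self.num_batches):
            batch = []
            for c in random.sample(self.classes, self.k):
                batch.extend(random.sample(self.by_class[c], self.m))
            random.shuffle(batch)   # the 20 images are later split into 10 pairs
            yield batch

    def __len__(self):
        return self.num_batches


# Stand-in backbone; the paper's channel-interaction modules are not shown here.
model = torchvision.models.resnet101(num_classes=200)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.9,        # momentum is an assumption
                            weight_decay=2e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)
```

A sampler like this would be plugged in via `DataLoader(dataset, batch_sampler=FourByFiveSampler(labels))`, with the 20-image batch split into 10 random pairs inside the loss computation, mirroring the pairing scheme described in the quote.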