Dynamic Position-aware Network for Fine-grained Image Recognition
Authors: Shijie Wang, Haojie Li, Zhihui Wang, Wanli Ouyang
AAAI 2021, pp. 2791-2799
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments verify that DP-Net yields the best performance under the same settings with most competitive approaches, on CUB Bird, Stanford-Cars, and FGVC Aircraft datasets. |
| Researcher Affiliation | Collaboration | Shijie Wang 1,2, Haojie Li 1,2, Zhihui Wang 1,2, Wanli Ouyang 3 — 1 International School of Information Science & Engineering, Dalian University of Technology, China; 2 Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, China; 3 The University of Sydney, SenseTime Computer Vision Research Group, Australia |
| Pseudocode | No | The paper includes mathematical formulations and descriptions of modules but does not present any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific links or explicit statements about the availability of open-source code for the described methodology. |
| Open Datasets | Yes | We comprehensively evaluate our algorithm on Caltech UCSD Birds (Branson et al. 2014) (CUB-200-2011), Stanford Cars (Krause et al. 2013) (Cars) and FGVC Aircraft (Airs) (Maji et al. 2013) datasets, which are widely used benchmarks for fine-grained image recognition. |
| Dataset Splits | Yes | The CUB200-2011 dataset contains 11,788 images spanning 200 subspecies. The ratio of train data and test data is roughly 1:1. The Cars dataset has 16,185 images from 196 classes officially split into 8,144 training and 8,041 test images. The Airs dataset contains 10,000 images over 100 classes, and the train and test sets split ratio is around 2 : 1. |
| Hardware Specification | No | The paper does not explicitly specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions "ResNet50 as feature extractor" and "Momentum SGD" but does not provide specific version numbers for these or other software libraries/frameworks. |
| Experiment Setup | Yes | In all our experiments, all images are resized to 448×448, and we crop and resize the patches to 224×224 from the original image. We use the fully-convolutional network ResNet50 as feature extractor and apply Batch Normalization as regularizer. We also use Momentum SGD with initial learning rate 0.001, multiplied by 0.1 after 60 epochs, and weight decay 1e-4. To reduce patch redundancy, we adopt non-maximum suppression (NMS) on default patches based on their discriminative scores, and the NMS threshold is set to 0.25. |
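The NMS step described in the setup row can be sketched as follows. This is not the authors' code (none is released); it is a minimal greedy NMS over candidate patches, assuming patches are axis-aligned `[x1, y1, x2, y2]` boxes ranked by their discriminative scores, with the paper's IoU threshold of 0.25.

```python
import numpy as np

def nms_patches(boxes, scores, iou_thresh=0.25):
    """Greedy non-maximum suppression over candidate patches.

    boxes: (N, 4) array of [x1, y1, x2, y2] patch coordinates.
    scores: (N,) discriminative scores for the patches.
    Returns indices of the kept patches, highest score first.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # process patches by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the top-scoring patch with the remaining ones.
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Suppress patches that overlap the kept patch above the threshold.
        order = order[1:][iou <= iou_thresh]
    return keep
```

With the low 0.25 threshold the paper uses, even moderately overlapping patches are suppressed, which keeps the selected patches spatially diverse.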