A²-Net: Learning Attribute-Aware Hash Codes for Large-Scale Fine-Grained Image Retrieval

Authors: Xiu-Shen Wei, Yang Shen, Xuhao Sun, Han-Jia Ye, Jian Yang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Quantitative experiments on five benchmark fine-grained datasets show our superiority over competing methods. More importantly, qualitative results demonstrate the obtained hash codes can strongly correspond to certain kinds of crucial properties of fine-grained objects.
Researcher Affiliation | Academia | ¹Nanjing University of Science and Technology; ²State Key Lab. for Novel Software Technology, Nanjing University
Pseudocode | No | The paper describes the methodology in prose and mathematical equations, but it does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include any explicit statement about providing open-source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | By following ExchNet [8], our experiments are conducted on five fine-grained benchmark datasets, i.e., CUB200-2011 [37], Aircraft [28], Food101 [2], NABirds [35] and VegFru [14]. CUB200-2011 ... is officially split into 5,994 images for training and 5,794 images for test.
Dataset Splits | Yes | Aircraft contains 10,000 images spanning 100 aircraft models, with 3,334 for training, 3,333 for validation and 3,333 for test. Among the large-scale datasets, Food101 contains 101 kinds of food with 101,000 images; for each class, the 250 test images are checked manually for correctness, while the 750 training images still contain a certain amount of noise. NABirds is a high-quality dataset of 48,562 images of North American birds across 555 sub-categories, with 23,929 for training and 24,633 for test. VegFru is another large-scale fine-grained dataset covering 200 kinds of vegetables and 92 kinds of fruits, with 29,200 for training, 14,600 for validation and 116,931 for test.
Hardware Specification | Yes | All experiments are conducted with a GeForce RTX 2080 Ti GPU.
Software Dependencies | No | The paper mentions using ResNet-50 as the backbone model and mini-batch stochastic gradient descent for optimization, but it does not specify any software libraries or their version numbers (e.g., PyTorch, TensorFlow, scikit-learn versions).
Experiment Setup | Yes | The total number of training epochs is 20, and the batch size is set to 16. Specifically, for datasets containing fewer than 20,000 training images, the iteration count T_max is 60, and the learning rate is divided by 10 at the 50th iteration. For the other datasets, T_max is set to 70, and the learning rate is divided by 10 at the 60th iteration. The hyper-parameters, i.e., λ, α and β in Eq. (13), are set as 1, 1/(nk) and 1/(2k), respectively. The optimizer is standard mini-batch stochastic gradient descent with a weight decay of 10⁻⁴.
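
For quick reference, the split sizes quoted in the Dataset Splits row above can be collected into a small lookup table. This is a convenience sketch rather than anything provided by the paper: the Food101 counts are multiplied out from the quoted 750 training / 250 test images per class over 101 classes, and no validation split is reported for CUB200-2011, Food101 or NABirds.

```python
# Split sizes transcribed from the quoted dataset descriptions
# (train / val / test; None where no validation split is reported).
DATASET_SPLITS = {
    "CUB200-2011": {"train": 5994, "val": None, "test": 5794},
    "Aircraft": {"train": 3334, "val": 3333, "test": 3333},
    "Food101": {"train": 750 * 101, "val": None, "test": 250 * 101},  # per-class counts x 101 classes
    "NABirds": {"train": 23929, "val": None, "test": 24633},
    "VegFru": {"train": 29200, "val": 14600, "test": 116931},
}
```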
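The Experiment Setup row also contains enough detail for a minimal PyTorch sketch of the optimization configuration. Only the batch size, epoch count, weight decay, divide-by-10 learning-rate schedule and the Eq. (13) weights (λ = 1, α = 1/(nk), β = 1/(2k)) come from the quoted text; the model head, momentum and initial learning rate are not stated in the excerpt and appear here as labeled placeholders.

```python
import torch

def eq13_weights(n: int, k: int) -> dict:
    """Trade-off weights of Eq. (13) as quoted: lambda = 1, alpha = 1/(nk), beta = 1/(2k)."""
    return {"lambda": 1.0, "alpha": 1.0 / (n * k), "beta": 1.0 / (2 * k)}

# Placeholder head standing in for the ResNet-50-based A^2-Net
# (the architecture itself is not described in this excerpt).
model = torch.nn.Linear(2048, 48)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.01,            # assumed; the initial learning rate is not stated in the excerpt
    momentum=0.9,       # assumed; "standard mini-batch stochastic gradient descent"
    weight_decay=1e-4,  # quoted: weight decay of 10^-4
)

# Quoted schedule: for datasets with fewer than 20,000 training images,
# T_max = 60 and the LR is divided by 10 at the 50th iteration; otherwise
# T_max = 70 with the drop at the 60th. Shown here for the small-dataset case.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50], gamma=0.1)

BATCH_SIZE = 16  # quoted
EPOCHS = 20      # quoted
```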