Learning to Parameterize Visual Attributes for Open-set Fine-grained Retrieval

Authors: Shijie Wang, Jianlong Chang, Haojie Li, Zhihui Wang, Wanli Ouyang, Qi Tian

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on open-set fine-grained retrieval datasets validate the superior performance of our VAPNet over existing solutions. Extensive experiments show that open-set fine-grained retrieval task can benefit from the proposed method, and thus our VAPNet obtains significant gains of 8.6% average accuracy over recent state-of-the-art work [33] on three open-set fine-grained retrieval benchmarks.
Researcher Affiliation | Collaboration | (1) International School of Information Science & Engineering, Dalian University of Technology, China; (2) Huawei Cloud & AI, China; (3) College of Computer and Engineering, Shandong University of Science and Technology, China; (4) Shanghai Artificial Intelligence Laboratory, China
Pseudocode | No | The paper describes its method using text and figures but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include an unambiguous statement about releasing source code or provide a link to a code repository.
Open Datasets | Yes | Datasets. CUB-200-2011 dataset [5] contains 200 bird subcategories with 11,788 images. The Stanford Cars dataset [16] contains 196 car models of 16,185 images. FGVC Aircraft dataset [23] is divided into first 50 classes (5,000 images) for training and the rest 50 classes (5,000 images) for testing. In Shop Clothes Retrieval (In-Shop) [21] contains 7,982 subcategories with 52,712 images...
Dataset Splits | Yes | CUB-200-2011 dataset [5] contains 200 bird subcategories with 11,788 images. We utilize the first 100 classes (5,864 images) in training and the rest (5,924 images) in testing. The Stanford Cars dataset [16] ... is also similar to CUB, which is split into the first 98 classes (8,054 images) for training and the remaining classes (8,131 images) for testing. FGVC Aircraft dataset [23] is divided into first 50 classes (5,000 images) for training and the rest 50 classes (5,000 images) for testing.
Hardware Specification | Yes | Our model is trained end-to-end on one NVIDIA 2080Ti GPUs for acceleration.
Software Dependencies | No | The paper mentions using a ResNet-50 backbone and the SGD optimizer, but does not specify software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9, and CUDA 11.1').
Experiment Setup | Yes | We train our models using Stochastic Gradient Descent (SGD) optimizer with weight decay of 0.0001, momentum of 0.9, and batch size of 32. The initial learning rate is set to 10^-5, with exponential decay of 0.9 after every 5 epochs. The total number of training epochs is set to 200.
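
The learning-rate schedule quoted in the Experiment Setup row can be sketched in plain Python. This is a minimal sketch, assuming that "exponential decay of 0.9 after every 5 epochs" means the rate is multiplied by 0.9 once for every 5 completed epochs (a step-wise schedule). The paper releases no code, so the function name `learning_rate` and the constant names are hypothetical.

```python
# Hyperparameters as quoted in the Experiment Setup row.
INITIAL_LR = 1e-5      # initial learning rate (10^-5)
DECAY_FACTOR = 0.9     # multiplicative decay
DECAY_EVERY = 5        # epochs between decays
TOTAL_EPOCHS = 200     # total training epochs

# Quoted for completeness; not used by the schedule itself.
WEIGHT_DECAY = 1e-4
MOMENTUM = 0.9
BATCH_SIZE = 32


def learning_rate(epoch: int) -> float:
    """Learning rate for a 0-indexed epoch under the ASSUMED step-wise decay."""
    if not 0 <= epoch < TOTAL_EPOCHS:
        raise ValueError(f"epoch must be in [0, {TOTAL_EPOCHS}), got {epoch}")
    # One decay step is applied for every DECAY_EVERY completed epochs.
    return INITIAL_LR * DECAY_FACTOR ** (epoch // DECAY_EVERY)


if __name__ == "__main__":
    for epoch in (0, 4, 5, 10, TOTAL_EPOCHS - 1):
        print(f"epoch {epoch:3d}: lr = {learning_rate(epoch):.3e}")
```

Under this reading, the final epoch has seen 39 decay steps. In PyTorch, the same assumed schedule would correspond to `torch.optim.lr_scheduler.StepLR` with `step_size=5` and `gamma=0.9`, though the summary above notes that the paper does not name its software stack.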
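
The open-set protocol in the Dataset Splits row keeps training and test classes disjoint: the first block of class IDs is used for training and the remaining classes for testing. The following is a minimal sketch of that split, assuming 1-indexed integer class IDs and samples stored as `(image_path, class_id)` pairs. The helper `split_by_class`, the `TRAIN_CLASSES` table, and the toy file names are hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Sample = Tuple[str, int]  # (image_path, class_id), class IDs ASSUMED 1-indexed

# Number of training classes per dataset, as quoted in the Dataset Splits row.
TRAIN_CLASSES = {
    "CUB-200-2011": 100,
    "Stanford Cars": 98,
    "FGVC Aircraft": 50,
}


def split_by_class(
    samples: List[Sample], num_train_classes: int
) -> Tuple[List[Sample], List[Sample]]:
    """Split samples so that train and test share no class IDs."""
    train = [s for s in samples if s[1] <= num_train_classes]
    test = [s for s in samples if s[1] > num_train_classes]
    return train, test


# Toy data standing in for a real annotation file.
toy = [("a.jpg", 1), ("b.jpg", 100), ("c.jpg", 101), ("d.jpg", 200)]
train, test = split_by_class(toy, TRAIN_CLASSES["CUB-200-2011"])

train_ids = {cid for _, cid in train}
test_ids = {cid for _, cid in test}
```

Because the test classes never appear in training, retrieval is evaluated on categories the model has not seen. The quoted image counts are internally consistent: 5,864 + 5,924 = 11,788 for CUB-200-2011 and 8,054 + 8,131 = 16,185 for Stanford Cars.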