Learning to Parameterize Visual Attributes for Open-set Fine-grained Retrieval
Authors: Shijie Wang, Jianlong Chang, Haojie Li, Zhihui Wang, Wanli Ouyang, Qi Tian
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on open-set fine-grained retrieval datasets validate the superior performance of our VAPNet over existing solutions. Extensive experiments show that open-set fine-grained retrieval task can benefit from the proposed method, and thus our VAPNet obtains significant gains of 8.6% average accuracy over recent state-of-the-art work [33] on three open-set fine-grained retrieval benchmarks. |
| Researcher Affiliation | Collaboration | 1 International School of Information Science & Engineering, Dalian University of Technology, China; 2 Huawei Cloud & AI, China; 3 College of Computer and Engineering, Shandong University of Science and Technology, China; 4 Shanghai Artificial Intelligence Laboratory, China |
| Pseudocode | No | The paper describes its method using text and figures but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include an unambiguous statement about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | Datasets. CUB-200-2011 dataset [5] contains 200 bird subcategories with 11,788 images. The Stanford Cars dataset [16] contains 196 car models of 16,185 images. FGVC Aircraft dataset [23] is divided into first 50 classes (5,000 images) for training and the rest 50 classes (5,000 images) for testing. In Shop Clothes Retrieval (In-Shop) [21] contains 7,982 subcategories with 52,712 images... |
| Dataset Splits | Yes | CUB-200-2011 dataset [5] contains 200 bird subcategories with 11,788 images. We utilize the first 100 classes (5,864 images) in training and the rest (5,924 images) in testing. The Stanford Cars dataset [16] ... is also similar to CUB, which is split into the first 98 classes (8,054 images) for training and the remaining classes (8,131 images) for testing. FGVC Aircraft dataset [23] is divided into first 50 classes (5,000 images) for training and the rest 50 classes (5,000 images) for testing. |
| Hardware Specification | Yes | Our model is trained end-to-end on one NVIDIA 2080Ti GPU for acceleration. |
| Software Dependencies | No | The paper mentions using a Resnet-50 backbone and SGD optimizer, but does not specify software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9, and CUDA 11.1'). |
| Experiment Setup | Yes | We train our models using Stochastic Gradient Descent (SGD) optimizer with weight decay of 0.0001, momentum of 0.9, and batch size of 32. The initial learning rate is set to 10^-5, with exponential decay of 0.9 after every 5 epochs. The total number of training epochs is set to 200. |
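
The split and schedule quoted in the Dataset Splits and Experiment Setup rows can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function names `learning_rate` and `split_by_class` are hypothetical, class ids are assumed to be 0-indexed integers, and "exponential decay of 0.9 after every 5 epochs" is read here as multiplying the learning rate by 0.9 once per 5 completed epochs. The paper may intend a different schedule.

```python
# Hyperparameters as quoted in the Experiment Setup row.
BASE_LR = 1e-5        # initial learning rate
DECAY_FACTOR = 0.9    # decay applied at each step
DECAY_EVERY = 5       # epochs between decay steps
TOTAL_EPOCHS = 200    # total number of training epochs


def learning_rate(epoch: int) -> float:
    """Learning rate at a 0-indexed epoch.

    Assumption: step-wise decay, i.e. the rate is multiplied by
    DECAY_FACTOR once for every DECAY_EVERY completed epochs.
    """
    return BASE_LR * DECAY_FACTOR ** (epoch // DECAY_EVERY)


def split_by_class(labels, num_train_classes):
    """Class-disjoint (open-set) split of sample indices.

    Assumption: labels are 0-indexed integer class ids. Samples from the
    first `num_train_classes` classes go to training, the rest to testing,
    so no test class is seen during training.
    """
    train = [i for i, y in enumerate(labels) if y < num_train_classes]
    test = [i for i, y in enumerate(labels) if y >= num_train_classes]
    return train, test


if __name__ == "__main__":
    # Print the learning rate at a few epochs across the schedule.
    for epoch in (0, 5, 50, TOTAL_EPOCHS - 1):
        print(epoch, learning_rate(epoch))

    # Toy labels: with 2 training classes, classes 2 and 3 are held out.
    print(split_by_class([0, 1, 2, 3, 0, 3], num_train_classes=2))
```

Under this reading, the schedule matches what a step scheduler with a step size of 5 and a multiplicative factor of 0.9 would produce. For CUB-200-2011 the split would use `num_train_classes=100`, for Stanford Cars 98, and for FGVC Aircraft 50, following the class counts quoted in the table.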