Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Learning to Parameterize Visual Attributes for Open-set Fine-grained Retrieval

Authors: Shijie Wang, Jianlong Chang, Haojie Li, Zhihui Wang, Wanli Ouyang, Qi Tian

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on open-set fine-grained retrieval datasets validate the superior performance of our VAPNet over existing solutions. Extensive experiments show that open-set fine-grained retrieval task can benefit from the proposed method, and thus our VAPNet obtains significant gains of 8.6% average accuracy over recent state-of-the-art work [33] on three open-set fine-grained retrieval benchmarks.
Researcher Affiliation	Collaboration	1 International School of Information Science & Engineering, Dalian University of Technology, China; 2 Huawei Cloud & AI, China; 3 College of Computer and Engineering, Shandong University of Science and Technology, China; 4 Shanghai Artificial Intelligence Laboratory, China
Pseudocode	No	The paper describes its method using text and figures but does not include any structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not include an unambiguous statement about releasing source code or provide a link to a code repository.
Open Datasets	Yes	Datasets. CUB-200-2011 dataset [5] contains 200 bird subcategories with 11,788 images. The Stanford Cars dataset [16] contains 196 car models of 16,185 images. FGVC Aircraft dataset [23] is divided into first 50 classes (5,000 images) for training and the rest 50 classes (5,000 images) for testing. In Shop Clothes Retrieval (In-Shop) [21] contains 7,982 subcategories with 52,712 images...
Dataset Splits	Yes	CUB-200-2011 dataset [5] contains 200 bird subcategories with 11,788 images. We utilize the first 100 classes (5,864 images) in training and the rest (5,924 images) in testing. The Stanford Cars dataset [16] ... is also similar to CUB, which is split into the first 98 classes (8,054 images) for training and the remaining classes (8,131 images) for testing. FGVC Aircraft dataset [23] is divided into first 50 classes (5,000 images) for training and the rest 50 classes (5,000 images) for testing.
Hardware Specification	Yes	Our model is trained end-to-end on one NVIDIA 2080Ti GPUs for acceleration.
Software Dependencies	No	The paper mentions using a Resnet-50 backbone and SGD optimizer, but does not specify software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9, and CUDA 11.1').
Experiment Setup	Yes	We train our models using Stochastic Gradient Descent (SGD) optimizer with weight decay of 0.0001, momentum of 0.9, and batch size of 32. The initial learning rate is set to 10^-5, with exponential decay of 0.9 after every 5 epochs. The total number of training epochs is set to 200.
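
The learning-rate schedule quoted in the Experiment Setup row can be made concrete with a short calculation. The following is a minimal sketch, not the authors' code: the hyperparameter values come from the quoted text, while the names `TrainConfig` and `learning_rate`, the 0-indexed epoch convention, and the reading of "exponential decay of 0.9 after every 5 epochs" as a step-wise multiplication are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    weight_decay: float = 1e-4
    momentum: float = 0.9
    batch_size: int = 32
    initial_lr: float = 1e-5
    decay_factor: float = 0.9
    decay_every: int = 5       # epochs between decay steps
    total_epochs: int = 200


def learning_rate(epoch: int, cfg: TrainConfig = TrainConfig()) -> float:
    """Return the learning rate for a 0-indexed epoch.

    Assumption: the rate is multiplied by decay_factor once per completed
    block of decay_every epochs, so the first decay takes effect at epoch 5.
    """
    if not 0 <= epoch < cfg.total_epochs:
        raise ValueError(f"epoch must be in [0, {cfg.total_epochs}), got {epoch}")
    return cfg.initial_lr * cfg.decay_factor ** (epoch // cfg.decay_every)


if __name__ == "__main__":
    # Print the rate at a few representative epochs.
    for epoch in (0, 4, 5, 100, 199):
        print(f"epoch {epoch:3d}: lr = {learning_rate(epoch):.3e}")
```

Under this reading the rate is decayed 39 times over the 200 epochs. Because the Software Dependencies row reports that no framework or version is specified, the sketch deliberately avoids any particular deep learning library.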