Learning Deep Bilinear Transformation for Fine-grained Image Representation

Authors: Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We found that the proposed network achieves new state-of-the-art in several fine-grained image recognition benchmarks, including CUB-Bird, Stanford-Car, and FGVC-Aircraft." "We conduct extensive experiments to demonstrate the effectiveness of DBTNet, which can achieve new state-of-the-arts on three challenging fine-grained datasets, i.e., CUB-Bird, Stanford-Car, and FGVC-Aircraft."
Researcher Affiliation | Collaboration | Heliang Zheng¹, Jianlong Fu², Zheng-Jun Zha¹, Jiebo Luo³ — ¹University of Science and Technology of China, Hefei, China; ²Microsoft Research, Beijing, China; ³University of Rochester, Rochester, NY. Emails: zhenghl@mail.ustc.edu.cn, jianf@microsoft.com, zhazj@ustc.edu.cn, jluo@cs.rochester.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "And we defer other implementation details to our code https://github.com/researchmm/DBTNet."
Open Datasets | Yes | "Datasets: We conducted experiments on three widely used fine-grained datasets (i.e., CUB-200-2011 [2] with 6k training images for 200 categories, Stanford-Car [3] with 8k training images for 196 categories and FGVC-Aircraft [33] with 6k training images for 100 categories) and a large-scale fine-grained dataset iNaturalist-2017 [34] with 600k training images for 5,089 categories."
Dataset Splits | No | The paper mentions using training images and fine-tuning, but does not provide specific dataset split information (e.g., percentages, sample counts for train/validation/test splits, or clear statements about a validation set used for hyperparameter tuning).
Hardware Specification | Yes | "We use MXNet [35] as our code-base, and all the models are trained on 8 Tesla V-100 GPUs."
Software Dependencies | No | The paper mentions "We use MXNet [35] as our code-base", but does not provide specific version numbers for MXNet or any other software dependencies.
Experiment Setup | Yes | "We follow the most common setting in fine-grained tasks to pre-train the models on ImageNet [36] with input size of 224×224, and fine-tune on fine-grained datasets with input size of 448×448 (unless specially stated). We adopt cosine learning rate schedule, SGD optimizer with the batch size to be 48 per GPU. The weight for semantic constraint in Equation 4 is set to 3e-4 in pre-training stage and 1e-5 in fine-tune stage."
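
The quoted experiment setup translates into a fairly standard MXNet/Gluon fine-tuning configuration. Below is a minimal sketch under stated assumptions: only the quoted hyper-parameters (448×448 fine-tune input, SGD with a cosine learning rate schedule, 48 images per GPU on 8 V-100 GPUs, and a 1e-5 weight on the Equation 4 semantic constraint during fine-tuning) come from the paper; the dataset path, the `build_dbtnet()` constructor, the base learning rate, the epoch count, the data augmentation, and the assumption that the network returns its constraint term alongside the logits are illustrative placeholders, not details confirmed by the authors.

```python
# Hedged sketch of the quoted fine-tuning configuration (MXNet / Gluon).
# Quoted facts: 448x448 input, SGD + cosine schedule, 48 images per GPU,
# 8 V-100 GPUs, semantic-constraint weight 1e-5 at fine-tune time.
# Everything else (paths, model constructor, base LR, epochs, augmentation)
# is an assumption for illustration only.
import mxnet as mx
from mxnet import gluon
from mxnet.gluon.data.vision import transforms

ctx = [mx.gpu(i) for i in range(8)]            # 8 Tesla V-100 GPUs (quoted)
per_gpu_batch = 48                             # quoted batch size per GPU
batch_size = per_gpu_batch * len(ctx)

# Standard fine-grained augmentation at the quoted 448x448 input size
# (ImageNet normalization stats are common practice, assumed here).
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(448),
    transforms.RandomFlipLeftRight(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
train_set = gluon.data.vision.ImageFolderDataset('data/cub200/train')  # hypothetical path
train_loader = gluon.data.DataLoader(
    train_set.transform_first(train_transform),
    batch_size=batch_size, shuffle=True, num_workers=16)

net = build_dbtnet()                           # hypothetical: ImageNet-pre-trained DBTNet
net.collect_params().reset_ctx(ctx)

epochs = 100                                   # assumed; not stated in the quoted text
lr_scheduler = mx.lr_scheduler.CosineScheduler(   # built-in cosine scheduler in recent MXNet
    max_update=epochs * len(train_loader), base_lr=0.01, final_lr=0.0)  # base_lr assumed

trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'lr_scheduler': lr_scheduler, 'momentum': 0.9, 'wd': 1e-4})
cls_loss = gluon.loss.SoftmaxCrossEntropyLoss()
sem_weight = 1e-5                              # quoted Eq. 4 weight for fine-tuning

for epoch in range(epochs):
    for data, label in train_loader:
        data_list = gluon.utils.split_and_load(data, ctx)
        label_list = gluon.utils.split_and_load(label, ctx)
        with mx.autograd.record():
            losses = []
            for x, y in zip(data_list, label_list):
                logits, sem_loss = net(x)      # hypothetical: net also returns the Eq. 4 term
                losses.append(cls_loss(logits, y) + sem_weight * sem_loss)
        for l in losses:
            l.backward()
        trainer.step(batch_size)
```

Per the quoted text, the ImageNet pre-training stage would differ mainly in the 224×224 input size and a semantic-constraint weight of 3e-4 instead of 1e-5.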