Learning Deep Bilinear Transformation for Fine-grained Image Representation
Authors: Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, Jiebo Luo
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We found that the proposed network achieves new state-of-the-art in several fine-grained image recognition benchmarks, including CUB-Bird, Stanford-Car, and FGVC-Aircraft. We conduct extensive experiments to demonstrate the effectiveness of DBTNet, which can achieve new state-of-the-arts on three challenging fine-grained datasets, i.e., CUB-Bird, Stanford-Car, and FGVC-Aircraft. |
| Researcher Affiliation | Collaboration | Heliang Zheng1, Jianlong Fu2, Zheng-Jun Zha1, Jiebo Luo3; 1University of Science and Technology of China, Hefei, China; 2Microsoft Research, Beijing, China; 3University of Rochester, Rochester, NY; 1zhenghl@mail.ustc.edu.cn, 2jianf@microsoft.com, 1zhazj@ustc.edu.cn, 3jluo@cs.rochester.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | And we defer other implementation details to our code https://github.com/researchmm/DBTNet. |
| Open Datasets | Yes | Datasets: We conducted experiments on three widely used fine-grained datasets (i.e., CUB-200-2011 [2] with 6k training images for 200 categories, Stanford-Car [3] with 8k training images for 196 categories, and FGVC-Aircraft [33] with 6k training images for 100 categories) and a large-scale fine-grained dataset, iNaturalist-2017 [34], with 600k training images for 5,089 categories. |
| Dataset Splits | No | The paper mentions using training images and fine-tuning, but does not provide specific dataset split information (e.g., percentages, sample counts for train/validation/test splits, or clear statements about a validation set used for hyperparameter tuning). |
| Hardware Specification | Yes | We use MXNet [35] as our code-base, and all the models are trained on 8 Tesla V-100 GPUs. |
| Software Dependencies | No | The paper mentions 'We use MXNet [35] as our code-base', but does not provide specific version numbers for MXNet or any other software dependencies. |
| Experiment Setup | Yes | We follow the most common setting in fine-grained tasks to pre-train the models on ImageNet [36] with an input size of 224×224, and fine-tune on fine-grained datasets with an input size of 448×448 (unless otherwise stated). We adopt a cosine learning rate schedule and an SGD optimizer with a batch size of 48 per GPU. The weight for the semantic constraint in Equation 4 is set to 3e-4 in the pre-training stage and 1e-5 in the fine-tuning stage. |
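
The Experiment Setup row fixes only a handful of hyperparameters: a cosine learning-rate schedule, SGD, 48 images per GPU on 8 Tesla V-100s, 224×224 inputs for ImageNet pre-training, 448×448 inputs for fine-tuning, and a semantic-constraint weight of 3e-4 (pre-training) or 1e-5 (fine-tuning). The sketch below is a hypothetical reconstruction of that recipe for the fine-tuning stage. The authors' released code uses MXNet (https://github.com/researchmm/DBTNet); PyTorch is used here only for illustration, and the ResNet-50 stand-in backbone, learning rate, momentum, weight decay, epoch count, and data loader are assumptions rather than details reported in the paper.

```python
# Hypothetical reconstruction of the reported fine-tuning recipe:
# SGD + cosine LR schedule, 48 images per GPU, 448x448 inputs, and a
# semantic-constraint term weighted by 1e-5 (3e-4 during pre-training).
# The real model is DBTNet (released in MXNet); ResNet-50 is a stand-in.
import torch
import torch.nn as nn
from torchvision import models

BATCH_PER_GPU = 48        # reported per-GPU batch size (8 Tesla V-100 GPUs)
INPUT_SIZE = 448          # fine-tuning resolution; 224 for ImageNet pre-training
SEMANTIC_WEIGHT = 1e-5    # Equation 4 weight: 3e-4 pre-training, 1e-5 fine-tuning
EPOCHS = 100              # assumption; the section does not state the epoch count

model = models.resnet50(num_classes=200)      # stand-in for DBTNet (200 CUB classes)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)  # lr/momentum/wd assumed
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)

train_loader = []  # placeholder; a DataLoader of 448x448 crops in a real run

for epoch in range(EPOCHS):
    for images, labels in train_loader:
        logits = model(images)
        sem_loss = torch.tensor(0.0)          # placeholder for the Eq. 4 constraint
        loss = criterion(logits, labels) + SEMANTIC_WEIGHT * sem_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()                          # cosine decay, stepped once per epoch
```

For the exact DBTNet blocks and the semantic-constraint computation, the released MXNet code remains the authoritative reference.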