Web-Supervised Network with Softly Update-Drop Training for Fine-Grained Visual Classification

Authors: Chuanyi Zhang, Yazhou Yao, Huafeng Liu, Guo-Sen Xie, Xiangbo Shu, Tianfei Zhou, Zheng Zhang, Fumin Shen, Zhenmin Tang

AAAI 2020, pp. 12781-12788

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on three commonly used fine-grained datasets demonstrate that our approach is much superior to state-of-the-art webly supervised methods.
Researcher Affiliation | Academia | Nanjing University of Science and Technology, China; Inception Institute of Artificial Intelligence, UAE; Harbin Institute of Technology, Shenzhen, China; University of Electronic Science and Technology of China
Pseudocode | Yes | Algorithm 1: Softly Update-Drop Training (illustrative sketches of the drop-rate schedule and the two-step training setup follow this table)
Open Source Code | Yes | The data and source code of this work have been made anonymously available at: https://github.com/z337-408/WSNFGVC.
Open Datasets | Yes | We evaluate our approach on three benchmark fine-grained datasets, CUB200-2011 (Wah et al. 2011), FGVC-aircraft (Maji et al. 2013), and Cars-196 (Krause et al. 2013).
Dataset Splits | No | The paper states, "We treat the retrieved web images as the training set and directly adopt the testing data from CUB200-2011, FGVC-aircraft, and Cars-196 as the test set." While it defines training and test sets, it does not explicitly mention a separate validation set or describe how data was split for validation (e.g., percentages or counts).
Hardware Specification | Yes | Experiments are conducted on two NVIDIA V100 GPU cards.
Software Dependencies | No | The paper mentions using the "Adam optimizer" and "VGG-16" for initialization, but it does not specify version numbers for any software dependencies, such as programming languages, deep learning frameworks (e.g., PyTorch, TensorFlow), or other libraries.
Experiment Setup | Yes | We select the maximum drop rate τ from the values of {0.15, 0.2, 0.25, 0.3} and the epoch Tk from the values of {5, 10, 15, 20}. Through experiments, we ultimately set τ = 0.25 and Tk = 10 as the default values on the CUB200 and FGVC-aircraft datasets, and set τ = 0.20 and Tk = 10 on the Cars-196 dataset. For model training, we follow (Lin, Roy Chowdhury, and Maji 2015) and adopt a two-step training strategy. Specifically, we first freeze the convolutional layer parameters and optimize only the last fully connected layer. We then optimize all layers of the previously learned model. In the experiments, we use the Adam optimizer with momentum = 0.9. The learning rate, batch size, and number of epochs in the first step are set to 0.001, 128, and 200, respectively, while in the second step they are set to 0.0001, 64, and 100.
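The paper's Algorithm 1 is not reproduced on this page, so the following is only a minimal sketch of how the reported hyperparameters τ (maximum drop rate) and Tk could drive a drop-based training step. It assumes a linear ramp of the drop rate from 0 to τ over the first Tk epochs and that the highest-loss samples in each mini-batch are discarded as presumed web-label noise; the names drop_rate and update_drop_step are illustrative rather than taken from the released code, and any "soft update" behavior beyond the gradual ramp is not modeled here.

```python
import torch
import torch.nn.functional as F

def drop_rate(epoch, tau=0.25, t_k=10):
    # Assumed schedule: ramp linearly from 0 to the maximum drop rate tau
    # over the first t_k epochs, then hold it at tau.
    return tau * min(epoch / t_k, 1.0)

def update_drop_step(model, optimizer, images, labels, epoch, tau=0.25, t_k=10):
    # One training step that discards the highest-loss fraction of the
    # mini-batch (presumed noisy web labels) before updating the model.
    logits = model(images)
    per_sample_loss = F.cross_entropy(logits, labels, reduction="none")

    # Keep the (1 - drop_rate) fraction of samples with the smallest losses.
    num_keep = int((1.0 - drop_rate(epoch, tau, t_k)) * labels.size(0))
    kept = torch.argsort(per_sample_loss)[:num_keep]

    loss = per_sample_loss[kept].mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```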
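The Experiment Setup row also quotes a two-step schedule: first train only the last fully connected layer, then fine-tune all layers with the Adam optimizer. The sketch below wires the reported learning rates into PyTorch; the use of torchvision's VGG-16, the 200-class output head, and the variable names are assumptions, since the paper does not name a framework, and Adam's default beta1 = 0.9 is taken to correspond to the quoted "momentum = 0.9".

```python
import torch
from torchvision import models

NUM_CLASSES = 200  # CUB200-2011; use 100 for FGVC-aircraft, 196 for Cars-196

# VGG-16 backbone (the paper initializes from VGG-16).
model = models.vgg16(pretrained=True)
model.classifier[6] = torch.nn.Linear(4096, NUM_CLASSES)

# Step 1: freeze all pretrained parameters and optimize only the new
# last fully connected layer (reported: lr 0.001, batch size 128, 200 epochs).
for p in model.parameters():
    p.requires_grad = False
for p in model.classifier[6].parameters():
    p.requires_grad = True
step1_optimizer = torch.optim.Adam(model.classifier[6].parameters(), lr=1e-3)

# Step 2: unfreeze everything and fine-tune the whole network
# (reported: lr 0.0001, batch size 64, 100 epochs).
for p in model.parameters():
    p.requires_grad = True
step2_optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```

A standard training loop built around the update_drop_step sketch above would then run 200 epochs with step1_optimizer followed by 100 epochs with step2_optimizer, using the reported batch sizes.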