Pay Attention to Features, Transfer Learn Faster CNNs
Authors: Kafeng Wang, Xitong Gao, Yiren Zhao, Xingjian Li, Dejing Dou, Cheng-Zhong Xu
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we provide an extensive empirical study of the joint methods of transfer learning and channel pruning. We evaluate the methods with 6 different benchmark datasets: Caltech-256 (Griffin et al., 2007) of 256 general object categories; Stanford Dogs 120 (Khosla et al., 2011), which specializes in images of dogs; MIT Indoors 67 (Quattoni & Torralba, 2009) for indoor scene classification; Caltech-UCSD Birds-200-2011 (CUB-200-2011) (Wah et al., 2011) for classifying birds; and Food-101 (Bossard et al., 2014) for food categories. |
| Researcher Affiliation | Collaboration | Kafeng Wang (1), Xitong Gao (2), Yiren Zhao (3), Xingjian Li (4), Dejing Dou (5), Cheng-Zhong Xu (6). (1, 2) Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; (1) University of Chinese Academy of Sciences; (3) University of Cambridge; (4, 5) Big Data Lab, Baidu Research; (6) University of Macau. Emails: (1) kf.wang@siat.ac.cn, (2) xt.gao@siat.ac.cn. |
| Pseudocode | Yes | The detailed algorithm can be found in Appendix A. In Algorithm 1 we illustrate the complete training procedure described above. (A hedged sketch of the distillation step in this procedure appears below the table.) |
| Open Source Code | No | The paper mentions using ResNet-101 from torchvision, but does not provide a link or explicit statement about the availability of their own source code for the proposed AFDS method. |
| Open Datasets | Yes | We evaluate the methods with 6 different benchmark datasets: Caltech-256 (Griffin et al., 2007)... Stanford Dogs 120 (Khosla et al., 2011)... MIT Indoors 67 (Quattoni & Torralba, 2009)... Caltech-UCSD Birds-200-2011 (CUB-200-2011) (Wah et al., 2011)... Food-101 (Bossard et al., 2014)... ImageNet (Deng et al., 2009). |
| Dataset Splits | No | The paper mentions using a 'target training dataset D' and refers to 'validation accuracies' in the conclusion, but does not state how the validation data was split off (e.g., percentages or sample counts), which a reproduction would need. For example: 'The pre-trained model is then fine-tuned on the target training dataset D with the AFD regularization proposed in Section 3.3.' and 'Under a wide range of datasets, we demonstrated the smallest drop in validation accuracies under the same speed-up constraints...' (A hypothetical split is sketched below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'torchvision' and 'PyTorch' (from the footnote), but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For each benchmark dataset, the final FC layer of the network is replaced with a new FC layer, randomly initialized with He et al. (2015)'s method to match the number of output categories. We then perform transfer learning with 4 different methods... fine-tune the resulting model on ImageNet for 90 epochs with a learning rate of 0.01 decaying by a factor of 10 every 30 epochs... At each step, we fine-tune each model using 4500 steps of SGD with a batch size of 48, at a learning rate of 0.01, before fine-tuning for a further 4500 steps at a learning rate of 0.001. (These hyperparameters are wired into a sketch below the table.) |
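
The Pseudocode row above references Algorithm 1, the AFD-regularized training procedure. As a minimal illustration of the attentive feature distillation idea, the PyTorch sketch below weights a channel-wise L2 distillation loss by attention scores predicted from the teacher's feature statistics. The attention head (the `hidden` width, the softmax normalization) is an assumption for illustration, not the paper's exact design from Appendix A.

```python
import torch
import torch.nn as nn


class AttentiveFeatureDistillation(nn.Module):
    """Minimal sketch of an AFD-style regularizer: a small attention
    module predicts per-channel importance from the teacher's feature
    map, and those weights scale a channel-wise L2 distillation loss.
    The attention architecture here is illustrative only."""

    def __init__(self, num_channels: int, hidden: int = 64):
        super().__init__()
        # Hypothetical attention head: channel statistics -> per-channel weight.
        self.attention = nn.Sequential(
            nn.Linear(num_channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, num_channels),
            nn.Softmax(dim=-1),
        )

    def forward(self, student_fm: torch.Tensor, teacher_fm: torch.Tensor) -> torch.Tensor:
        # Global-average-pool the teacher feature map to channel statistics.
        stats = teacher_fm.mean(dim=(2, 3))                              # (N, C)
        weights = self.attention(stats)                                  # (N, C)
        # Per-channel squared distance between student and teacher features.
        per_channel = (student_fm - teacher_fm).pow(2).mean(dim=(2, 3))  # (N, C)
        return (weights * per_channel).sum(dim=1).mean()
```

In training, this term would be added to the task loss at each distilled layer, e.g. `loss = task_loss + lambda_afd * afd(student_feat, teacher_feat)`, with `lambda_afd` a regularization strength the reproducer must choose.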
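The Dataset Splits row notes that the paper does not report how validation data was obtained, so a reproducer has to pick a split themselves. The sketch below carves a validation set out of Caltech-256 with torchvision; the 90/10 ratio and the seed are illustrative assumptions only.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Lambda(lambda img: img.convert("RGB")),  # Caltech-256 contains some grayscale images
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
full_train = datasets.Caltech256(root="data", download=True, transform=transform)

# Hypothetical 10% validation split; the paper does not specify a ratio.
val_size = int(0.1 * len(full_train))
train_set, val_set = random_split(
    full_train,
    [len(full_train) - val_size, val_size],
    generator=torch.Generator().manual_seed(42),  # fixed seed so the split is reproducible
)
```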
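The Experiment Setup row quotes the fine-tuning recipe directly. The sketch below wires those quoted hyperparameters (He-initialized FC replacement, SGD with batch size 48, 4500 steps at 0.01 then 4500 at 0.001) into torchvision's ResNet-101; the momentum value and the cross-entropy loss are assumptions the quote does not state.

```python
import torch
import torch.nn as nn
from torchvision import models

# Replace the final FC layer and He-initialize it, as quoted above.
num_classes = 120  # e.g. Stanford Dogs 120
model = models.resnet101(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, num_classes)
nn.init.kaiming_normal_(model.fc.weight)  # He et al. (2015) initialization
nn.init.zeros_(model.fc.bias)

# Momentum is an assumption; the paper quotes only batch size and learning rates.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()


def fine_tune(model, loader, steps, lr):
    """Run `steps` SGD updates at learning rate `lr`, cycling through the loader."""
    for group in optimizer.param_groups:
        group["lr"] = lr
    it = iter(loader)
    for _ in range(steps):
        try:
            images, labels = next(it)
        except StopIteration:
            it = iter(loader)
            images, labels = next(it)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()


# Quoted schedule: 4500 steps at lr 0.01, then 4500 more at lr 0.001,
# with a DataLoader of batch size 48 supplying `train_loader`.
# fine_tune(model, train_loader, steps=4500, lr=0.01)
# fine_tune(model, train_loader, steps=4500, lr=0.001)
```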