Deep Learning for Fixed Model Reuse

Authors: Yang Yang, De-Chuan Zhan, Ying Fan, Yuan Jiang, Zhi-Hua Zhou

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on five real-world datasets validate the effectiveness of FMR compared with state-of-the-art deep methods. In this section, we demonstrate that FMR can replace the fixed models based on features z with a deep structure given raw features as inputs, while providing highly competitive results. In particular, we demonstrate this on two real tasks, i.e., image classification and action recognition. In the image classification task, two real-world datasets are tested, and fixed models built on the surrounding texts of images act as the sophisticated models. The action recognition task is to recognize human actions in short video clips; three real-world datasets are used in this task.
Researcher Affiliation | Academia | Yang Yang, De-Chuan Zhan, Ying Fan, Yuan Jiang and Zhi-Hua Zhou; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing, 210023, China; {yangy, zhandc, fany, jiangy, zhouzh}@lamda.nju.edu.cn
Pseudocode | Yes | Algorithm 1: Training Algorithm for FMR
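The paper presents Algorithm 1 only as pseudocode. A minimal, heavily simplified sketch of the training idea summarized above (a deep network on raw inputs that is guided by the fixed model's features z during training, so the fixed model can be dropped afterwards) might look like the following. The network shapes, the auxiliary MSE loss, and the weight `alpha` are illustrative assumptions, not the authors' Algorithm 1.

```python
# Hypothetical sketch of an FMR-style training step (not the authors' code).
# Assumption: the fixed model supplies features z for each training example,
# and a deep network on raw inputs x learns to (a) solve the task and
# (b) mimic z via an auxiliary branch, so the fixed model is not needed at test time.
import torch
import torch.nn as nn

class RawFeatureNet(nn.Module):
    def __init__(self, in_dim=4096, z_dim=128, n_classes=51):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU())
        self.z_head = nn.Linear(512, z_dim)        # mimics the fixed model's features z
        self.cls_head = nn.Linear(512, n_classes)  # task prediction

    def forward(self, x):
        h = self.backbone(x)
        return self.cls_head(h), self.z_head(h)

def train_step(model, optimizer, x, y, z_fixed, alpha=0.5):
    """One SGD step: task loss plus an auxiliary loss toward the fixed model's features."""
    optimizer.zero_grad()
    logits, z_pred = model(x)
    loss = nn.functional.cross_entropy(logits, y) \
         + alpha * nn.functional.mse_loss(z_pred, z_fixed)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random stand-in data.
model = RawFeatureNet()
opt = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)
x = torch.randn(8, 4096)
y = torch.randint(0, 51, (8,))
z = torch.randn(8, 128)
train_step(model, opt, x, y, z)
```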
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the FMR methodology is open-source or publicly available.
Open Datasets | Yes | WIKI (Rasiwasia et al. 2010) is a rich-text web document dataset with images, which has 2,866 documents extracted from Wikipedia; Flickr8K (Hodosh, Young, and Hockenmaier 2013) consists of 8,000 images that are each paired with captions; the HMDB51 dataset (Kuehne et al. 2011) has 51 action classes and 6,766 video clips; the UCF101 dataset (Soomro, Zamir, and Shah 2012) has 101 action classes spanning 13,320 YouTube video clips; the UCF50 dataset (Reddy and Shah 2013) has 50 action classes spanning 6,618 YouTube video clips.
Dataset Splits | No | In the classification tasks, 66% of the instances are chosen as the training set and the remainder as the test set, while in action recognition the training and test splits are provided by (Lan et al. 2015). The paper refers to standard splits for specific datasets but does not explicitly mention a 'validation' split with specific percentages or counts.
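The 66%/34% classification split is described only in prose. A minimal sketch of how such a split could be reproduced is shown below, using placeholder features and labels; the stratification and random seed are assumptions, not stated in the paper.

```python
# Hypothetical 66%/34% train/test split; seed and stratification are assumptions.
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for image features and class labels.
X = np.random.randn(1000, 128)
y = np.random.randint(0, 10, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    train_size=0.66,   # 66% of instances for training, the remainder for testing
    stratify=y,        # assumption: stratified split; the paper does not say
    random_state=0,    # assumption: fixed seed for repeatability
)
```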
Hardware Specification | Yes | We run the following experiments on an NVIDIA K80 GPU server, and our model can be trained at about 290 images per second with a single K80 GPGPU.
Software Dependencies | No | The paper mentions software such as MatConvNet (Vedaldi and Lenc 2015) and refers to hyperparameters from Krizhevsky et al. (2012) and normalization from Ioffe and Szegedy (2015), but it does not provide specific version numbers for any software or libraries used.
Experiment Setup | Yes | The hyperparameters are the same as used by (Krizhevsky, Sutskever, and Hinton 2012): momentum 0.9; weight decay 5×10^-4; initial learning rate 10^-2, which is decreased by a factor of 10. To prevent the internal covariate shift phenomenon, we normalize each channel of the feature map by averaging over spatial locations and batch instances as in (Ioffe and Szegedy 2015); in addition, we use a weight sharing technique in our network...
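The quoted settings map directly onto a standard SGD configuration. A minimal PyTorch sketch is given below; the original experiments used MatConvNet, and the toy architecture, the number of output classes, and the learning-rate step size (30 epochs) are assumptions rather than details from the paper.

```python
# Hypothetical PyTorch equivalent of the reported optimizer settings
# (the original work used MatConvNet; architecture and LR step size are assumptions).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.BatchNorm2d(64),   # normalizes each channel over spatial locations and batch instances
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 101),   # assumption: e.g. 101 classes as in UCF101
)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=1e-2,             # initial learning rate 10^-2
    momentum=0.9,        # momentum 0.9
    weight_decay=5e-4,   # weight decay 5×10^-4
)
# Learning rate decreased by a factor of 10; the 30-epoch step size is an assumption.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
```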