TRP: Trained Rank Pruning for Efficient Deep Neural Networks

Authors: Yuhui Xu, Yuxi Li, Shuai Zhang, Wei Wen, Botao Wang, Yingyong Qi, Yiran Chen, Weiyao Lin, Hongkai Xiong

IJCAI 2020

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate the performance of the TRP scheme on two common datasets, CIFAR-10 [Krizhevsky et al., 2009] and ImageNet [Deng et al., 2009]. The CIFAR-10 dataset consists of colored natural images at 32×32 resolution and contains 10 classes in total. The ImageNet dataset consists of 1000 classes of images for the recognition task. For both datasets, we adopt ResNet [He et al., 2016] as our baseline model since it is widely used in different vision tasks. We use ResNet-20 and ResNet-56 for CIFAR-10, and ResNet-18 and ResNet-50 for ImageNet. For the evaluation metric, we adopt top-1 accuracy on CIFAR-10 and top-1 and top-5 accuracy on ImageNet.
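The top-1/top-5 metrics quoted above can be made concrete with a minimal pure-Python sketch (the function name and data layout are illustrative assumptions, not from the paper: each row of `scores` holds per-class scores for one sample):

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring
    classes. k=1 corresponds to top-1 accuracy (CIFAR-10 in the paper);
    k=1 and k=5 correspond to the top-1/top-5 metrics used on ImageNet."""
    correct = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        correct += label in topk
    return correct / len(labels)

# Toy usage: 2 samples, 3 classes.
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
labels = [1, 2]
print(topk_accuracy(scores, labels, k=1))
```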
Researcher Affiliation Collaboration Yuhui Xu¹, Yuxi Li¹, Shuai Zhang², Wei Wen³, Botao Wang², Yingyong Qi², Yiran Chen³, Weiyao Lin¹ and Hongkai Xiong¹ (¹Shanghai Jiao Tong University, ²Qualcomm AI Research, ³Duke University)
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks. It presents mathematical equations and descriptions of the training process but not in a pseudocode format.
Open Source Code No The paper does not provide any statement about making its source code available or a link to a code repository.
Open Datasets Yes We evaluate the performance of the TRP scheme on two common datasets, CIFAR-10 [Krizhevsky et al., 2009] and ImageNet [Deng et al., 2009].
Dataset Splits Yes For training on CIFAR-10, we start with a base learning rate of 0.1, train for 164 epochs, and divide the rate by 10 at the 82nd and 122nd epochs. For ImageNet, we directly finetune the model with the TRP scheme from the pretrained baseline with a learning rate of 0.0001 for 10 epochs.
Hardware Specification Yes We implement our TRP scheme with NVIDIA 1080 Ti GPUs. Our experiment is conducted on a platform with one NVIDIA 1080 Ti GPU and a Xeon E5-2630 CPU.
Software Dependencies No The paper mentions 'cuDNN' but does not specify its version number. It also mentions 'SGD solver' which is a generic term without version details.
Experiment Setup Yes For training on CIFAR-10, we start with a base learning rate of 0.1, train for 164 epochs, and divide the rate by 10 at the 82nd and 122nd epochs. For ImageNet, we directly finetune the model with the TRP scheme from the pretrained baseline with a learning rate of 0.0001 for 10 epochs. We adopt the SGD solver to update weights, setting the weight decay to 10^-4 and momentum to 0.9. The TSVD energy threshold in TRP and TRP+Nu is 0.02 and the nuclear norm weight λ is set as 0.0003. The TSVD energy threshold e is set as 0.005. λ of nuclear norm regularization is 0.0003 for both ResNet-18 and ResNet-50.
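The CIFAR-10 training hyperparameters quoted above (SGD with momentum 0.9, weight decay 10^-4, base learning rate 0.1 divided by 10 at epochs 82 and 122) can be sketched in plain Python. This is only an illustration of the stated schedule and update rule; the paper releases no code, and the function names and scalar-weight setup here are hypothetical:

```python
def lr_schedule(epoch, base_lr=0.1, milestones=(82, 122), gamma=0.1):
    """Piecewise-constant schedule from the paper: over 164 epochs,
    divide the base learning rate of 0.1 by 10 at epochs 82 and 122."""
    return base_lr * gamma ** sum(epoch >= m for m in milestones)

def sgd_step(w, grad, velocity, lr, momentum=0.9, weight_decay=1e-4):
    """One SGD-with-momentum update on a scalar weight, with the paper's
    momentum (0.9) and weight decay (1e-4); returns (new_w, new_velocity)."""
    g = grad + weight_decay * w          # L2 weight decay folded into the gradient
    velocity = momentum * velocity + g   # momentum accumulation
    return w - lr * velocity, velocity

# Toy usage: the learning rate seen at a few points in the schedule.
print(lr_schedule(0), lr_schedule(100), lr_schedule(150))
```

In a real run, `sgd_step` would be applied per parameter tensor each iteration, with `lr_schedule(epoch)` supplying the current learning rate; the TRP low-rank projection (TSVD) would be interleaved with these updates.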